1
Di Matteo A, Belloni E, Pradella D, Cappelletto A, Volf N, Zacchigna S, Ghigna C. Alternative splicing in endothelial cells: novel therapeutic opportunities in cancer angiogenesis. Journal of Experimental & Clinical Cancer Research: CR 2020; 39:275. [PMID: 33287867 PMCID: PMC7720527 DOI: 10.1186/s13046-020-01753-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 08/18/2020] [Accepted: 10/26/2020] [Indexed: 02/07/2023]
Abstract
Alternative splicing (AS) is a pervasive molecular process generating multiple protein isoforms from a single gene. It plays fundamental roles during development, differentiation and maintenance of tissue homeostasis, while aberrant AS is considered a hallmark of multiple diseases, including cancer. Cancer-restricted AS isoforms represent either predictive biomarkers for diagnosis/prognosis or targets for anti-cancer therapies. Here, we discuss the contribution of AS regulation in cancer angiogenesis, a complex process supporting disease development and progression. We consider AS programs acting in a specific and non-redundant manner to influence morphological and functional changes involved in cancer angiogenesis. In particular, we describe relevant AS variants or splicing regulators controlling either secreted or membrane-bound angiogenic factors, which may represent attractive targets for therapeutic interventions in human cancer.
Affiliation(s)
- Anna Di Matteo
- Istituto di Genetica Molecolare, "Luigi Luca Cavalli-Sforza", Consiglio Nazionale delle Ricerche, via Abbiategrasso 207, 27100, Pavia, Italy
- Elisa Belloni
- Istituto di Genetica Molecolare, "Luigi Luca Cavalli-Sforza", Consiglio Nazionale delle Ricerche, via Abbiategrasso 207, 27100, Pavia, Italy
- Davide Pradella
- Istituto di Genetica Molecolare, "Luigi Luca Cavalli-Sforza", Consiglio Nazionale delle Ricerche, via Abbiategrasso 207, 27100, Pavia, Italy
- Ambra Cappelletto
- Cardiovascular Biology Laboratory, International Centre for Genetic Engineering and Biotechnology (ICGEB), 34149, Trieste, Italy
- Nina Volf
- Cardiovascular Biology Laboratory, International Centre for Genetic Engineering and Biotechnology (ICGEB), 34149, Trieste, Italy
- Serena Zacchigna
- Cardiovascular Biology Laboratory, International Centre for Genetic Engineering and Biotechnology (ICGEB), 34149, Trieste, Italy; Department of Medical, Surgical and Health Sciences, University of Trieste, 34149, Trieste, Italy
- Claudia Ghigna
- Istituto di Genetica Molecolare, "Luigi Luca Cavalli-Sforza", Consiglio Nazionale delle Ricerche, via Abbiategrasso 207, 27100, Pavia, Italy
2
Vawter MP, Philibert R, Rollins B, Ruppel PL, Osborn TW. Exon Array Biomarkers for the Differential Diagnosis of Schizophrenia and Bipolar Disorder. Molecular Neuropsychiatry 2018; 3:197-213. [PMID: 29888231 DOI: 10.1159/000485800] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/16/2017] [Accepted: 11/16/2017] [Indexed: 12/26/2022]
Abstract
This study developed potential blood-based biomarker tests for diagnosing and differentiating schizophrenia (SZ), bipolar disorder type I (BD), and normal control (NC) subjects using mRNA gene expression signatures. A total of 90 subjects (n = 30 each for the three groups of subjects) provided blood samples at two visits. The Affymetrix exon microarray was used to profile the expression of over 1.4 million probesets. We selected potential biomarker panels using the temporal stability of the probesets and also back-tested them at two different visits for each subject. The 18-gene biomarker panels, using logistic regression modeling, correctly differentiated the three groups of subjects with high accuracy across the two different clinical visits (83-88% accuracy). The results are also consistent with the actual data and the "leave-one-out" analyses, indicating that the models should be predictive when applied to independent data cohorts. Many of the SZ and BD subjects were taking antipsychotic and mood stabilizer medications at the time of blood draw, raising the possibility that these drugs could have affected some of the differential transcription signatures. Using an independent Illumina data set of gene expression data from antipsychotic medication-free SZ subjects, the 18-gene biomarker panels produced a receiver operating characteristic curve accuracy greater than 0.866 in patients that were less than 30 years of age and medication free. We confirmed select transcripts by quantitative PCR and the nCounter® System. The episodic nature of psychiatric disorders might lead to highly variable results depending on when blood is collected in relation to the severity of the disease/symptoms. We have found stable trait gene panel markers for lifelong psychiatric disorders that may have diagnostic utility in younger undiagnosed subjects where there is a critical unmet need. 
The study requires replication in subjects for ultimate proof of the utility of the differential diagnosis.
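The "leave-one-out" check mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the study reports logistic regression on 18-gene panels, whereas this sketch swaps in a nearest-centroid classifier to stay dependency-free. The toy expression values and all function names are hypothetical.

```python
from math import dist


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def predict(train, sample):
    """Nearest-centroid label; `train` is a list of (features, label) pairs."""
    by_label = {}
    for features, label in train:
        by_label.setdefault(label, []).append(features)
    centroids = {label: centroid(rows) for label, rows in by_label.items()}
    return min(centroids, key=lambda label: dist(centroids[label], sample))


def leave_one_out_accuracy(data):
    """Hold out each subject once, train on the rest, score the held-out one."""
    hits = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(train, features) == label
    return hits / len(data)


# Hypothetical two-gene toy data for the three groups (SZ, BD, NC).
toy = [
    ([1.0, 0.2], "SZ"), ([1.1, 0.1], "SZ"), ([0.9, 0.3], "SZ"),
    ([0.1, 1.0], "BD"), ([0.2, 1.1], "BD"), ([0.0, 0.9], "BD"),
    ([2.0, 2.0], "NC"), ([2.1, 1.9], "NC"), ([1.9, 2.1], "NC"),
]
loo_accuracy = leave_one_out_accuracy(toy)
```

Because each subject is scored by a model that never saw it, the resulting accuracy is a less optimistic estimate than accuracy measured on the training data itself.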
Affiliation(s)
- Marquis Philip Vawter
- Functional Genomics Laboratory, Department of Psychiatry, University of California, Irvine, California, USA
- Robert Philibert
- Department of Psychiatry, University of Iowa, Iowa City, Iowa, USA
- Brandi Rollins
- Functional Genomics Laboratory, Department of Psychiatry, University of California, Irvine, California, USA
3
Aguilar-Pontes MV, de Vries RP, Zhou M. (Post-)genomics approaches in fungal research. Brief Funct Genomics 2014; 13:424-39. [PMID: 25037051 DOI: 10.1093/bfgp/elu028] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023]
Abstract
To date, hundreds of fungal genomes have been sequenced and many more are in progress. This wealth of genomic information has provided new directions to study fungal biodiversity. However, to further dissect and understand the complicated biological mechanisms involved in fungal life styles, functional studies beyond genomes are required. Thanks to the developments of current -omics techniques, it is possible to produce large amounts of fungal functional data in a high-throughput fashion (e.g. transcriptome, proteome, etc.). The increasing ease of creating -omics data has also created a major challenge for downstream data handling and analysis. Numerous databases, tools and software have been created to meet this challenge. Facing such a richness of techniques and information, hereby we provide a brief roadmap on current wet-lab and bioinformatics approaches to study functional genomics in fungi.
4
Antisense transcription at the TRPM2 locus as a novel prognostic marker and therapeutic target in prostate cancer. Oncogene 2014; 34:2094-102. [PMID: 24931166 DOI: 10.1038/onc.2014.144] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 10/10/2013] [Revised: 03/16/2014] [Accepted: 04/19/2014] [Indexed: 01/19/2023]
Abstract
Overwhelming evidence indicates that cancer is a genetic disease caused by the accumulation of mutations in oncogenes and tumor suppressor genes. It is also increasingly apparent, however, that cancer depends not only on mutations in these coding genes but also on alterations in the large class of non-coding RNAs. Here, we report that one such long non-coding RNA, TRPM2-AS, an antisense transcript of TRPM2, which encodes an oxidative stress-activated ion channel, is overexpressed in prostate cancer (PCa). The high expression of TRPM2-AS and its related gene signature were found to be linked to poor clinical outcome, with the related gene signature working also independently of the patient's Gleason score. Mechanistically, TRPM2-AS knockdown led to PCa cell apoptosis, with a transcriptional profile that indicated an unbearable increase in cellular stress in the dying cells, which was coupled to cell cycle arrest, an increase in intracellular hydrogen peroxide and activation of the sense TRPM2 gene. Moreover, targets of existing drugs and treatments were found to be consistently associated with high TRPM2-AS levels in both targeted cells and patients, ultimately suggesting that the measurement of the expression levels of TRPM2-AS allows not only for the early identification of aggressive PCa tumors, but also identifies a subset of at-risk patients who would benefit from currently available, but mostly differently purposed, therapeutic agents.
5
Gao X, Sinha S, Belcastro M, Woodard C, Ramamurthy V, Stoilov P, Sokolov M. Splice isoforms of phosducin-like protein control the expression of heterotrimeric G proteins. J Biol Chem 2013; 288:25760-25768. [PMID: 23888055 DOI: 10.1074/jbc.m113.486258] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Abstract
Heterotrimeric G proteins play an essential role in cellular signaling; however, the mechanism regulating their synthesis and assembly remains poorly understood. A line of evidence indicates that the posttranslational processing of G protein β subunits begins inside the protein-folding chamber of the chaperonin containing t-complex protein 1. This process is facilitated by the ubiquitously expressed phosducin-like protein (PhLP), which is thought to act as a CCT co-factor. Here we demonstrate that alternative splicing of the PhLP gene gives rise to a transcript encoding a truncated, short protein (PhLPs) that is broadly expressed in human tissues but absent in mice. Seeking to elucidate the function of PhLPs, we expressed this protein in the rod photoreceptors of mice and found that this manipulation caused a dramatic translational and posttranslational suppression of rod heterotrimeric G proteins. The investigation of the underlying mechanism revealed that PhLPs disrupts the folding of Gβ and the assembly of Gβ and Gγ subunits, events normally assisted by PhLP, by forming a stable and apparently inactive tertiary complex with CCT preloaded with nascent Gβ. As a result, the cellular levels of Gβ and Gγ, which depends on Gβ for stability, decline. In addition, PhLPs evokes a profound and rather specific down-regulation of the Gα transcript, leading to a complete disappearance of the protein. This study provides the first evidence of a generic mechanism, whereby the splicing of the PhLP gene could potentially and efficiently regulate the cellular levels of heterotrimeric G proteins.
Affiliation(s)
- Xueli Gao
- Department of Ophthalmology, West Virginia University, Morgantown, West Virginia 26506
- Catherine Woodard
- Department of Biochemistry, West Virginia University, Morgantown, West Virginia 26506
- Visvanathan Ramamurthy
- Departments of Ophthalmology and Biochemistry, West Virginia University, Morgantown, West Virginia 26506
- Peter Stoilov
- Department of Biochemistry, West Virginia University, Morgantown, West Virginia 26506
- Maxim Sokolov
- Departments of Ophthalmology and Biochemistry, West Virginia University, Morgantown, West Virginia 26506
6
Lahti L, Torrente A, Elo LL, Brazma A, Rung J. A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases. Nucleic Acids Res 2013; 41:e110. [PMID: 23563154 PMCID: PMC3664815 DOI: 10.1093/nar/gkt229] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 01/26/2023]
Abstract
Rapid accumulation of large and standardized microarray data collections is opening up novel opportunities for holistic characterization of genome function. The limited scalability of current preprocessing techniques has, however, formed a bottleneck for full utilization of these data resources. Although short oligonucleotide arrays constitute a major source of genome-wide profiling data, scalable probe-level techniques have been available only for few platforms based on pre-calculated probe effects from restricted reference training sets. To overcome these key limitations, we introduce a fully scalable online-learning algorithm for probe-level analysis and pre-processing of large microarray atlases involving tens of thousands of arrays. In contrast to the alternatives, our algorithm scales up linearly with respect to sample size and is applicable to all short oligonucleotide platforms. The model can use the most comprehensive data collections available to date to pinpoint individual probes affected by noise and biases, providing tools to guide array design and quality control. This is the only available algorithm that can learn probe-level parameters based on sequential hyperparameter updates at small consecutive batches of data, thus circumventing the extensive memory requirements of the standard approaches and opening up novel opportunities to take full advantage of contemporary microarray collections.
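The idea of sequential hyperparameter updates over small consecutive batches can be sketched with a generic conjugate update. This is an illustrative assumption, not the paper's model: the abstract does not give the parameterization, so the sketch assumes zero-mean Gaussian probe residuals with an inverse-gamma prior IG(alpha, beta) on one probe's noise variance. The function names and numbers are hypothetical.

```python
def update(alpha, beta, residuals):
    """Conjugate inverse-gamma update for the variance of zero-mean Gaussian noise.

    Only the current batch is needed, so memory use depends on the batch size,
    not on the total number of arrays processed so far.
    """
    n = len(residuals)
    return alpha + n / 2.0, beta + sum(r * r for r in residuals) / 2.0


def posterior_mean_variance(alpha, beta):
    """Mean of IG(alpha, beta); defined only for alpha > 1."""
    return beta / (alpha - 1.0)


# Hypothetical residuals of one probe across six arrays.
residuals = [0.5, -0.5, 1.0, -1.0, 0.0, 2.0]
prior = (2.0, 1.0)

# One pass over all data at once.
full = update(*prior, residuals)

# The same data streamed in batches of two arrays.
online = prior
for start in range(0, len(residuals), 2):
    online = update(*online, residuals[start:start + 2])

variance_estimate = posterior_mean_variance(*online)
```

In this simplified setting the streamed hyperparameters equal the single-pass ones, which is what lets a batch-wise scheme scale linearly with sample size while holding only one batch in memory.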
Affiliation(s)
- Leo Lahti
- Department of Veterinary Bioscience, University of Helsinki, Agnes Sjöbergin katu 2, PO Box 66, FI-00014 University of Helsinki, Finland.
7
Patrick E, Buckley M, Yang YH. Estimation of data-specific constitutive exons with RNA-Seq data. BMC Bioinformatics 2013; 14:31. [PMID: 23360225 PMCID: PMC3656776 DOI: 10.1186/1471-2105-14-31] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/25/2012] [Accepted: 01/13/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND: RNA-Seq has the potential to answer many diverse and interesting questions about the inner workings of cells. Estimating changes in the overall transcription of a gene is not straightforward. Changes in overall gene transcription can easily be confounded with changes in exon usage which alter the lengths of transcripts produced by a gene. Measuring the expression of constitutive exons (exons which are consistently conserved after splicing) offers an unbiased estimation of the overall transcription of a gene. RESULTS: We propose a clustering-based method, exClust, for estimating the exons that are consistently conserved after splicing in a given data set. These are considered as the exons which are "constitutive" in this data. The method utilises information from both annotation and the dataset of interest. The method is implemented in an openly available R function package, sydSeq. CONCLUSION: When used on two real datasets exClust includes more than three times as many reads as the standard UI method, and improves concordance with qRT-PCR data. When compared to other methods, our method is shown to produce robust estimates of overall gene transcription.
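A minimal sketch of the general idea (clustering exons by read density and summarising a gene from the high-density group) is given below. It is not exClust and does not use the sydSeq package: it assumes a plain one-dimensional two-means clustering of reads per base, and the exon names, counts and function names are hypothetical.

```python
def two_means(values, iterations=20):
    """1-D k-means with k=2, seeded at the min and max; returns (low, high) centres."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        if not high_group:  # all exons look alike; nothing to separate
            break
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low, high


def constitutive_exons(exons):
    """Names of exons whose read density falls in the high-density cluster."""
    density = {name: reads / length for name, (reads, length) in exons.items()}
    low, high = two_means(list(density.values()))
    return [
        name for name, d in density.items()
        if low == high or abs(d - low) > abs(d - high)
    ]


# Hypothetical gene: exon name -> (read count, exon length in bases).
gene = {
    "E1": (400, 200),
    "E2": (30, 150),
    "E3": (380, 200),
    "E4": (210, 100),
    "E5": (20, 100),
}
constitutive = constitutive_exons(gene)

# Gene-level estimate: reads per base over the selected exons only.
gene_level = (
    sum(gene[name][0] for name in constitutive)
    / sum(gene[name][1] for name in constitutive)
)
```

Restricting the summary to the high-density exons keeps rarely included exons (E2 and E5 in this toy gene) from diluting the gene-level estimate.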
Affiliation(s)
- Ellis Patrick
- School of Mathematics and Statistics, University of Sydney, Sydney NSW 2006, Australia
8
Gencheva M, Yang L, Lin GB, Lin RJ. Detection of Alternatively Spliced or Processed RNAs in Cancer Using Oligonucleotide Microarray. Cancer Treat Res 2013; 158:25-40. [PMID: 24222353 DOI: 10.1007/978-3-642-31659-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Deregulation of gene expression plays a pivotal role in tumorigenesis, so the ability to detect RNA alterations is of great value in cancer diagnosis and management. DNA microarrays have been used to measure changes in mRNA or microRNA level, but less often the change of RNA isoforms. Here we appraise the utilization of microarray in detecting alternatively processed RNAs, which have alternative splice forms, retained introns, or altered 3' untranslated regions. We cover the methodology and focus on cancer studies. Recent development in parallel or deep sequencing used in transcriptome analysis is also discussed.
Affiliation(s)
- Marieta Gencheva
- Department of Molecular Biology, Beckman Research Institute of the City of Hope, 1500 East Duarte Road, Duarte, CA 91010-3000, USA
9
Butte MJ, Lee SJ, Jesneck J, Keir ME, Haining WN, Sharpe AH. CD28 costimulation regulates genome-wide effects on alternative splicing. PLoS One 2012; 7:e40032. [PMID: 22768209 PMCID: PMC3386953 DOI: 10.1371/journal.pone.0040032] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Received: 01/30/2012] [Accepted: 06/03/2012] [Indexed: 12/31/2022]
Abstract
CD28 is the major costimulatory receptor required for activation of naïve T cells, yet CD28 costimulation affects the expression level of surprisingly few genes over those altered by TCR stimulation alone. Alternate splicing of genes adds diversity to the proteome and contributes to tissue-specific regulation of genes. Here we demonstrate that CD28 costimulation leads to major changes in alternative splicing during activation of naïve T cells, beyond the effects of TCR alone. CD28 costimulation affected many more genes through modulation of alternate splicing than by modulation of transcription. Different families of biological processes are over-represented among genes alternatively spliced in response to CD28 costimulation compared to those genes whose transcription is altered, suggesting that alternative splicing regulates distinct biological effects. Moreover, genes dependent upon hnRNPLL, a global regulator of splicing in activated T cells, were enriched in T cells activated through TCR plus CD28 as compared to TCR alone. We show that hnRNPLL expression is dependent on CD28 signaling, providing a mechanism by which CD28 can regulate splicing in T cells and insight into how hnRNPLL can influence signal-induced alternative splicing in T cells. The effects of CD28 on alternative splicing provide a newly appreciated means by which CD28 can regulate T cell responses.
Affiliation(s)
- Manish J. Butte
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Sun Jung Lee
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Jonathan Jesneck
- Department of Pediatric Oncology, Dana Farber Cancer Institute, Boston, Massachusetts, United States of America and Division of Pediatric Hematology/Oncology, Children’s Hospital, Boston, Massachusetts, United States of America
- Mary E. Keir
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- W. Nicholas Haining
- Department of Pediatric Oncology, Dana Farber Cancer Institute, Boston, Massachusetts, United States of America and Division of Pediatric Hematology/Oncology, Children’s Hospital, Boston, Massachusetts, United States of America
- Arlene H. Sharpe
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Pathology, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
10
Seok J, Xu W, Gao H, Davis RW, Xiao W. JETTA: junction and exon toolkits for transcriptome analysis. Bioinformatics 2012; 28:1274-5. [PMID: 22433281 DOI: 10.1093/bioinformatics/bts134] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
Abstract
SUMMARY: High-throughput genome-wide studies of alternatively spliced mRNA transcripts have become increasingly important in clinical research. Consequently, easy-to-use software tools are required to process data from these studies, for example, using exon and junction arrays. Here, we introduce JETTA, an integrated software package for the calculation of gene expression indices as well as the identification and visualization of alternative splicing events. We demonstrate the software using data of human liver and muscle samples hybridized on an exon-junction array. AVAILABILITY: JETTA and its demonstrations are freely available at http://igenomed.stanford.edu/~junhee/JETTA/index.html
Affiliation(s)
- Junhee Seok
- Stanford Genome Technology Center, 855 California Street, Palo Alto, CA 94304, USA
11
Ricci M, Xu Y, Hammond HL, Willoughby DA, Nathanson L, Rodriguez MM, Vatta M, Lipshultz SE, Lincoln J. Myocardial alternative RNA splicing and gene expression profiling in early stage hypoplastic left heart syndrome. PLoS One 2012; 7:e29784. [PMID: 22299024 PMCID: PMC3267718 DOI: 10.1371/journal.pone.0029784] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 09/07/2011] [Accepted: 12/05/2011] [Indexed: 12/22/2022]
Abstract
Hypoplastic Left Heart Syndrome (HLHS) is a congenital defect characterized by underdevelopment of the left ventricle and pathological compensation of the right ventricle. If untreated, HLHS is invariably lethal due to the extensive increase in right ventricular workload and eventual failure. Despite the clinical significance, little is known about the molecular pathobiological state of HLHS. Splicing of mRNA transcripts is an important regulatory mechanism of gene expression. Tissue-specific alterations of this process have been associated with several cardiac diseases; however, transcriptional signature profiles related to HLHS are unknown. In this study, we performed genome-wide exon array analysis to determine differentially expressed genes and alternatively spliced transcripts in the right ventricle (RV) of six neonates with HLHS, compared to the RV and left ventricle (LV) from non-diseased control subjects. In HLHS, over 180 genes were differentially expressed and 1800 were differentially spliced, leading to changes in a variety of biological processes involving cell metabolism, cytoskeleton, and cell adherence. Additional hierarchical clustering analysis revealed that differential gene expression and mRNA splicing patterns identified in HLHS are unique compared to non-diseased tissue. Our findings suggest that gene expression and mRNA splicing are broadly dysregulated in the RV myocardium of HLHS neonates. In addition, our analysis identified transcriptome profiles representative of molecular biomarkers of HLHS that could be used in the future for diagnostic and prognostic stratification to improve patient outcome.
Affiliation(s)
- Marco Ricci
- Division of Cardiothoracic Surgery, University of Miami Miller School of Medicine and Holtz Children's Hospital/Jackson Memorial Hospital, Miami, Florida, United States of America.
12
Mugal CF, Ellegren H. Substitution rate variation at human CpG sites correlates with non-CpG divergence, methylation level and GC content. Genome Biol 2011; 12:R58. [PMID: 21696599 PMCID: PMC3218846 DOI: 10.1186/gb-2011-12-6-r58] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Received: 02/06/2011] [Revised: 05/04/2011] [Accepted: 06/22/2011] [Indexed: 01/08/2023]
Abstract
BACKGROUND: A major goal in the study of molecular evolution is to unravel the mechanisms that induce variation in the germ line mutation rate and in the genome-wide mutation profile. The rate of germ line mutation is considerably higher for cytosines at CpG sites than for any other nucleotide in the human genome, an increase commonly attributed to cytosine methylation at CpG sites. The CpG mutation rate, however, is not uniform across the genome and, as methylation levels have recently been shown to vary throughout the genome, it has been hypothesized that methylation status may govern variation in the rate of CpG mutation. RESULTS: Here, we use genome-wide methylation data from human sperm cells to investigate the impact of DNA methylation on the CpG substitution rate in introns of human genes. We find that there is a significant correlation between the extent of methylation and the substitution rate at CpG sites. Further, we show that the CpG substitution rate is positively correlated with non-CpG divergence, suggesting susceptibility to factors responsible for the general mutation rate in the genome, and negatively correlated with GC content. We only observe a minor contribution of gene expression level, while recombination rate appears to have no significant effect. CONCLUSIONS: Our study provides the first direct empirical support for the hypothesis that variation in the level of germ line methylation contributes to substitution rate variation at CpG sites. Moreover, we show that other genomic features also impact on CpG substitution rate variation.
Affiliation(s)
- Carina F Mugal
- Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, Uppsala, Sweden
13
Abstract
Owing to the growing knowledge about the cellular molecular network and its alterations in diseases, most of the diseases become considered as "systems distortion of the cellular molecular network". This view of diseases, which we call "systems pathology", has brought about a new usage of the disease Omics, that is, to identify the altered molecular network underlying the disease. In this chapter, we discuss the technologies and clinical applications for Omics-based identification of pathophysiological process. In doing so, we classify the methods into two classes: one is a "data-inductive approach" which infers gene regulatory and transcriptional networks by gene expression data from DNA microarrays, and the other is a "knowledge-referenced approach" which combines the differentially expressed genes from gene expression profiles with existing protein interaction networks or literature-curated pathways. Several typical methods such as ARACNe and eQTL are described with their recent clinical applications.
Affiliation(s)
- Hiroshi Tanaka
- Department of Computational Biology, Graduate School of Biomedical Science, Tokyo Medical and Dental University, Tokyo, Japan.
14
Lahti L, Elo LL, Aittokallio T, Kaski S. Probabilistic analysis of probe reliability in differential gene expression studies with short oligonucleotide arrays. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:217-225. [PMID: 21071809 DOI: 10.1109/tcbb.2009.38] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 05/30/2023]
Abstract
Probe defects are a major source of noise in gene expression studies. While existing approaches detect noisy probes based on external information such as genomic alignments, we introduce and validate a targeted probabilistic method for analyzing probe reliability directly from expression data and independently of the noise source. This provides insights into the various sources of probe-level noise and gives tools to guide probe design.
Affiliation(s)
- Leo Lahti
- Helsinki Institute for Information Technology, Department of Information and Computer Science, Aalto University School of Science and Technology, PO Box 15400, FI-00076 Aalto, Finland.
15
Lu ZX, Jiang P, Cai JJ, Xing Y. Context-dependent robustness to 5' splice site polymorphisms in human populations. Hum Mol Genet 2010; 20:1084-96. [PMID: 21224255 DOI: 10.1093/hmg/ddq553] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 01/23/2023]
Abstract
There has been growing evidence for extensive diversity of alternative splicing in human populations. Genetic variants within the 5' splice site can cause splicing differences among human individuals and constitute an important class of human disease mutations. In this study, we explored whether natural variations of splicing could reveal important signals of 5' splice site recognition. In seven lymphoblastoid cell lines of Asian, European and African ancestry, we identified 1174 single nucleotide polymorphisms (SNPs) within the consensus 5' splice site. We selected 129 SNPs predicted to significantly alter the splice site activity, and quantitatively examined their splicing impact in the seven individuals. Surprisingly, outside of the essential GT dinucleotide position, only ∼14% of the tested SNPs altered splicing. Bioinformatic and minigene analyses identified signals that could modify the impact of 5' splice site polymorphisms, most notably a strong 3' splice site and the presence of intronic motifs downstream of the 5' splice site. Strikingly, we found that the poly-G run, a known intronic splicing enhancer, was the most significantly enriched motif downstream of exons unaffected by 5' splice site SNPs. In TRIM62, the upstream 3' splice site and downstream intronic poly-G runs functioned redundantly to protect an exon from its 5' splice site polymorphism. Collectively, our study reveals widespread context-dependent robustness to 5' splice site polymorphisms in human transcriptomes. Consequently, certain exons are more susceptible to 5' splice site mutations. Additionally, our work demonstrates that genetic diversity of alternative splicing can provide significant insights into the splicing code of mammalian cells.
Affiliation(s)
- Zhi-xiang Lu
- Department of Internal Medicine, University of Iowa, 3294 CBRB, 285 Newton Rd, Iowa City, IA 52242, USA
16
Langer W, Sohler F, Leder G, Beckmann G, Seidel H, Gröne J, Hummel M, Sommer A. Exon array analysis using re-defined probe sets results in reliable identification of alternatively spliced genes in non-small cell lung cancer. BMC Genomics 2010; 11:676. [PMID: 21118496 PMCID: PMC3053589 DOI: 10.1186/1471-2164-11-676] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 03/26/2010] [Accepted: 11/30/2010] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Treatment of non-small cell lung cancer with novel targeted therapies is a major unmet clinical need. Alternative splicing is a mechanism which generates diverse protein products and is of functional relevance in cancer. RESULTS: In this study, a genome-wide analysis of the alteration of splicing patterns between lung cancer and normal lung tissue was performed. We generated an exon array data set derived from matched pairs of lung cancer and normal lung tissue including both the adenocarcinoma and the squamous cell carcinoma subtypes. An enhanced workflow was developed to reliably detect differential splicing in an exon array data set. In total, 330 genes were found to be differentially spliced in non-small cell lung cancer compared to normal lung tissue. Microarray findings were validated with independent laboratory methods for CLSTN1, FN1, KIAA1217, MYO18A, NCOR2, NUMB, SLK, SYNE2, TPM1 (in total, 10 events) and ADD3, which was analysed in depth. We achieved a high validation rate of 69%. Evidence was found that the activity of FOX2, the splicing factor shown to cause cancer-specific splicing patterns in breast and ovarian cancer, is not altered at the transcript level in several cancer types including lung cancer. CONCLUSIONS: This study demonstrates how alternatively spliced genes can reliably be identified in a cancer data set. Our findings underline that key processes of cancer progression in NSCLC are affected by alternative splicing, which can be exploited in the search for novel targeted therapies.
Affiliation(s)
- Wolfram Langer
- Bayer Schering Pharma AG, Global Drug Discovery (GDD)-Target Discovery, Müllerstrasse 178, 13342 Berlin, Germany.
|
17
|
Anton MA, Aramburu A, Rubio A. Improvements to previous algorithms to predict gene structure and isoform concentrations using Affymetrix Exon arrays. BMC Bioinformatics 2010; 11:578. [PMID: 21110835 PMCID: PMC3012675 DOI: 10.1186/1471-2105-11-578] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/22/2010] [Accepted: 11/26/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Exon arrays provide a way to measure the expression of different isoforms of genes in an organism. Most of the procedures to deal with these arrays are focused on gene expression or on exon expression. Although the only biological analytes that can be properly assigned a concentration are transcripts, there are very few algorithms that focus on them. The reason is that previously developed summarization methods do not work well if applied to transcripts. In addition, gene structure prediction, i.e., the correspondence between probes and novel isoforms, is a field which is still unexplored. RESULTS We have modified and adapted a previous algorithm to take advantage of the special characteristics of the Affymetrix exon arrays. The structure and concentration of transcripts (some of them possibly unknown) in microarray experiments were predicted using this algorithm. Simulations showed that the suggested modifications improved both specificity (SP) and sensitivity (ST) of the predictions. The algorithm was also applied to different real datasets, showing its effectiveness and its concordance with PCR-validated results. CONCLUSIONS The proposed algorithm shows a substantial improvement in performance over the previous version. This improvement is mainly due to the exploitation of the redundancy of the Affymetrix exon arrays. An R package of SPACE with the updated algorithms has been developed and is freely available.
Affiliation(s)
- Miguel A Anton
- CEIT and TECNUN, University of Navarra, San Sebastián, Spain
|
18
|
Zhang YE, Vibranovski MD, Landback P, Marais GAB, Long M. Chromosomal redistribution of male-biased genes in mammalian evolution with two bursts of gene gain on the X chromosome. PLoS Biol 2010; 8. [PMID: 20957185 PMCID: PMC2950125 DOI: 10.1371/journal.pbio.1000494] [Citation(s) in RCA: 152] [Impact Index Per Article: 10.9] [Received: 03/15/2010] [Accepted: 08/16/2010] [Indexed: 01/20/2023]
Abstract
Mammalian X chromosomes evolved under various mechanisms including sexual antagonism, the faster-X process, and meiotic sex chromosome inactivation (MSCI). These forces may contribute to nonrandom chromosomal distribution of sex-biased genes. In order to understand the evolution of gene content on the X chromosome and autosome under these forces, we dated human and mouse protein-coding genes and miRNA genes on the vertebrate phylogenetic tree. We found that the X chromosome recently acquired a burst of young male-biased genes, which is consistent with fixation of recessive male-beneficial alleles by sexual antagonism. For genes originating earlier, however, this pattern diminishes and finally reverses with an overrepresentation of the oldest male-biased genes on autosomes. MSCI contributes to this dynamic since it silences X-linked old genes but not X-linked young genes. This demasculinization process seems to be associated with feminization of the X chromosome with more X-linked old genes expressed in ovaries. Moreover, we detected another burst of gene originations after the split of eutherian mammals and opossum, and these genes were quickly incorporated into transcriptional networks of multiple tissues. Preexisting X-linked genes also show significantly higher protein-level evolution during this period compared to autosomal genes, suggesting positive selection accompanied the early evolution of mammalian X chromosomes. These two findings cast new light on the evolutionary history of the mammalian X chromosome in terms of gene gain, sequence, and expressional evolution.
Affiliation(s)
- Yong E. Zhang
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
| | - Maria D. Vibranovski
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
| | - Patrick Landback
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
| | - Gabriel A. B. Marais
- Université de Lyon, Centre National de la Recherche Scientifique, Laboratoire de Biométrie et Biologie évolutive, Villeurbanne, France
| | - Manyuan Long
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
- * E-mail:
|
19
|
Liu S, Lin L, Jiang P, Wang D, Xing Y. A comparison of RNA-Seq and high-density exon array for detecting differential gene expression between closely related species. Nucleic Acids Res 2010; 39:578-88. [PMID: 20864445 PMCID: PMC3025565 DOI: 10.1093/nar/gkq817] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.0] [Indexed: 12/17/2022]
Abstract
RNA-Seq has emerged as a revolutionary technology for transcriptome analysis. In this article, we report a systematic comparison of RNA-Seq and high-density exon array for detecting differential gene expression between closely related species. On a panel of human/chimpanzee/rhesus cerebellum RNA samples previously examined by the high-density human exon junction array (HJAY) and real-time qPCR, we generated 48.68 million RNA-Seq reads. Our results indicate that RNA-Seq has significantly improved gene coverage and increased sensitivity for differentially expressed genes compared with the high-density HJAY array. Meanwhile, we observed a systematic increase in the RNA-Seq error rate for lowly expressed genes. Specifically, between-species DEGs detected by array/qPCR but missed by RNA-Seq were characterized by relatively low expression levels, as indicated by lower RNA-Seq read counts, lower HJAY array expression indices and higher qPCR raw cycle threshold values. Furthermore, this issue was not unique to between-species comparisons of gene expression. In the RNA-Seq analysis of MicroArray Quality Control human reference RNA samples with extensive qPCR data, we also observed an increase in both the false-negative rate and the false-positive rate for lowly expressed genes. These findings have important implications for the design and data interpretation of RNA-Seq studies on gene expression differences between and within species.
Affiliation(s)
- Song Liu
- Department of Biostatistics, Roswell Park Cancer Institute, The State University of New York at Buffalo, Buffalo, NY 14203, USA
|
20
|
Lin L, Shen S, Jiang P, Sato S, Davidson BL, Xing Y. Evolution of alternative splicing in primate brain transcriptomes. Hum Mol Genet 2010; 19:2958-73. [PMID: 20460271 DOI: 10.1093/hmg/ddq201] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 12/21/2022]
Abstract
Alternative splicing is a predominant form of gene regulation in higher eukaryotes. The evolution of alternative splicing provides an important mechanism for the acquisition of novel gene functions. In this work, we carried out a genome-wide phylogenetic survey of lineage-specific splicing patterns in the primate brain, via high-density exon junction array profiling of brain transcriptomes of humans, chimpanzees and rhesus macaques. We identified 509 genes showing splicing differences among these species. RT-PCR analysis of 40 exons confirmed the predicted splicing evolution of 33 exons. Of these 33 exons, outgroup analysis using rhesus macaques confirmed 13 exons with human-specific increase or decrease in transcript inclusion levels after humans diverged from chimpanzees. Some of the human-specific brain splicing patterns disrupt domains critical for protein-protein interactions, and some modulate translational efficiency of their host genes. Strikingly, for exons showing splicing differences across species, we observed a significant increase in the rate of silent substitutions within exons, coupled with accelerated sequence divergence in flanking introns. This indicates that evolution of cis-regulatory signals is a major contributor to the emergence of human-specific splicing patterns. In one gene (MAGOH), using minigene reporter assays, we demonstrated that the combination of two human-specific cis-sequence changes created its human-specific splicing pattern. Together, our data reveal widespread human-specific changes of alternative splicing in the brain and suggest an important role of splicing in the evolution of neuronal gene regulation and functions.
Affiliation(s)
- Lan Lin
- Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
|
21
|
Qu Y, He F, Chen Y. Different effects of the probe summarization algorithms PLIER and RMA on high-level analysis of Affymetrix exon arrays. BMC Bioinformatics 2010; 11:211. [PMID: 20426803 PMCID: PMC2873539 DOI: 10.1186/1471-2105-11-211] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 10/14/2009] [Accepted: 04/28/2010] [Indexed: 12/22/2022]
Abstract
BACKGROUND Alternative splicing is an important mechanism that increases protein diversity and functionality in higher eukaryotes. Affymetrix exon arrays are a commercialized platform used to detect alternative splicing on a genome-wide scale. Two probe summarization algorithms, PLIER (Probe Logarithmic Intensity Error) and RMA (Robust Multichip Average), are commonly used to compute gene-level and exon-level expression values. However, a systematic comparison of these two algorithms on their effects on high-level analysis of the arrays has not yet been reported. RESULTS In this study, we showed that PLIER summarization led to over-estimation of gene-level expression changes, relative to exon-level expression changes, in two-group comparisons. Consequently, it led to detection of substantially more skipped exons on up-regulated genes, as well as substantially more included (i.e., non-skipped) exons on down-regulated genes. In contrast, this bias was not observed for RMA-summarized data. By using a published human tissue dataset, we compared the tissue-specific expression and splicing detected by Affymetrix exon arrays with those detected based on expressed sequence databases. We found the tendency of PLIER was not supported by the expressed sequence data. CONCLUSION We showed that the tendency of PLIER in detection of alternative splicing is likely caused by a technical bias in the approach, rather than a biological bias. Moreover, we observed abnormal summarization results when using the PLIER algorithm, indicating that mathematical problems, such as numerical instability, may affect PLIER performance.
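The direction of the bias reported in this abstract can be illustrated with a splicing-index style calculation, in which an exon's change is judged relative to the change of its gene. The sketch below is only an illustration under stated assumptions, not the authors' analysis: the function name `splicing_index`, the toy log2 fold changes, and the 0.5 over-estimation are all hypothetical.

```python
def splicing_index(exon_log2fc, gene_log2fc):
    """Exon-level log2 fold change minus gene-level log2 fold change.

    Values below 0 make an exon look skipped in the second group;
    values above 0 make it look more included.
    """
    return [e - gene_log2fc for e in exon_log2fc]


# Hypothetical up-regulated gene: every exon rises by 1.0, so there is
# no real splicing change.
exon_log2fc = [1.0, 1.0, 1.0, 1.0]

# Gene-level summary that agrees with the exons.
unbiased = splicing_index(exon_log2fc, gene_log2fc=1.0)

# Assumed bias: gene-level change over-estimated by 0.5 log2 units.
biased = splicing_index(exon_log2fc, gene_log2fc=1.5)

print(unbiased)  # every exon at 0: no apparent splicing change
print(biased)    # every exon below 0: all exons look "skipped"
```

Applying the same over-estimation to a down-regulated gene (exons at -1.0, gene at -1.5) gives positive values, which matches the excess of apparently included exons on down-regulated genes described above.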
Affiliation(s)
- Yi Qu
- National Engineering Center for Biochip at Shanghai, Libing Rd, 151, Shanghai, 201203, China
|
22
|
Zhang Z, Gasser DL, Rappaport EF, Falk MJ. Cross-platform expression microarray performance in a mouse model of mitochondrial disease therapy. Mol Genet Metab 2010; 99:309-18. [PMID: 19944634 PMCID: PMC2824080 DOI: 10.1016/j.ymgme.2009.10.179] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 08/26/2009] [Revised: 10/22/2009] [Accepted: 10/22/2009] [Indexed: 11/25/2022]
Abstract
Microarray expression profiling has become a valuable tool in the evaluation of the genetic consequences of metabolic disease. Although 3'-biased gene expression microarray platforms were the first generation to have widespread availability, newer platforms are gradually emerging that have more up-to-date content and/or higher cost efficiency. Deciphering the relative strengths and weaknesses of these various platforms for metabolic pathway-level analyses can be daunting. We sought to determine the practical strengths and weaknesses of four leading commercially available expression array platforms relative to biologic investigations, as well as assess the feasibility of cross-platform data integration for purposes of biochemical pathway analyses. METHODS Liver RNA from B6.Alb/cre,Pdss2(loxP/loxP) mice having primary coenzyme Q deficiency was extracted either at baseline or following treatment with an antioxidant/antihyperlipidemic agent, probucol. Target RNA samples were prepared and hybridized to Affymetrix 430 2.0, Affymetrix Gene 1.0 ST, Affymetrix Exon 1.0 ST, and Illumina Mouse WG-6 expression arrays. Probes on all platforms were re-mapped to coding sequences in the current version of the mouse genome. Data processing and statistical analysis were performed by R/Bioconductor functions, and pathway analyses were carried out by KEGG Atlas and GSEA. RESULTS Expression measurements were generally consistent across platforms. However, intensive probe-level comparison suggested that differences in probe locations were a major source of inter-platform variance. In addition, genes expressed at low or intermediate levels had lower inter-platform reproducibility than highly expressed genes. All platforms showed similar patterns of differential expression between sample groups, with 'steroid biosynthesis' consistently identified as the most down-regulated metabolic pathway by probucol treatment. CONCLUSIONS This work offers a timely guide for metabolic disease investigators to enable informed end-user decisions regarding choice of expression microarray platform best-suited to specific research project goals. Successful cross-platform integration of biochemical pathway expression data is also demonstrated, especially for well-annotated and highly expressed genes. However, integration of gene-level expression data is limited by individual platform probe design and the expression level of target genes. Cross-platform analyses of biochemical pathway data will require additional data processing and novel computational bioinformatics tools to address unique statistical challenges.
Affiliation(s)
- Zhe Zhang
- Division of Biomedical Informatics, Department of Pediatrics, The Children's Hospital of Philadelphia and University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
| | - David L. Gasser
- Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
| | - Eric F. Rappaport
- Division of Biomedical Informatics, Department of Pediatrics, The Children's Hospital of Philadelphia and University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
| | - Marni J. Falk
- Division of Human Genetics, Department of Pediatrics, The Children's Hospital of Philadelphia and University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Corresponding Author: Marni J. Falk, MD, ARC 1002c, 3615 Civic Center Blvd, Philadelphia, PA 19104, office. 215-590-4564; fax 267-426-2876,
|
23
|
Lin CL, Evans V, Shen S, Xing Y, Richter JD. The nuclear experience of CPEB: implications for RNA processing and translational control. RNA (New York, N.Y.) 2010; 16:338-48. [PMID: 20040591 PMCID: PMC2811663 DOI: 10.1261/rna.1779810] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 06/16/2009] [Accepted: 10/29/2009] [Indexed: 05/20/2023]
Abstract
CPEB is a sequence-specific RNA binding protein that promotes polyadenylation-induced translation in early development, during cell cycle progression and cellular senescence, and following neuronal synapse stimulation. It controls polyadenylation and translation through other interacting molecules, most notably the poly(A) polymerase Gld2, the deadenylating enzyme PARN, and the eIF4E-binding protein Maskin. Here, we report that CPEB shuttles between the nucleus and cytoplasm and that its export occurs via the CRM1-dependent pathway. In the nucleus of Xenopus oocytes, CPEB associates with lampbrush chromosomes and several proteins involved in nuclear RNA processing. CPEB also interacts with Maskin in the nucleus as well as with CPE-containing mRNAs. Although the CPE does not regulate mRNA export, it influences the degree to which mRNAs are translationally repressed in the cytoplasm. Moreover, CPEB directly or indirectly mediates the alternative splicing of at least one pre-mRNA in mouse embryo fibroblasts as well as certain mouse tissues. We propose that CPEB, together with Maskin, binds mRNA in the nucleus to ensure tight translational repression upon export to the cytoplasm. In addition, we propose that nuclear CPEB regulates specific pre-mRNA alternative splicing.
Affiliation(s)
- Chien-Ling Lin
- University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
|
24
|
Mugal CF, Wolf JBW, von Grünberg HH, Ellegren H. Conservation of neutral substitution rate and substitutional asymmetries in mammalian genes. Genome Biol Evol 2010; 2:19-28. [PMID: 20333222 PMCID: PMC2839347 DOI: 10.1093/gbe/evp056] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Accepted: 12/22/2009] [Indexed: 12/21/2022]
Abstract
Local variation in neutral substitution rate across mammalian genomes is governed by several factors, including sequence context variables and structural variables. In addition, the interplay of replication and transcription, known to induce a strand bias in mutation rate, gives rise to variation in substitutional strand asymmetries. Here, we address the conservation of variation in mutation rate and substitutional strand asymmetries using primate- and rodent-specific repeat elements located within the introns of protein-coding genes. We find significant but weak conservation of local mutation rates between human and mouse orthologs. Likewise, substitutional strand asymmetries are conserved between human and mouse, where substitution rate asymmetries show a higher degree of conservation than mutation rate. Moreover, we provide evidence that replication and transcription are correlated to the strength of substitutional asymmetries. The effect of transcription is particularly visible for genes with highly conserved gene expression. In comparison with replication and transcription, mutation rate influences the strength of substitutional asymmetries only marginally.
Affiliation(s)
- C F Mugal
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden.
|
25
|
Abstract
Alternative splicing plays an important role in regulation of normal cellular function. Alternative splicing of pre-mRNA leads to the diversity of downstream protein products in the cell. The Affymetrix Exon arrays allow for a high throughput evaluation of the differences in spliced mRNA expressed in a biological system. In this study, we describe a method using this technology to study the generation of alternative mRNA transcripts in breast cancer cells that differ in the levels of a particular integrin, alpha3beta1.
|
26
|
Warzecha CC, Shen S, Xing Y, Carstens RP. The epithelial splicing factors ESRP1 and ESRP2 positively and negatively regulate diverse types of alternative splicing events. RNA Biol 2009; 6:546-62. [PMID: 19829082 DOI: 10.4161/rna.6.5.9606] [Citation(s) in RCA: 163] [Impact Index Per Article: 10.9] [Indexed: 12/16/2022]
Abstract
Cell-type and tissue-specific alternative splicing events are regulated by combinatorial control involving both abundant RNA binding proteins as well as those with more discrete expression and specialized functions. Epithelial Splicing Regulatory Proteins 1 and 2 (ESRP1 and ESRP2) are recently discovered epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the FGFR2, ENAH, CD44 and CTNND1 transcripts. To catalogue a larger set of splicing events under the regulation of the ESRPs we profiled splicing changes induced by RNA interference-mediated knockdown of ESRP1 and ESRP2 expression in a human epithelial cell line using the splicing sensitive Affymetrix Exon ST1.0 Arrays. Analysis of the microarray data resulted in the identification of over a hundred candidate ESRP regulated splicing events. We were able to independently validate 38 of these targets by RT-PCR. The ESRP regulated events encompass all known types of alternative splicing events, most prominent being alternative cassette exons and splicing events leading to alternative 3' terminal exons. Importantly, a number of these regulated splicing events occur in gene transcripts that encode proteins with well-described roles in the regulation of actin cytoskeleton organization, cell-cell adhesion, cell polarity and cell migration. In sum, this work reveals a novel list of transcripts differentially spliced in epithelial and mesenchymal cells, implying that coordinated alternative splicing plays a critical role in determination of cell type identity. These results further establish ESRP1 and ESRP2 as global regulators of an epithelial splicing regulatory network.
Affiliation(s)
- Claude C Warzecha
- Renal Division, Department of Medicine, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
|
27
|
Grigoryev YA, Kurian SM, Nakorchevskiy AA, Burke JP, Campbell D, Head SR, Deng J, Kantor AB, Yates JR, Salomon DR. Genome-wide analysis of immune activation in human T and B cells reveals distinct classes of alternatively spliced genes. PLoS One 2009; 4:e7906. [PMID: 19936255 PMCID: PMC2775942 DOI: 10.1371/journal.pone.0007906] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 07/03/2009] [Accepted: 10/17/2009] [Indexed: 12/22/2022]
Abstract
Alternative splicing of pre-mRNA is a mechanism that increases the protein diversity of a single gene by differential exon inclusion/exclusion during post-transcriptional processing. While alternative splicing is established to occur during lymphocyte activation, little is known about the role it plays during the immune response. Our study is among the first reports of a systematic genome-wide analysis of activated human T and B lymphocytes using whole exon DNA microarrays integrating alternative splicing and differential gene expression. Purified human CD2+ T or CD19+ B cells were activated using protocols to model the early events in post-transplant allograft immunity and sampled as a function of time during the process of immune activation. Here we show that 3 distinct classes of alternatively spliced and/or differentially expressed genes change in an ordered manner as a function of immune activation. We mapped our results to function-based canonical pathways and demonstrated that some are populated by only one class of genes, like integrin signaling, while other pathways, such as purine metabolism and T cell receptor signaling, are populated by all three classes of genes. Our studies augment the current view of T and B cell activation in immunity that has been based exclusively upon differential gene expression by providing evidence for a large number of molecular networks populated as a function of time and activation by alternatively spliced genes, many of which are constitutively expressed.
Affiliation(s)
- Yevgeniy A Grigoryev
- Department of Molecular & Experimental Medicine, The Scripps Research Institute, La Jolla, California, United States of America
|
28
|
Shen S, Warzecha CC, Carstens RP, Xing Y. MADS+: discovery of differential splicing events from Affymetrix exon junction array data. Bioinformatics 2009; 26:268-9. [PMID: 19933160 PMCID: PMC2804303 DOI: 10.1093/bioinformatics/btp643] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Abstract
Motivation: The Affymetrix Human Exon Junction Array is a newly designed high-density exon-sensitive microarray for global analysis of alternative splicing. Contrary to the Affymetrix exon 1.0 array, which only contains four probes per exon and no probes for exon–exon junctions, this new junction array averages eight probes per probeset targeting all exons and exon–exon junctions observed in the human mRNA/EST transcripts, representing a significant increase in the probe density for alternative splicing events. Here, we present MADS+, a computational pipeline to detect differential splicing events from the Affymetrix exon junction array data. For each alternative splicing event, MADS+ evaluates the signals of probes targeting competing transcript isoforms to identify exons or splice sites with different levels of transcript inclusion between two sample groups. MADS+ is used routinely in our analysis of Affymetrix exon junction arrays and has a high accuracy in detecting differential splicing events. For example, in a study of the novel epithelial-specific splicing regulator ESRP1, MADS+ detects hundreds of exons whose inclusion levels are dependent on ESRP1, with a RT-PCR validation rate of 88.5% (153 validated out of 173 tested). Availability: MADS+ scripts, documentation and annotation files are available at http://www.medicine.uiowa.edu/Labs/Xing/MADSplus/. Contact: yi-xing@uiowa.edu
Affiliation(s)
- Shihao Shen
- Department of Biostatistics, University of Iowa, Iowa City, IA, USA
|
29
|
Rasche A, Herwig R. ARH: predicting splice variants from genome-wide data with modified entropy. Bioinformatics 2009; 26:84-90. [PMID: 19889797 DOI: 10.1093/bioinformatics/btp626] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
MOTIVATION Exon arrays allow the quantitative study of alternative splicing (AS) on a genome-wide scale. A variety of splicing prediction methods has been proposed for Affymetrix exon arrays mainly focusing on geometric correlation measures or analysis of variance. In this article, we introduce an information theoretic concept that is based on modification of the well-known entropy function. RESULTS We have developed an AS robust prediction method based on entropy (ARH). We can show that this measure copes with bias inherent in the analysis of AS such as the dependency of prediction performance on the number of exons or variable exon expression. In order to judge the performance of ARH, we have compared it with eight existing splicing prediction methods using experimental benchmark data and demonstrate that ARH is a well-performing new method for the prediction of splice variants. AVAILABILITY AND IMPLEMENTATION ARH is implemented in R and provided in the Supplementary Material.
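The idea of scoring splicing with an entropy-based measure can be sketched as follows. This is a simplified illustration and should not be read as the published ARH formula, whose modification of the entropy function is not spelled out in the abstract. The function name `entropy_deficit_score`, the median centering, and the toy ratios are assumptions made for this example.

```python
import math
from statistics import median


def entropy_deficit_score(exon_log2_ratios):
    """Entropy-based splicing score for one gene (simplified illustration).

    Steps: center exon log2 ratios on their median, turn the absolute
    deviations into a probability distribution, and report how far its
    Shannon entropy falls below the maximum log2(m) for m exons.
    """
    m = len(exon_log2_ratios)
    if m < 2:
        raise ValueError("need at least two exons")
    center = median(exon_log2_ratios)
    weights = [2 ** abs(r - center) for r in exon_log2_ratios]
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log2(p) for p in probs)
    return math.log2(m) - entropy


# Toy genes (made-up values): all exons change together vs. one exon deviating.
uniform_gene = [0.2, 0.2, 0.2, 0.2, 0.2]
one_exon_changed = [0.2, 0.2, 0.2, 0.2, 3.0]

print(entropy_deficit_score(uniform_gene))      # close to 0
print(entropy_deficit_score(one_exon_changed))  # clearly above 0
```

When all exons move together the distribution is uniform and the score is near zero. A single deviating exon concentrates the probability mass and raises the score, which is the behavior an entropy-based splicing predictor relies on.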
Affiliation(s)
- Axel Rasche
- Department of Vertebrate Genomics, Max-Planck-Institute for Molecular Genetics, Ihnestr. 63-73, D-14195 Berlin, Germany.
|
30
|
Buza TJ, Kumar R, Gresham CR, Burgess SC, McCarthy FM. Facilitating functional annotation of chicken microarray data. BMC Bioinformatics 2009; 10 Suppl 11:S2. [PMID: 19811685 PMCID: PMC3226191 DOI: 10.1186/1471-2105-10-s11-s2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022]
Abstract
Background Modeling results from chicken microarray studies is challenging for researchers because little functional annotation is associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix is incomplete; for example, it does not show references linked to manually annotated functions. In addition, there is no tool that allows microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers a considerable amount of time in searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their dataset. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via the AgBase website and will be updated on a regular basis.
Affiliation(s)
- Teresia J Buza
- Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS 39762, USA.
|
31
|
Nicol JW, Helt GA, Blanchard SG, Raja A, Loraine AE. The Integrated Genome Browser: free software for distribution and exploration of genome-scale datasets. Bioinformatics 2009; 25:2730-1. [PMID: 19654113 PMCID: PMC2759552 DOI: 10.1093/bioinformatics/btp472] [Citation(s) in RCA: 476] [Impact Index Per Article: 31.7] [Indexed: 01/08/2023]
Abstract
Summary: Experimental techniques that survey an entire genome demand flexible, highly interactive visualization tools that can display new data alongside foundation datasets, such as reference gene annotations. The Integrated Genome Browser (IGB) aims to meet this need. IGB is an open source, desktop graphical display tool implemented in Java that supports real-time zooming and panning through a genome; layout of genomic features and datasets in moveable, adjustable tiers; incremental or genome-scale data loading from remote web servers or local files; and dynamic manipulation of quantitative data via genome graphs. Availability: The application and source code are available from http://igb.bioviz.org and http://genoviz.sourceforge.net. Contact:aloraine@uncc.edu
Affiliation(s)
- John W Nicol
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
|
32
|
Lin L, Liu S, Brockway H, Seok J, Jiang P, Wong WH, Xing Y. Using high-density exon arrays to profile gene expression in closely related species. Nucleic Acids Res 2009; 37:e90. [PMID: 19474342 PMCID: PMC2709591 DOI: 10.1093/nar/gkp420] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 01/22/2023]
Abstract
Global comparisons of gene expression profiles between species provide significant insight into gene regulation, evolutionary processes and disease mechanisms. In this work, we describe a flexible and intuitive approach for global expression profiling of closely related species, using high-density exon arrays designed for a single reference genome. The high-density probe coverage of exon arrays allows us to select identical sets of perfect-match probes to measure expression levels of orthologous genes. This eliminates a serious confounding factor in probe affinity effects of species-specific microarray probes, and enables direct comparisons of estimated expression indexes across species. Using a newly designed Affymetrix exon array, with eight probes per exon for approximately 315 000 exons in the human genome, we conducted expression profiling in corresponding tissues from humans, chimpanzees and rhesus macaques. Quantitative real-time PCR analysis of differentially expressed candidate genes is highly concordant with microarray data, yielding a validation rate of 21/22 for human versus chimpanzee differences, and 11/11 for human versus rhesus differences. This method has the potential to greatly facilitate biomedical and evolutionary studies of gene expression in nonhuman primates and can be easily extended to expression array design and comparative analysis of other animals and plants.
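The probe selection idea in this abstract (keeping only probes that match the orthologous sequence perfectly in every species) can be sketched in Python as follows. This is a simplified illustration, not the authors' pipeline: real exon array probes are 25-mers that hybridize to the complementary strand, and the function name and toy sequences here are assumptions.

```python
def shared_perfect_match_probes(probes, transcripts_by_species):
    """Keep probes whose sequence occurs exactly in every species' transcript."""
    kept = []
    for probe_id, seq in probes.items():
        if all(seq in transcript for transcript in transcripts_by_species.values()):
            kept.append(probe_id)
    return kept


# Toy data: short made-up sequences stand in for probes and orthologous transcripts.
probes = {"p1": "ACGTAC", "p2": "GGATCC", "p3": "TTGACA"}
transcripts = {
    "human": "AAACGTACGGATCCTTGACA",
    "chimp": "AAACGTACGGATCCTTGGCA",   # p3 target carries a substitution
    "rhesus": "AAACGTACGGTTCCTTGACA",  # p2 target carries a substitution
}
shared = shared_perfect_match_probes(probes, transcripts)
print(shared)
```

Only probes in the shared set would then be used to summarize expression, so that sequence divergence does not masquerade as an expression difference.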
Affiliation(s)
- Lan Lin
- Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
|
33
|
Lin L, Jiang P, Shen S, Sato S, Davidson BL, Xing Y. Large-scale analysis of exonized mammalian-wide interspersed repeats in primate genomes. Hum Mol Genet 2009; 18:2204-14. [PMID: 19324900 DOI: 10.1093/hmg/ddp152] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Abstract
Transposable elements (TEs) are major sources of new exons in higher eukaryotes. Almost half of the human genome is derived from TEs, and many types of TEs have the potential to exonize. In this work, we conducted a large-scale analysis of human exons derived from mammalian-wide interspersed repeats (MIRs), a class of old TEs which was active prior to the radiation of placental mammals. Using exon array data of 328 MIR-derived exons and RT-PCR analysis of 39 exons in 10 tissues, we identified 15 constitutively spliced MIR exons, and 15 MIR exons with tissue-specific shift in splicing patterns. Analysis of RNAs from multiple species suggests that the splicing events of many strongly included MIR exons have been established before the divergence of primates and rodents, while a small percentage result from recent exonization during primate evolution. Interestingly, exon array data suggest substantially higher splicing activities of MIR exons when compared with exons derived from Alu elements, a class of primate-specific retrotransposons. This appears to be a universal difference between exons derived from young and old TEs, as it is also observed when comparing Alu exons to exons derived from LINE1 and LINE2, two other groups of old TEs. Together, this study significantly expands current knowledge about exonization of TEs. Our data imply that with sufficient evolutionary time, numerous new exons could evolve beyond the evolutionary intermediate state and contribute functional novelties to modern mammalian genomes.
Affiliation(s)
- Lan Lin
- Department of Internal Medicine, University of Iowa, 3294 CBRB, 285 Newton Road, Iowa City, IA 52242, USA
|
34
|
Zheng H, Hang X, Zhu J, Qian M, Qu W, Zhang C, Deng M. REMAS: a new regression model to identify alternative splicing events from exon array data. BMC Bioinformatics 2009; 10 Suppl 1:S18. [PMID: 19208117 PMCID: PMC2648792 DOI: 10.1186/1471-2105-10-s1-s18] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/22/2022]
Abstract
Background Alternative splicing (AS) is an important regulatory mechanism for gene expression and protein diversity in eukaryotes. Previous studies have demonstrated that it can be causative for, or specific to, splicing-related diseases. Understanding the regulation of AS will be helpful for diagnostic efforts and drug discovery for those splicing-related diseases. As a novel exon-centric microarray platform, the exon array enables a comprehensive analysis of AS by investigating the expression of known and predicted exons. Identification of AS events from exon arrays has attracted much attention; however, new and powerful algorithms for exon array data analysis are still lacking. Results Here, we framed the identification of AS events as a variable selection problem and developed a regression method for AS detection (REMAS). First, features of alternatively spliced exons were scaled by reasonably defined variables. Second, we designed a hierarchical model that represents gene structure and transcriptional influence on exons, and lasso-type penalties were introduced in the calculation because of the large number of variables. Third, an iterative two-step algorithm was developed to select alternatively spliced genes and exons. To avoid negative effects introduced by small sample size, we ranked genes as parameters indicating their AS capabilities in an iterative manner. Both simulation and real data evaluation showed that REMAS could efficiently identify potential AS events, some of which had been validated by RT-PCR or supported by literature evidence. Conclusion As a new lasso regression algorithm based on a hierarchical model, REMAS has been demonstrated to be a reliable and effective method to identify AS events from exon array data.
Affiliation(s)
- Hao Zheng
- LMAM, School of Mathematical Sciences and Center for Theoretical Biology, Peking University, Beijing 100871, PR China.
|
35
|
Identifying differential exon splicing using linear models and correlation coefficients. BMC Bioinformatics 2009; 10:26. [PMID: 19154578 PMCID: PMC2636774 DOI: 10.1186/1471-2105-10-26] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 08/29/2008] [Accepted: 01/20/2009] [Indexed: 01/17/2023]
Abstract
Background With the availability of the Affymetrix exon arrays, a number of tools have been developed to enable the analysis. These, however, can be expensive or have several pre-installation requirements. This led us to develop an analysis workflow for analysing differential splicing using freely available software packages that are already widely used for gene expression analysis. The workflow uses the packages in the standard installation of R and Bioconductor (BiocLite) to identify differential splicing. We use the splice index method with the LIMMA framework. The main drawback of this approach is that it relies on accurate estimates of gene expression from the probe-level data. Methods such as RMA and PLIER may misestimate when a large proportion of exons are spliced. We therefore present the novel concept of a gene correlation coefficient calculated using only the probeset expression pattern within a gene. We show that genes with lower correlation coefficients are likely to be differentially spliced. Results The LIMMA approach was used to identify several tissue-specific transcripts and splicing events that are supported by previous experimental studies. Filtering the data is necessary, particularly removing exons and genes that are not expressed in all samples and cross-hybridising probesets, in order to reduce the false positive rate. The LIMMA approach ranked genes containing single or few differentially spliced exons much higher than genes containing several differentially spliced exons. On the other hand, we found the gene correlation coefficient approach better for identifying genes with a large number of differentially spliced exons. Conclusion We show that LIMMA can be used to identify differential exon splicing from Affymetrix exon array data. Though further work would be necessary to develop the use of correlation coefficients into a complete analysis approach, the preliminary results demonstrate their usefulness for identifying differentially spliced genes. The two approaches are complementary, as they can potentially identify different subsets of genes (single/few spliced exons vs. large transcript structure differences).
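The splice index mentioned in this abstract is commonly defined as the exon-level signal normalized by the gene-level signal, which on the log2 scale is a simple difference. The Python sketch below illustrates that quantity and its difference between two sample groups. It is not the authors' R/Bioconductor workflow and does not reproduce the LIMMA moderated statistics; the function names, group labels and toy values are assumptions.

```python
from statistics import mean


def splice_index(exon_log2, gene_log2):
    """Per-sample splice index: exon-level minus gene-level log2 expression."""
    return [e - g for e, g in zip(exon_log2, gene_log2)]


def splice_index_difference(exon_log2, gene_log2, groups):
    """Difference in mean splice index between samples labelled 'a' and 'b'."""
    si = splice_index(exon_log2, gene_log2)
    mean_a = mean(v for v, g in zip(si, groups) if g == "a")
    mean_b = mean(v for v, g in zip(si, groups) if g == "b")
    return mean_a - mean_b


# Toy log2 values for one exon and its gene across four samples (two per group).
exon = [8.0, 8.2, 6.1, 5.9]
gene = [9.0, 9.2, 9.1, 8.9]
groups = ["a", "a", "b", "b"]
diff = splice_index_difference(exon, gene, groups)
print(round(diff, 3))
```

Here the gene-level signal is stable across groups while the exon signal drops in group b, so the splice index difference is large. Because the index depends on the gene-level estimate, any bias in that estimate carries over, which is the drawback the abstract points out.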
|
36
|
A genome-scale analysis of the cis-regulatory circuitry underlying sonic hedgehog-mediated patterning of the mammalian limb. Genes Dev 2008; 22:2651-63. [PMID: 18832070 DOI: 10.1101/gad.1693008] [Citation(s) in RCA: 231] [Impact Index Per Article: 14.4] [Indexed: 01/12/2023]
Abstract
Sonic hedgehog (Shh) signals via Gli transcription factors to direct digit number and identity in the vertebrate limb. We characterized the Gli-dependent cis-regulatory network through a combination of whole-genome chromatin immunoprecipitation (ChIP)-on-chip and transcriptional profiling of the developing mouse limb. These analyses identified approximately 5000 high-quality Gli3-binding sites, including all known Gli-dependent enhancers. Discrete binding regions exhibit a higher-order clustering, highlighting the complexity of cis-regulatory interactions. Further, Gli3 binds inertly to previously identified neural-specific Gli enhancers, demonstrating the accessibility of their cis-regulatory elements. Intersection of DNA binding data with gene expression profiles predicted 205 putative limb target genes. A subset of putative cis-regulatory regions were analyzed in transgenic embryos, establishing Blimp1 as a direct Gli target and identifying Gli activator signaling in a direct, long-range regulation of the BMP antagonist Gremlin. In contrast, a long-range silencer cassette downstream from Hand2 likely mediates Gli3 repression in the anterior limb. These studies provide the first comprehensive characterization of the transcriptional output of a Shh-patterning process in the mammalian embryo and a framework for elaborating regulatory networks in the developing limb.
|
37
|
Abstract
Motivation: Microarray designs have become increasingly probe-rich, enabling targeting of specific features, such as individual exons or single nucleotide polymorphisms. These arrays have the potential to achieve quantitative high-throughput estimates of transcript abundances, but currently these estimates are affected by biases due to cross-hybridization, in which probes hybridize to off-target transcripts. Results: To study cross-hybridization, we map Affymetrix exon array probes to a set of annotated mRNA transcripts, allowing a small number of mismatches or insertion/deletions between the two sequences. Based on a systematic study of the degree to which probes with a given match type to a transcript are affected by cross-hybridization, we developed a strategy to correct for cross-hybridization biases of gene-level expression estimates. Comparison with Solexa ultra high-throughput sequencing data demonstrates that correction for cross-hybridization leads to a significant improvement of gene expression estimates. Availability: We provide mappings between human and mouse exon array probes and off-target transcripts and provide software extending the GeneBASE program for generating gene-level expression estimates including the cross-hybridization correction at http://biogibbs.stanford.edu/~kkapur/GeneBase/. Contact: whwong@stanford.edu Supplementary information: Supplementary data are available at Bioinformatics online.
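Mapping probes to transcripts while tolerating a few mismatches can be illustrated with a brute-force Python sketch. It handles substitutions only (the abstract also allows insertions/deletions), it is not the GeneBASE implementation, and the function names and toy sequences are assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(probe, transcript, max_mismatches=2):
    """Return (offset, mismatches) for each window within max_mismatches of the probe."""
    k = len(probe)
    hits = []
    for i in range(len(transcript) - k + 1):
        d = hamming(probe, transcript[i:i + k])
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


# Toy example: one perfect match and one single-mismatch site in the same transcript.
probe = "ACGTACGT"
transcript = "TTACGTACGTTTACGAACGTAA"
hits = find_near_matches(probe, transcript, max_mismatches=1)
print(hits)
```

A near match on a transcript other than the intended target marks the probe as a cross-hybridization candidate. A production mapping over all probes and transcripts would need an indexed aligner instead of this quadratic scan.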
Affiliation(s)
- Karen Kapur
- Department of Statistics, Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
|
38
|
Lin L, Shen S, Tye A, Cai JJ, Jiang P, Davidson BL, Xing Y. Diverse splicing patterns of exonized Alu elements in human tissues. PLoS Genet 2008; 4:e1000225. [PMID: 18841251 PMCID: PMC2562518 DOI: 10.1371/journal.pgen.1000225] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Received: 04/30/2008] [Accepted: 09/15/2008] [Indexed: 12/22/2022]
Abstract
Exonization of Alu elements is a major mechanism for birth of new exons in primate genomes. Prior analyses of expressed sequence tags show that almost all Alu-derived exons are alternatively spliced, and the vast majority of these exons have low transcript inclusion levels. In this work, we provide genomic and experimental evidence for diverse splicing patterns of exonized Alu elements in human tissues. Using exon array data of 330 Alu-derived exons in 11 human tissues and detailed RT-PCR analyses of 38 exons, we show that some Alu-derived exons are constitutively spliced in a broad range of human tissues, and some display a strong tissue-specific switch in their transcript inclusion levels. Most of such exons are derived from ancient Alu elements in the genome. In SEPN1, mutations of which are linked to a form of congenital muscular dystrophy, the muscle-specific inclusion of an Alu-derived exon may be important for regulating SEPN1 activity in muscle. Real-time qPCR analysis of this SEPN1 exon in macaque and chimpanzee tissues indicates a human-specific increase in its transcript inclusion level and muscle specificity after the divergence of humans and chimpanzees. Our results imply that some Alu exonization events may have acquired adaptive benefits during the evolution of primate transcriptomes.
New exons have been created and added to existing functional genes during eukaryotic genome evolution. Alu elements, a class of primate-specific retrotransposons, are a major source of new exons in primates. However, recent analyses of expressed sequence tags suggest that the vast majority of Alu-derived exons are low-abundance splice forms and represent non-functional evolutionary intermediates. In order to elucidate the evolutionary impact of Alu-derived exons, we investigated the splicing of 330 Alu-derived exons in 11 human tissues using data from high-density exon arrays with multiple oligonucleotide probes for every exon in the human genome. Our exon array analysis and further RT-PCR experiments reveal surprisingly diverse splicing patterns of these exons. Some Alu-derived exons are constitutively spliced, and some are strongly tissue-specific. In SEPN1, a gene implicated in a form of congenital muscular dystrophy, our data suggest that the muscle-specific inclusion of an Alu-derived exon results from a human-specific splicing change after the divergence of humans and chimpanzees. Our study provides novel insight into the evolutionary significance of Alu exonization events. A subset of Alu-derived exons, especially those derived from more ancient Alu elements in the genome, may have contributed to functional novelties during primate evolution.
Affiliation(s)
- Lan Lin
- Department of Internal Medicine, University of Iowa, Iowa City, Iowa, USA
|
39
|
Chang TY, Li YY, Jen CH, Yang TP, Lin CH, Hsu MT, Wang HW. easyExon--a Java-based GUI tool for processing and visualization of Affymetrix exon array data. BMC Bioinformatics 2008; 9:432. [PMID: 18851762 PMCID: PMC2579307 DOI: 10.1186/1471-2105-9-432] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 05/21/2008] [Accepted: 10/14/2008] [Indexed: 12/22/2022]
Abstract
Background Alternative RNA splicing greatly increases proteome diversity and thereby contributes to species- or tissue-specific functions. The possibility of studying alternative splicing (AS) events on a genomic scale using splicing-sensitive microarrays, including the Affymetrix GeneChip Exon 1.0 ST microarray (exon array), has appeared very recently. However, the application of this new technology is hindered by the lack of free and user-friendly software devoted to these novel platforms. Results In this study we present a Java-based freeware tool, easyExon, to process, filter and visualize exon array data with an analysis pipeline. This tool implements the most commonly used probeset summarization methods as well as AS-oriented filtering algorithms, e.g. MIDAS and PAC, for the detection of alternative splicing events. We include a biological filtering function based on GO terms, and provide a module to visualize and interpret the selected exons and transcripts. Furthermore, easyExon can integrate with other related programs, such as the Integrated Genome Browser (IGB) and Affymetrix Power Tools (APT), to make the whole analysis more comprehensive. We applied easyExon to a publicly accessible colon cancer dataset as an example to illustrate the analysis pipeline of this tool. Conclusion EasyExon can efficiently process and analyze Affymetrix exon array data. The simplicity, flexibility and brevity of easyExon make it a valuable tool for AS event identification in genomic research.
Affiliation(s)
- Ting-Yu Chang
- Institute of Microbiology and Immunology, National Yang-Ming University, and Department of Education and Research, Taipei City Hospital, Taipei, Taiwan.
|
40
|
Abstract
This review examines the extent to which transcriptomic methods have lived up to their promise in the context of nutrition research, placing particular emphasis on examples from micronutrient research. A case is made that the high quality platform technologies now available, together with established standards and systems for data storage and exchange and powerful new methods of data analysis, mean that microarrays have reached a level of technical maturity at which they can be exploited to their full potential. In the context of nutrition and micronutrient research, transcriptomic methods have already been widely applied, albeit primarily in studies using cell lines and animal models. Using this type of approach, a multitude of genes regulated at the mRNA level by dietary components has been identified and this, in turn, has provided new insights into the biological processes affected by nutritional parameters. Evidence from the very limited number of published transcriptomics-based nutritional studies performed in human volunteers suggests that, with appropriate study design, it is feasible to apply transcriptomic methods successfully in dietary intervention trials. On the other hand, gene expression-based biomarker development still poses a major challenge. Here the use of expression profile 'signatures', rather than single genes, may provide a solution. Approaches designed to identify such 'signatures' are being developed and tested widely, primarily in the context of medical research. The applicability and power of such approaches should also be evaluated in the context of nutrition.
|
41
|
Xing Y, Stoilov P, Kapur K, Han A, Jiang H, Shen S, Black DL, Wong WH. MADS: a new and improved method for analysis of differential alternative splicing by exon-tiling microarrays. RNA (NEW YORK, N.Y.) 2008; 14:1470-1479. [PMID: 18566192 PMCID: PMC2491471 DOI: 10.1261/rna.1070208] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Received: 03/11/2008] [Accepted: 05/09/2008] [Indexed: 05/26/2023]
Abstract
We describe a method, microarray analysis of differential splicing (MADS), for discovery of differential alternative splicing from exon-tiling microarray data. MADS incorporates a series of low-level analysis algorithms motivated by the "probe-rich" design of exon arrays, including background correction, iterative probe selection, and removal of sequence-specific cross-hybridization to off-target transcripts. We used MADS to analyze Affymetrix Exon 1.0 array data on a mouse neuroblastoma cell line after shRNA-mediated knockdown of the splicing factor polypyrimidine tract binding protein (PTB). From a list of exons with predetermined inclusion/exclusion profiles in response to PTB depletion, MADS recognized all exons known to have large changes in transcript inclusion levels and offered improvement over Affymetrix's analysis procedure. We also identified numerous novel PTB-dependent splicing events. Thirty novel events were tested by RT-PCR and 27 were confirmed. This work demonstrates that the exon-tiling microarray design is an efficient and powerful approach for global, unbiased analysis of pre-mRNA splicing.
Affiliation(s)
- Yi Xing
- Department of Internal Medicine, University of Iowa, Iowa City, Iowa 52242, USA.
|
42
|
Isolation and Transcriptional Profiling of Purified Hepatic Cells Derived from Human Embryonic Stem Cells. Stem Cells 2008; 26:2032-41. [DOI: 10.1634/stemcells.2007-0964] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Indexed: 11/17/2022]
|
43
|
Hu Z, Zimmermann BG, Zhou H, Wang J, Henson BS, Yu W, Elashoff D, Krupp G, Wong DT. Exon-level expression profiling: a comprehensive transcriptome analysis of oral fluids. Clin Chem 2008; 54:824-32. [PMID: 18356245 DOI: 10.1373/clinchem.2007.096164] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022]
Abstract
BACKGROUND The application of global gene expression profiling to saliva samples is hampered by the presence of partially fragmented and degraded RNAs that are difficult to amplify and detect with the prevailing technologies. Moreover, the often limited volume of saliva samples is a challenge to quantitative PCR (qPCR) validation of multiple candidates. The aim of this study was to provide proof-of-concept data on the combination of a universal mRNA-amplification method with exon arrays for candidate selection and a multiplex preamplification method for easy validation. METHODS We used a universal mRNA-specific linear-amplification strategy in combination with Affymetrix Exon Arrays to amplify salivary RNA from 18 healthy individuals on the nanogram scale. Multiple selected candidates were preamplified in one multiplex reverse transcription PCR reaction, cleaned up enzymatically, and validated by qPCR. RESULTS We defined a salivary exon core transcriptome (SECT) containing 851 transcripts of genes that have highly similar expression profiles in healthy individuals. A subset of the SECT transcripts was verified by qPCR analysis. Informatics analysis of the SECT revealed several functional clusters and sequence motifs. Sex-specific salivary exon biomarkers were identified and validated in tests with samples from healthy individuals. CONCLUSIONS It is feasible to use samples containing fragmented RNAs to conduct high-resolution expression profiling with coverage of the entire transcriptome and to validate multiple targets from limited amounts of sample.
Affiliation(s)
- Zhanzhi Hu
- Dental Research Institute, 73-017 Center for Health Sciences, University of California, Los Angeles, CA 90095-1668, USA
|
44
|
A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J Neurosci 2008; 28:264-78. [PMID: 18171944 DOI: 10.1523/jneurosci.4178-07.2008] [Citation(s) in RCA: 2359] [Impact Index Per Article: 147.4] [Indexed: 12/12/2022]
Abstract
Understanding the cell-cell interactions that control CNS development and function has long been limited by the lack of methods to cleanly separate neural cell types. Here we describe methods for the prospective isolation and purification of astrocytes, neurons, and oligodendrocytes from developing and mature mouse forebrain. We used FACS (fluorescent-activated cell sorting) to isolate astrocytes from transgenic mice that express enhanced green fluorescent protein (EGFP) under the control of an S100beta promoter. Using Affymetrix GeneChip Arrays, we then created a transcriptome database of the expression levels of >20,000 genes by gene profiling these three main CNS neural cell types at various postnatal ages between postnatal day 1 (P1) and P30. This database provides a detailed global characterization and comparison of the genes expressed by acutely isolated astrocytes, neurons, and oligodendrocytes. We found that Aldh1L1 is a highly specific antigenic marker for astrocytes with a substantially broader pattern of astrocyte expression than the traditional astrocyte marker GFAP. Astrocytes were enriched in specific metabolic and lipid synthetic pathways, as well as the draper/Megf10 and Mertk/integrin alpha(v)beta5 phagocytic pathways suggesting that astrocytes are professional phagocytes. Our findings call into question the concept of a "glial" cell class as the gene profiles of astrocytes and oligodendrocytes are as dissimilar to each other as they are to neurons. This transcriptome database of acutely isolated purified astrocytes, neurons, and oligodendrocytes provides a resource to the neuroscience community by providing improved cell-type-specific markers and for better understanding of neural development, function, and disease.
|
45
|
Abstract
Alternative mRNA splicing is a rich source of transcript diversity in eukaryotic cells with broad roles in development and disease. Systems-wide experimental methods have started to define how global splicing regulation shapes complex biological properties and pathways. Here, we review these approaches, describe recent insights they have yielded, and discuss avenues of future investigation.
Affiliation(s)
- Michael J Moore
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA
|
46
|
Robinson MD, Speed TP. A comparison of Affymetrix gene expression arrays. BMC Bioinformatics 2007; 8:449. [PMID: 18005448 PMCID: PMC2216046 DOI: 10.1186/1471-2105-8-449] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Received: 09/11/2007] [Accepted: 11/15/2007] [Indexed: 12/22/2022]
Abstract
Background Affymetrix GeneChips™ are an important tool in many facets of biological research. Recently, notable design changes to the chips have been made. In this study, we use publicly available data from Affymetrix to gauge the performance of three human gene expression arrays: Human Genome U133 Plus 2.0 (U133), Human Exon 1.0 ST (HuEx) and Human Gene 1.0 ST (HuGene). Results We studied probe-, exon- and gene-level reproducibility of technical and biological replicates from each of the 3 platforms. The U133 array has larger feature sizes, so it is no surprise that probe-level variances are smaller; however, the larger number of probes per gene on the HuGene array seems to produce gene-level summaries that have similar variances. The gene-level summaries of the HuEx array are less reproducible than the other two, despite having the largest average number of probes per gene. Greater than 80% of the content on the HuEx arrays is expressed at or near background. Biological variation seems to have a smaller effect on U133 data. Comparing the overlap of differentially expressed genes, we see a high overall concordance among all 3 platforms, with HuEx and HuGene having greater overlap, as expected given their design. We performed an analysis of detection rates and area under ROC curves using an experiment made up of several mixtures of 2 human tissues. Though it appears that the HuEx array has worse performance in terms of detection rates, all arrays have similar ability to separate differentially expressed and non-differentially expressed genes. Conclusion Despite noticeable differences in the probe-level reproducibility, gene-level reproducibility and differential expression detection are quite similar across the three platforms. The HuEx array, an all-encompassing array, has the flexibility of measuring all known or predicted exonic content. However, the HuEx array shows poorer reproducibility for genes with fewer exons. The HuGene array measures just the well-annotated genome content and appears to perform well. The U133 array, though not able to measure across the full length of a transcript, appears to perform as well as the newer designs on the set of genes common to all 3 platforms.
Affiliation(s)
- Mark D Robinson
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3050, Australia.
|
47
|
Kapur K, Xing Y, Ouyang Z, Wong WH. Exon arrays provide accurate assessments of gene expression. Genome Biol 2007; 8:R82. [PMID: 17504534 PMCID: PMC1929160 DOI: 10.1186/gb-2007-8-5-r82] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.5] [Received: 12/11/2006] [Revised: 04/02/2007] [Accepted: 05/15/2007] [Indexed: 11/10/2022]
Abstract
A strategy for estimating gene expression on Affymetrix exon arrays suggests that these arrays may provide more accurate measurements of gene expression than traditional 3’ arrays. We have developed a strategy for estimating gene expression on Affymetrix Exon arrays. The method includes a probe-specific background correction and a probe selection strategy in which a subset of probes with highly correlated intensities across multiple samples are chosen to summarize gene expression. Our results demonstrate that the proposed background model offers improvements over the default Affymetrix background correction and that Exon arrays may provide more accurate measurements of gene expression than traditional 3' arrays.
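The probe selection step described here (choosing probes whose intensities are highly correlated across samples, then summarizing the gene from that subset) can be sketched in Python. This is an assumed simplification, not the published algorithm: the median-correlation rule, the 0.8 threshold, the function names and the toy intensities are all illustrative choices.

```python
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def select_probes(intensities, min_median_r=0.8):
    """Keep probes whose median correlation with the other probes meets the cutoff."""
    selected = []
    for name, values in intensities.items():
        others = [v for n, v in intensities.items() if n != name]
        if median(pearson(values, o) for o in others) >= min_median_r:
            selected.append(name)
    return selected


def summarize_gene(intensities, selected):
    """Per-sample gene-level estimate: mean over the selected probes."""
    columns = zip(*(intensities[name] for name in selected))
    return [mean(col) for col in columns]


# Toy background-corrected log intensities for four probes across four samples.
intensities = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [2.0, 4.0, 6.0, 8.0],
    "p3": [1.1, 2.1, 2.9, 4.2],
    "p4": [3.0, 1.0, 4.0, 1.0],  # does not track the other probes
}
selected = select_probes(intensities)
gene_level = summarize_gene(intensities, selected)
print(selected)
```

The median is used so that one discordant probe does not pull down the score of the probes that agree with each other.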
Affiliation(s)
- Karen Kapur
- Department of Statistics, Stanford University, Stanford, California, 94305, USA
| | - Yi Xing
- Department of Statistics, Stanford University, Stanford, California, 94305, USA
- Department of Internal Medicine, Roy J and Lucille A Carver College of Medicine, University of Iowa, Iowa City, Iowa, 52242, USA
| | - Zhengqing Ouyang
- Department of Biological Sciences, Stanford University, Stanford, California, 94305, USA
| | - Wing Hung Wong
- Department of Statistics, Stanford University, Stanford, California, 94305, USA
|
48
|
Muller J, Mehlen A, Vetter G, Yatskou M, Muller A, Chalmel F, Poch O, Friederich E, Vallar L. Design and evaluation of Actichip, a thematic microarray for the study of the actin cytoskeleton. BMC Genomics 2007; 8:294. [PMID: 17727702 PMCID: PMC2077341 DOI: 10.1186/1471-2164-8-294] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 01/29/2007] [Accepted: 08/29/2007] [Indexed: 01/07/2023] Open
Abstract
Background: The actin cytoskeleton plays a crucial role in supporting and regulating numerous cellular processes. Mutations or alterations in the expression levels affecting the actin cytoskeleton system or related regulatory mechanisms are often associated with complex diseases such as cancer. Understanding how qualitative or quantitative changes in expression of the set of actin cytoskeleton genes are integrated to control actin dynamics and organisation is currently a challenge and should provide insights in identifying potential targets for drug discovery. Here we report the development of a dedicated microarray, the Actichip, containing 60-mer oligonucleotide probes for 327 genes selected for transcriptome analysis of the human actin cytoskeleton.
Results: Genomic data and sequence analysis features were retrieved from GenBank and stored in an integrative database called Actinome. From these data, probes were designed using a home-made program (CADO4MI) allowing sequence refinement and improved probe specificity by combining the complementary information recovered from the UniGene and RefSeq databases. Actichip performance was analysed by hybridisation with RNAs extracted from epithelial MCF-7 cells and human skeletal muscle. Using thoroughly standardised procedures, we obtained microarray images with excellent quality resulting in high data reproducibility. Actichip displayed a large dynamic range extending over three logs with a limit of sensitivity between one and ten copies of transcript per cell. The array allowed accurate detection of small changes in gene expression and reliable classification of samples based on the expression profiles of tissue-specific genes. When compared to two other oligonucleotide microarray platforms, Actichip showed similar sensitivity and concordant expression ratios. Moreover, Actichip was able to discriminate the highly similar actin isoforms whereas the two other platforms did not.
Conclusion: Our data demonstrate that Actichip is a powerful alternative to commercial high density microarrays for cytoskeleton gene profiling in normal or pathological samples. Actichip is available upon request.
Affiliation(s)
- Jean Muller
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
- Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire; Inserm, U596; CNRS, UMR7104, F-67400 Illkirch, Université Louis Pasteur, F-67000 Strasbourg, France
- Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany
| | - André Mehlen
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
| | - Guillaume Vetter
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
- Cytoskeleton and cell plasticity laboratory, Life Sciences RU, University of Luxembourg, 162a Avenue de la faïencerie, L-1511 Luxembourg, Luxembourg
| | - Mikalai Yatskou
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
| | - Arnaud Muller
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
| | - Frédéric Chalmel
- Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire; Inserm, U596; CNRS, UMR7104, F-67400 Illkirch, Université Louis Pasteur, F-67000 Strasbourg, France
- GERHM-Inserm U625, Université Rennes I, Campus de Beaulieu, Bt 13, Avenue du Général Leclerc, F-35042 Rennes cedex, France
| | - Olivier Poch
- Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire; Inserm, U596; CNRS, UMR7104, F-67400 Illkirch, Université Louis Pasteur, F-67000 Strasbourg, France
| | - Evelyne Friederich
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
- Cytoskeleton and cell plasticity laboratory, Life Sciences RU, University of Luxembourg, 162a Avenue de la faïencerie, L-1511 Luxembourg, Luxembourg
| | - Laurent Vallar
- Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, Centre de Recherche Public-Santé, 84 rue Val Fleuri, L-1526 Luxembourg, Luxembourg
|
49
|
Skotheim RI, Nees M. Alternative splicing in cancer: Noise, functional, or systematic? Int J Biochem Cell Biol 2007; 39:1432-49. [PMID: 17416541 DOI: 10.1016/j.biocel.2007.02.016] [Citation(s) in RCA: 157] [Impact Index Per Article: 9.2] [Received: 12/18/2006] [Revised: 02/13/2007] [Accepted: 02/22/2007] [Indexed: 12/22/2022]
Abstract
Pre-messenger RNA splicing is a fine-tuned process that generates multiple functional variants from individual genes. Various cell types and developmental stages regulate alternative splicing patterns differently in their generation of specific gene functions. In cancers, splicing is significantly altered, and understanding the underlying mechanisms and patterns in cancer will shed new light onto cancer biology. Cancer-specific transcript variants are promising biomarkers and targets for diagnostic, prognostic, and treatment purposes. In this review, we explore how alternative splicing cannot simply be considered as noise or an innocent bystander, but is actively regulated or deregulated in cancers. A special focus will be on aspects of cell biology and biochemistry of alternative splicing in cancer cells, addressing differences in splicing mechanisms between normal and malignant cells. The systems biology of splicing is only now being applied to the field of cancer research. We explore functional annotations for some of the most intensely spliced gene classes, and provide a literature mining and clustering that reflects the most intensely investigated genes. A few well-established cancer-specific splice events, such as the CD44 antigen, are used to illustrate the potential behind the exploration of the mechanisms of their regulation. Accordingly, we describe the functional connection between the regulatory machinery (i.e., the spliceosome and its accessory proteins) and its global impact on qualitative transcript variation, a connection that is only now emerging from the use of genomic technologies such as microarrays. These studies are expected to open an entirely new level of genetic information that is currently still poorly understood.
Affiliation(s)
- Rolf I Skotheim
- Department of Cancer Prevention, Institute for Cancer Research, Rikshospitalet-Radiumhospitalet Medical Center, Oslo, Norway
|
50
|
Du P, Kibbe WA, Lin SM. nuID: a universal naming scheme of oligonucleotides for Illumina, Affymetrix, and other microarrays. Biol Direct 2007; 2:16. [PMID: 17540033 PMCID: PMC1891274 DOI: 10.1186/1745-6150-2-16] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.9] [Received: 05/02/2007] [Accepted: 05/31/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Oligonucleotide probes that are sequence identical may have different identifiers between manufacturers and even between different versions of the same company's microarray; and sometimes the same identifier is reused and represents a completely different oligonucleotide, resulting in ambiguity and potential misidentification of the genes hybridizing to that probe.
RESULTS: We have devised a unique, non-degenerate encoding scheme that can be used as a universal representation to identify an oligonucleotide across manufacturers. We have named the encoded representation 'nuID', for nucleotide universal identifier. Inspired by the fact that the raw sequence of the oligonucleotide is the true definition of identity for a probe, the encoding algorithm uniquely and non-degenerately transforms the sequence itself into a compact identifier (a lossless compression). In addition, we added a redundancy check (checksum) to validate the integrity of the identifier. These two steps, encoding plus checksum, result in an nuID, which is a unique, non-degenerate, permanent, robust and efficient representation of the probe sequence. For commercial applications that require the sequence identity to be confidential, we have an encryption schema for nuID. We demonstrate the utility of nuIDs for the annotation of Illumina microarrays, and we believe it has universal applicability as a source-independent naming convention for oligomers.
REVIEWERS: This article was reviewed by Itai Yanai, Rong Chen (nominated by Mark Gerstein), and Gregory Schuler (nominated by David Lipman).
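The two steps named in the abstract above, a lossless sequence encoding followed by a checksum, can be sketched in Python as follows. This is an illustrative sketch of the general idea only and should not be read as the published nuID specification: the 64-character alphabet, the 2-bits-per-base packing into triplets, the modulo-21 checksum, and the function names `encode` and `decode` are all assumptions made for this example.

```python
# Assumed 64-symbol alphabet; each symbol carries 6 bits (three nucleotides).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_."
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode(sequence):
    """Pack a DNA sequence into a compact identifier with a leading check character."""
    sequence = sequence.upper()
    pad = (-len(sequence)) % 3          # bases needed to complete the last triplet
    padded = sequence + "A" * pad
    codes = []
    for i in range(0, len(padded), 3):
        a, b, c = (BASE_TO_BITS[ch] for ch in padded[i:i + 3])
        codes.append(a * 16 + b * 4 + c)  # three 2-bit bases -> one 6-bit value
    checksum = sum(codes) % 21
    head = checksum * 3 + pad             # at most 20 * 3 + 2 = 62, fits the alphabet
    return ALPHABET[head] + "".join(ALPHABET[v] for v in codes)


def decode(identifier):
    """Recover the original sequence, raising ValueError if the checksum fails."""
    checksum, pad = divmod(ALPHABET.index(identifier[0]), 3)
    codes = [ALPHABET.index(ch) for ch in identifier[1:]]
    if sum(codes) % 21 != checksum:
        raise ValueError("checksum mismatch: identifier is corrupted")
    bases = "".join(
        BITS_TO_BASE[v >> 4] + BITS_TO_BASE[(v >> 2) & 3] + BITS_TO_BASE[v & 3]
        for v in codes
    )
    return bases[:len(bases) - pad] if pad else bases
```

Because the identifier is derived from the sequence alone, two probes with the same sequence receive the same identifier regardless of vendor, and `decode(encode(seq))` returns the original sequence. A simple additive checksum like the one assumed here catches many single-character corruptions but not transpositions of characters.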
Affiliation(s)
- Pan Du
- Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago, IL, 60611, USA
| | - Warren A Kibbe
- Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago, IL, 60611, USA
| | - Simon M Lin
- Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago, IL, 60611, USA
|