1
|
Sotillo E, Barrett DM, Black KL, Bagashev A, Oldridge D, Wu G, Sussman R, Lanauze C, Ruella M, Gazzara MR, Martinez NM, Harrington CT, Chung EY, Perazzelli J, Hofmann TJ, Maude SL, Raman P, Barrera A, Gill S, Lacey SF, Melenhorst JJ, Allman D, Jacoby E, Fry T, Mackall C, Barash Y, Lynch KW, Maris JM, Grupp SA, Thomas-Tikhonenko A. Convergence of Acquired Mutations and Alternative Splicing of CD19 Enables Resistance to CART-19 Immunotherapy. Cancer Discov 2015; 5:1282-95. [PMID: 26516065 PMCID: PMC4670800 DOI: 10.1158/2159-8290.cd-15-1020] [Citation(s) in RCA: 961] [Impact Index Per Article: 96.1] [Received: 08/20/2015] [Accepted: 10/01/2015] [Indexed: 01/20/2023]
Abstract
The CD19 antigen, expressed on most B-cell acute lymphoblastic leukemias (B-ALL), can be targeted with chimeric antigen receptor-armed T cells (CART-19), but relapses with epitope loss occur in 10% to 20% of pediatric responders. We detected hemizygous deletions spanning the CD19 locus and de novo frameshift and missense mutations in exon 2 of CD19 in some relapse samples. However, we also discovered alternatively spliced CD19 mRNA species, including one lacking exon 2. Pull-down/siRNA experiments identified SRSF3 as a splicing factor involved in exon 2 retention, and its levels were lower in relapsed B-ALL. Using genome editing, we demonstrated that exon 2 skipping bypasses exon 2 mutations in B-ALL cells and allows expression of the N-terminally truncated CD19 variant, which fails to trigger killing by CART-19 but partly rescues defects associated with CD19 loss. Thus, this mechanism of resistance is based on a combination of deleterious mutations and ensuing selection for alternatively spliced RNA isoforms. Significance: CART-19 yield 70% response rates in patients with B-ALL, but also produce escape variants. We discovered that the underlying mechanism is the selection for preexisting alternatively spliced CD19 isoforms with the compromised CART-19 epitope. This mechanism suggests a possibility of targeting alternative CD19 ectodomains, which could improve survival of patients with B-cell neoplasms.
|
Research Support, N.I.H., Extramural |
10 |
961 |
2
|
Xiong HY, Alipanahi B, Lee LJ, Bretschneider H, Merico D, Yuen RKC, Hua Y, Gueroussov S, Najafabadi HS, Hughes TR, Morris Q, Barash Y, Krainer AR, Jojic N, Scherer SW, Blencowe BJ, Frey BJ. RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease. Science 2015; 347:1254806. [PMID: 25525159 PMCID: PMC4362528 DOI: 10.1126/science.1254806] [Citation(s) in RCA: 814] [Impact Index Per Article: 81.4] [Indexed: 12/12/2022]
Abstract
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.
|
Research Support, N.I.H., Extramural |
10 |
814 |
3
|
Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, Blencowe BJ, Frey BJ. Deciphering the splicing code. Nature 2010; 465:53-9. [PMID: 20445623 DOI: 10.1038/nature09000] [Citation(s) in RCA: 639] [Impact Index Per Article: 42.6] [Received: 12/09/2009] [Accepted: 03/09/2010] [Indexed: 12/16/2022]
Abstract
Alternative splicing has a crucial role in the generation of biological complexity, and its misregulation is often involved in human disease. Here we describe the assembly of a 'splicing code', which uses combinations of hundreds of RNA features to predict tissue-dependent changes in alternative splicing for thousands of exons. The code determines new classes of splicing patterns, identifies distinct regulatory programs in different tissues, and identifies mutation-verified regulatory sequences. Widespread regulatory strategies are revealed, including the use of unexpectedly large combinations of features, the establishment of low exon inclusion levels that are overcome by features in specific tissues, the appearance of features deeper into introns than previously appreciated, and the modulation of splice variant levels by transcript structure characteristics. The code detected a class of exons whose inclusion silences expression in adult tissues by activating nonsense-mediated messenger RNA decay, but whose exclusion promotes expression during embryogenesis. The code facilitates the discovery and detailed characterization of regulated alternative splicing events on a genome-wide scale.
|
Research Support, Non-U.S. Gov't |
15 |
639 |
4
|
Vaquero-Garcia J, Barrera A, Gazzara MR, González-Vallinas J, Lahens NF, Hogenesch JB, Lynch KW, Barash Y. A new view of transcriptome complexity and regulation through the lens of local splicing variations. eLife 2016; 5:e11752. [PMID: 26829591 PMCID: PMC4801060 DOI: 10.7554/elife.11752] [Citation(s) in RCA: 295] [Impact Index Per Article: 32.8] [Received: 09/21/2015] [Accepted: 01/31/2016] [Indexed: 12/29/2022] Open
Abstract
Alternative splicing (AS) can critically affect gene function and disease, yet mapping splicing variations remains a challenge. Here, we propose a new approach to define and quantify mRNA splicing in units of local splicing variations (LSVs). LSVs capture previously defined types of alternative splicing as well as more complex transcript variations. Building the first genome wide map of LSVs from twelve mouse tissues, we find complex LSVs constitute over 30% of tissue dependent transcript variations and affect specific protein families. We show the prevalence of complex LSVs is conserved in humans and identify hundreds of LSVs that are specific to brain subregions or altered in Alzheimer's patients. Amongst those are novel isoforms in the Camk2 family and a novel poison exon in Ptbp1, a key splice factor in neurogenesis. We anticipate the approach presented here will advance the ability to relate tissue-specific splice variation to genetic variation, phenotype, and disease.
|
Research Support, Non-U.S. Gov't |
9 |
295 |
5
|
Marion RM, Regev A, Segal E, Barash Y, Koller D, Friedman N, O'Shea EK. Sfp1 is a stress- and nutrient-sensitive regulator of ribosomal protein gene expression. Proc Natl Acad Sci U S A 2004; 101:14315-22. [PMID: 15353587 PMCID: PMC521938 DOI: 10.1073/pnas.0405353101] [Citation(s) in RCA: 288] [Impact Index Per Article: 13.7] [Indexed: 01/08/2023] Open
Abstract
Yeast cells modulate their protein synthesis capacity in response to physiological needs through the transcriptional control of ribosomal protein (RP) genes. Here we demonstrate that the transcription factor Sfp1, previously shown to play a role in the control of cell size, regulates RP gene expression in response to nutrients and stress. Under optimal growth conditions, Sfp1 is localized to the nucleus, bound to the promoters of RP genes, and helps promote RP gene expression. In response to inhibition of target of rapamycin (TOR) signaling, stress, or changes in nutrient availability, Sfp1 is released from RP gene promoters and leaves the nucleus, and RP gene transcription is down-regulated. Additionally, cells lacking Sfp1 fail to appropriately modulate RP gene expression in response to environmental cues. We conclude that Sfp1 integrates information from nutrient- and stress-responsive signaling pathways to help control RP gene expression.
|
Research Support, U.S. Gov't, P.H.S. |
21 |
288 |
6
|
Fagnani M, Barash Y, Ip JY, Misquitta C, Pan Q, Saltzman AL, Shai O, Lee L, Rozenhek A, Mohammad N, Willaime-Morawek S, Babak T, Zhang W, Hughes TR, van der Kooy D, Frey BJ, Blencowe BJ. Functional coordination of alternative splicing in the mammalian central nervous system. Genome Biol 2008; 8:R108. [PMID: 17565696 PMCID: PMC2394768 DOI: 10.1186/gb-2007-8-6-r108] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Received: 10/18/2006] [Revised: 01/22/2007] [Accepted: 06/12/2007] [Indexed: 12/16/2022] Open
Abstract
A microarray analysis provides new evidence suggesting that specific cellular processes in the mammalian CNS are coordinated at the level of alternative splicing, and that a complex splicing code underlies CNS-specific alternative splicing regulation. Background: Alternative splicing (AS) functions to expand proteomic complexity and plays numerous important roles in gene regulation. However, the extent to which AS coordinates functions in a cell and tissue type specific manner is not known. Moreover, the sequence code that underlies cell and tissue type specific regulation of AS is poorly understood. Results: Using quantitative AS microarray profiling, we have identified a large number of widely expressed mouse genes that contain single or coordinated pairs of alternative exons that are spliced in a tissue regulated fashion. The majority of these AS events display differential regulation in central nervous system (CNS) tissues. Approximately half of the corresponding genes have neural specific functions and operate in common processes and interconnected pathways. Differential regulation of AS in the CNS tissues correlates strongly with a set of mostly new motifs that are predominantly located in the intron and constitutive exon sequences neighboring CNS-regulated alternative exons. Different subsets of these motifs are correlated with either increased inclusion or increased exclusion of alternative exons in CNS tissues, relative to the other profiled tissues. Conclusion: Our findings provide new evidence that specific cellular processes in the mammalian CNS are coordinated at the level of AS, and that a complex splicing code underlies CNS specific AS regulation. This code appears to comprise many new motifs, some of which are located in the constitutive exons neighboring regulated alternative exons. These data provide a basis for understanding the molecular mechanisms by which the tissue specific functions of widely expressed genes are coordinated at the level of AS.
|
Research Support, Non-U.S. Gov't |
17 |
84 |
7
|
Thompson FL, Barash Y, Sawabe T, Sharon G, Swings J, Rosenberg E. Thalassomonas loyana sp. nov., a causative agent of the white plague-like disease of corals on the Eilat coral reef. Int J Syst Evol Microbiol 2006; 56:365-368. [PMID: 16449441 DOI: 10.1099/ijs.0.63800-0] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.4] [Indexed: 11/18/2022] Open
Abstract
The taxonomic position of the coral pathogen strain CBMAI 722T was determined on the basis of molecular and phenotypic data. We clearly show that the novel isolate CBMAI 722T is a member of the family Colwelliaceae, with Thalassomonas ganghwensis as the nearest neighbour (95 % 16S rRNA gene sequence similarity). CBMAI 722T can be differentiated from its nearest neighbour on the basis of phenotypic and chemotaxonomic features, including the utilization of cellobiose and L-arginine, the production of alginase and amylase, but not oxidase, and the presence of the fatty acids 12:0 3-OH and 14:0, but not 10:0 or 15:0. The DNA G+C content of CBMAI 722T is 39.3 mol%. We conclude that this strain represents a novel species for which we propose the name Thalassomonas loyana sp. nov., with the type strain CBMAI 722T (=LMG 22536T). This is the first report of the involvement of a member of the family Colwelliaceae in coral white plague-like disease.
|
Research Support, Non-U.S. Gov't |
19 |
83 |
8
|
Aznarez I, Barash Y, Shai O, He D, Zielenski J, Tsui LC, Parkinson J, Frey BJ, Rommens JM, Blencowe BJ. A systematic analysis of intronic sequences downstream of 5' splice sites reveals a widespread role for U-rich motifs and TIA1/TIAL1 proteins in alternative splicing regulation. Genome Res 2008; 18:1247-58. [PMID: 18456862 DOI: 10.1101/gr.073155.107] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.9] [Indexed: 01/04/2023]
Abstract
To identify human intronic sequences associated with 5' splice site recognition, we performed a systematic search for motifs enriched in introns downstream of both constitutive and alternative cassette exons. Significant enrichment was observed for U-rich motifs within 100 nucleotides downstream of 5' splice sites of both classes of exons, with the highest enrichment between positions +6 and +30. Exons adjacent to U-rich intronic motifs contain lower frequencies of exonic splicing enhancers and higher frequencies of exonic splicing silencers, compared with exons not followed by U-rich intronic motifs. These findings motivated us to explore the possibility of a widespread role for U-rich motifs in promoting exon inclusion. Since cytotoxic granule-associated RNA binding protein (TIA1) and TIA1-like 1 (TIAL1; also known as TIAR) were previously shown in vitro to bind to U-rich motifs downstream of 5' splice sites, and to facilitate 5' splice site recognition in vitro and in vivo, we investigated whether these factors function more generally in the regulation of splicing of exons followed by U-rich intronic motifs. Simultaneous knockdown of TIA1 and TIAL1 resulted in increased skipping of 36/41 (88%) of alternatively spliced exons associated with U-rich motifs, but did not affect 32/33 (97%) alternatively spliced exons that are not associated with U-rich motifs. The increase in exon skipping correlated with the proximity of the first U-rich motif and the overall "U-richness" of the adjacent intronic region. The majority of the alternative splicing events regulated by TIA1/TIAL1 are conserved in mouse, and the corresponding genes are associated with diverse cellular functions. Based on our results, we estimate that approximately 15% of alternative cassette exons are regulated by TIA1/TIAL1 via U-rich intronic elements.
|
Research Support, Non-U.S. Gov't |
17 |
83 |
9
|
Asnani M, Hayer KE, Naqvi AS, Zheng S, Yang SY, Oldridge D, Ibrahim F, Maragkakis M, Gazzara MR, Black KL, Bagashev A, Taylor D, Mourelatos Z, Grupp SA, Barrett D, Maris JM, Sotillo E, Barash Y, Thomas-Tikhonenko A. Retention of CD19 intron 2 contributes to CART-19 resistance in leukemias with subclonal frameshift mutations in CD19. Leukemia 2020; 34:1202-1207. [PMID: 31591467 PMCID: PMC7214268 DOI: 10.1038/s41375-019-0580-z] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 05/16/2019] [Revised: 09/04/2019] [Accepted: 09/17/2019] [Indexed: 02/03/2023]
|
Letter |
5 |
72 |
10
|
Barash Y, Dehan E, Krupsky M, Franklin W, Geraci M, Friedman N, Kaminski N. Comparative analysis of algorithms for signal quantitation from oligonucleotide microarrays. Bioinformatics 2004; 20:839-46. [PMID: 14751998 DOI: 10.1093/bioinformatics/btg487] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open
Abstract
Motivation: Recent years' exponential increase in DNA microarrays experiments has motivated the development of many signal quantitation (SQ) algorithms. These algorithms perform various transformations on the actual measurements aimed to enable researchers to compare readings of different genes quantitatively within one experiment and across separate experiments. However, it is relatively unclear whether there is a 'best' algorithm to quantitate microarray data. The ability to compare and assess such algorithms is crucial for any downstream analysis. In this work, we suggest a methodology for comparing different signal quantitation algorithms for gene expression data. Our aim is to enable researchers to compare the effect of different SQ algorithms on the specific dataset they are dealing with. We combine two kinds of tests to assess the effect of an SQ algorithm in terms of signal to noise ratio. To assess noise, we exploit redundancy within the experimental dataset to test the variability of a given SQ algorithm output. For the effect of the SQ on the signal we evaluate the overabundance of differentially expressed genes using various statistical significance tests. Results: We demonstrate our analysis approach with three SQ algorithms for oligonucleotide microarrays. We compare the results of using the dChip software and the RMAExpress software to the ones obtained by using the standard Affymetrix MAS5 on a dataset containing pairs of repeated hybridizations. Our analysis suggests that dChip is more robust and stable than the MAS5 tools for about 60% of the genes while RMAExpress is able to achieve an even greater improvement in terms of signal to noise, for more than 95% of the genes.
|
|
21 |
71 |
11
|
Lee DSM, Ghanem LR, Barash Y. Integrative analysis reveals RNA G-quadruplexes in UTRs are selectively constrained and enriched for functional associations. Nat Commun 2020; 11:527. [PMID: 31988292 PMCID: PMC6985247 DOI: 10.1038/s41467-020-14404-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 06/07/2019] [Accepted: 01/03/2020] [Indexed: 11/17/2022] Open
Abstract
G-quadruplex (G4) sequences are abundant in untranslated regions (UTRs) of human messenger RNAs, but their functional importance remains unclear. By integrating multiple sources of genetic and genomic data, we show that putative G-quadruplex forming sequences (pG4) in 5' and 3' UTRs are selectively constrained, and enriched for cis-eQTLs and RNA-binding protein (RBP) interactions. Using over 15,000 whole-genome sequences, we find that negative selection acting on central guanines of UTR pG4s is comparable to that of missense variation in protein-coding sequences. At multiple GWAS-implicated SNPs within pG4 UTR sequences, we find robust allelic imbalance in gene expression across diverse tissue contexts in GTEx, suggesting that variants affecting G-quadruplex formation within UTRs may also contribute to phenotypic variation. Our results establish UTR G4s as important cis-regulatory elements and point to a link between disruption of UTR pG4 and disease.
|
research-article |
5 |
56 |
12
|
Abstract
The recent growth in genomic data and measurements of genome-wide expression patterns allows us to apply computational tools to examine gene regulation by transcription factors. In this work, we present a class of mathematical models that help in understanding the connections between transcription factors and functional classes of genes based on genetic and genomic data. Such a model represents the joint distribution of transcription factor binding sites and of expression levels of a gene in a unified probabilistic model. Learning a combined probability model of binding sites and expression patterns enables us to improve the clustering of the genes based on the discovery of putative binding sites and to detect which binding sites and experiments best characterize a cluster. To learn such models from data, we introduce a new search method that rapidly learns a model according to a Bayesian score. We evaluate our method on synthetic data as well as on real life data and analyze the biological insights it provides. Finally, we demonstrate the applicability of the method to other data analysis problems in gene expression data.
|
|
23 |
50 |
13
|
Vaquero-Garcia J, Lalonde E, Ewens KG, Ebrahimzadeh J, Richard-Yutz J, Shields CL, Barrera A, Green CJ, Barash Y, Ganguly A. PRiMeUM: A Model for Predicting Risk of Metastasis in Uveal Melanoma. Invest Ophthalmol Vis Sci 2017; 58:4096-4105. [PMID: 28828481 PMCID: PMC6108308 DOI: 10.1167/iovs.17-22255] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 11/24/2022] Open
Abstract
Purpose: To create an interactive web-based tool for the Prediction of Risk of Metastasis in Uveal Melanoma (PRiMeUM) that can provide a personalized risk estimate of developing metastases within 48 months of primary uveal melanoma (UM) treatment. The model utilizes routinely collected clinical and tumor characteristics on 1227 UM, with the option of including chromosome information when available. Methods: Using a cohort of 1227 UM cases, Cox proportional hazard modeling was used to assess significant predictors of metastasis including clinical and chromosomal characteristics. A multivariate model to predict risk of metastasis was evaluated using machine learning methods including logistic regression, decision trees, survival random forest, and survival-based regression models. Based on cross-validation results, a logistic regression classifier was developed to compute an individualized risk of metastasis based on clinical and chromosomal information. Results: The PRiMeUM model provides prognostic information for personalized risk of metastasis in UM. The accuracy of the risk prediction ranged between 80% (using chromosomal features only), 83% using clinical features only (age, sex, tumor location, and size), and 85% (clinical and chromosomal information). Kaplan-Meier analysis showed these risk scores to be highly predictive of metastasis (P < 0.0001). Conclusions: PRiMeUM provides a tool for predicting an individual's personal risk of metastasis based on their individual and tumor characteristics. It will aid physicians with decisions concerning frequency of systemic surveillance and can be used as a criterion for entering clinical trials for adjuvant therapies.
|
Research Support, N.I.H., Extramural |
8 |
48 |
14
|
Abstract
Motivation: Advancements in sequencing technologies have highlighted the role of alternative splicing (AS) in increasing transcriptome complexity. This role of AS, combined with the relation of aberrant splicing to malignant states, motivated two streams of research, experimental and computational. The first involves a myriad of techniques such as RNA-Seq and CLIP-Seq to identify splicing regulators and their putative targets. The second involves probabilistic models, also known as splicing codes, which infer regulatory mechanisms and predict splicing outcome directly from genomic sequence. To date, these models have utilized only expression data. In this work, we address two related challenges: Can we improve on previous models for AS outcome prediction and can we integrate additional sources of data to improve predictions for AS regulatory factors. Results: We perform a detailed comparison of two previous modeling approaches, Bayesian and Deep Neural networks, dissecting the confounding effects of datasets and target functions. We then develop a new target function for AS prediction in exon skipping events and show it significantly improves model accuracy. Next, we develop a modeling framework that leverages transfer learning to incorporate CLIP-Seq, knockdown and over expression experiments, which are inherently noisy and suffer from missing values. Using several datasets involving key splice factors in mouse brain, muscle and heart we demonstrate both the prediction improvements and biological insights offered by our new models. Overall, the framework we propose offers a scalable integrative solution to improve splicing code modeling as vast amounts of relevant genomic data become available. Availability and implementation: Code and data available at: majiq.biociphers.org/jha_et_al_2017/ Supplementary information: Supplementary data are available at Bioinformatics online.
|
Journal Article |
7 |
46 |
15
|
Rohacek AM, Bebee TW, Tilton RK, Radens CM, McDermott-Roe C, Peart N, Kaur M, Zaykaner M, Cieply B, Musunuru K, Barash Y, Germiller JA, Krantz ID, Carstens RP, Epstein DJ. ESRP1 Mutations Cause Hearing Loss due to Defects in Alternative Splicing that Disrupt Cochlear Development. Dev Cell 2017; 43:318-331.e5. [PMID: 29107558 PMCID: PMC5687886 DOI: 10.1016/j.devcel.2017.09.026] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 12/06/2016] [Revised: 08/15/2017] [Accepted: 08/26/2017] [Indexed: 12/30/2022]
Abstract
Alternative splicing contributes to gene expression dynamics in many tissues, yet its role in auditory development remains unclear. We performed whole-exome sequencing in individuals with sensorineural hearing loss (SNHL) and identified pathogenic mutations in Epithelial Splicing-Regulatory Protein 1 (ESRP1). Patient-derived induced pluripotent stem cells showed alternative splicing defects that were restored upon repair of an ESRP1 mutant allele. To determine how ESRP1 mutations cause hearing loss, we evaluated Esrp1-/- mouse embryos and uncovered alterations in cochlear morphogenesis, auditory hair cell differentiation, and cell fate specification. Transcriptome analysis revealed impaired expression and splicing of genes with essential roles in cochlea development and auditory function. Aberrant splicing of Fgfr2 blocked stria vascularis formation due to erroneous ligand usage, which was corrected by reducing Fgf9 gene dosage. These findings implicate mutations in ESRP1 as a cause of SNHL and demonstrate the complex interplay between alternative splicing, inner ear development, and auditory function.
|
Research Support, N.I.H., Extramural |
8 |
43 |
16
|
Xiong HY, Barash Y, Frey BJ. Bayesian prediction of tissue-regulated splicing using RNA sequence and cellular context. Bioinformatics 2011; 27:2554-62. [PMID: 21803804 DOI: 10.1093/bioinformatics/btr444] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 12/26/2022]
Abstract
Motivation: Alternative splicing is a major contributor to cellular diversity in mammalian tissues and relates to many human diseases. An important goal in understanding this phenomenon is to infer a 'splicing code' that predicts how splicing is regulated in different cell types by features derived from RNA, DNA and epigenetic modifiers. Methods: We formulate the assembly of a splicing code as a problem of statistical inference and introduce a Bayesian method that uses an adaptively selected number of hidden variables to combine subgroups of features into a network, allows different tissues to share feature subgroups and uses a Gibbs sampler to hedge predictions and ascertain the statistical significance of identified features. Results: Using data for 3665 cassette exons, 1014 RNA features and 4 tissue types derived from 27 mouse tissues (http://genes.toronto.edu/wasp), we benchmarked several methods. Our method outperforms all others, and achieves relative improvements of 52% in splicing code quality and up to 22% in classification error, compared with the state of the art. Novel combinations of regulatory features and novel combinations of tissues that share feature subgroups were identified using our method. Contact: frey@psi.toronto.edu Supplementary information: Supplementary data are available at Bioinformatics online.
|
Research Support, Non-U.S. Gov't |
14 |
43 |
17
|
Nurnberg ST, Guerraty MA, Wirka RC, Rao HS, Pjanic M, Norton S, Serrano F, Perisic L, Elwyn S, Pluta J, Zhao W, Testa S, Park Y, Nguyen T, Ko YA, Wang T, Hedin U, Sinha S, Barash Y, Brown CD, Quertermous T, Rader DJ. Genomic profiling of human vascular cells identifies TWIST1 as a causal gene for common vascular diseases. PLoS Genet 2020; 16:e1008538. [PMID: 31917787 PMCID: PMC6975560 DOI: 10.1371/journal.pgen.1008538] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 03/11/2019] [Revised: 01/22/2020] [Accepted: 11/25/2019] [Indexed: 02/02/2023] Open
Abstract
Genome-wide association studies have identified multiple novel genomic loci associated with vascular diseases. Many of these loci are common non-coding variants that affect the expression of disease-relevant genes within coronary vascular cells. To identify such genes on a genome-wide level, we performed deep transcriptomic analysis of genotyped primary human coronary artery smooth muscle cells (HCASMCs) and coronary endothelial cells (HCAECs) from the same subjects, including splicing Quantitative Trait Loci (sQTL), allele-specific expression (ASE), and colocalization analyses. We identified sQTLs for TARS2, YAP1, CFDP1, and STAT6 in HCASMCs and HCAECs, and 233 ASE genes, a subset of which are also GTEx eGenes in arterial tissues. Colocalization of GWAS association signals for coronary artery disease (CAD), migraine, stroke and abdominal aortic aneurysm with GTEx eGenes in aorta, coronary artery and tibial artery discovered novel candidate risk genes for these diseases. At the CAD and stroke locus tagged by rs2107595 we demonstrate colocalization with expression of the proximal gene TWIST1. We show that disrupting the rs2107595 locus alters TWIST1 expression and that the risk allele has increased binding of the NOTCH signaling protein RBPJ. Finally, we provide data that TWIST1 expression influences vascular SMC phenotypes, including proliferation and calcification, as a potential mechanism supporting a role for TWIST1 in CAD.
|
Research Support, N.I.H., Extramural |
5 |
41 |
18
|
Gazzara MR, Mallory MJ, Roytenberg R, Lindberg JP, Jha A, Lynch KW, Barash Y. Ancient antagonism between CELF and RBFOX families tunes mRNA splicing outcomes. Genome Res 2017; 27:1360-1370. [PMID: 28512194 PMCID: PMC5538552 DOI: 10.1101/gr.220517.117] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 01/11/2017] [Accepted: 05/08/2017] [Indexed: 12/11/2022]
Abstract
Over 95% of human multi-exon genes undergo alternative splicing, a process important in normal development and often dysregulated in disease. We sought to analyze the global splicing regulatory network of CELF2, a well-studied splicing regulator critical to T cell development and function, in human T cells. By integrating high-throughput sequencing data for binding and splicing quantification with sequence features and probabilistic splicing code models, we find evidence of splicing antagonism between CELF2 and the RBFOX family of splicing factors. We validate this functional antagonism through knockdown and overexpression experiments in human cells and find that CELF2 represses RBFOX2 mRNA and protein levels. Because both families of proteins have been implicated in the development and maintenance of neuronal, muscle, and heart tissues, we analyzed publicly available data in these systems. Our analysis suggests global, antagonistic coregulation of splicing by the CELF and RBFOX proteins in mouse muscle and heart in several physiologically relevant targets, including proteins involved in calcium signaling and members of the MEF2 family of transcription factors. Importantly, a number of these coregulated events are aberrantly spliced in mouse models and human patients with diseases that affect these tissues, including heart failure, diabetes, or myotonic dystrophy. Finally, analysis of exons regulated by ancient CELF family homologs in chicken, Drosophila, and Caenorhabditis elegans suggests this antagonism is conserved throughout evolution.
|
Research Support, N.I.H., Extramural |
8 |
41 |
19
|
Shinde MY, Sidoli S, Kulej K, Mallory MJ, Radens CM, Reicherter AL, Myers RL, Barash Y, Lynch KW, Garcia BA, Klein PS. Phosphoproteomics reveals that glycogen synthase kinase-3 phosphorylates multiple splicing factors and is associated with alternative splicing. J Biol Chem 2017; 292:18240-18255. [PMID: 28916722 DOI: 10.1074/jbc.m117.813527] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Key Words] [Received: 08/20/2017] [Revised: 09/08/2017] [Indexed: 11/06/2022]
Abstract
Glycogen synthase kinase-3 (GSK-3) is a constitutively active, ubiquitously expressed protein kinase that regulates multiple signaling pathways. In vitro kinase assays and genetic and pharmacological manipulations of GSK-3 have identified more than 100 putative GSK-3 substrates in diverse cell types. Many more have been predicted on the basis of a recurrent GSK-3 consensus motif ((pS/pT)XXX(S/T)), but this prediction has not been tested by analyzing the GSK-3 phosphoproteome. Using stable isotope labeling of amino acids in culture (SILAC) and MS techniques to analyze the repertoire of GSK-3-dependent phosphorylation in mouse embryonic stem cells (ESCs), we found that ∼2.4% of (pS/pT)XXX(S/T) sites are phosphorylated in a GSK-3-dependent manner. A comparison of WT and Gsk3a;Gsk3b knock-out (Gsk3 DKO) ESCs revealed prominent GSK-3-dependent phosphorylation of multiple splicing factors and regulators of RNA biosynthesis as well as proteins that regulate transcription, translation, and cell division. Gsk3 DKO reduced phosphorylation of the splicing factors RBM8A, SRSF9, and PSF as well as the nucleolar proteins NPM1 and PHF6, and recombinant GSK-3β phosphorylated these proteins in vitro. RNA-Seq of WT and Gsk3 DKO ESCs identified ∼190 genes that are alternatively spliced in a GSK-3-dependent manner, supporting a broad role for GSK-3 in regulating alternative splicing. The MS data also identified posttranscriptional regulation of protein abundance by GSK-3, with ∼47 proteins (1.4%) whose levels increased and ∼78 (2.4%) whose levels decreased in the absence of GSK-3. This study provides the first unbiased analysis of the GSK-3 phosphoproteome and strong evidence that GSK-3 broadly regulates alternative splicing.
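The consensus motif mentioned in the abstract can be illustrated with a small sequence scan. This is only a sketch under stated assumptions, not the authors' pipeline: the function name `find_candidate_sites` is hypothetical, and a plain amino-acid sequence carries no phosphorylation state, so the scan reports every (S/T)XXX(S/T) spacing as a candidate rather than a confirmed site.

```python
import re

# Zero-width lookahead so that overlapping candidates are all reported.
# Pattern: S or T, any three residues, then S or T.
MOTIF = re.compile(r"(?=[ST]...[ST])")

def find_candidate_sites(protein_seq):
    """Return 0-based (first, fifth) residue index pairs matching (S/T)XXX(S/T).

    Hypothetical helper for illustration only. Whether a candidate is actually
    phosphorylated, or GSK-3 dependent, cannot be inferred from sequence alone.
    """
    seq = protein_seq.upper()
    return [(m.start(), m.start() + 4) for m in MOTIF.finditer(seq)]

print(find_candidate_sites("SAAASAAAS"))  # [(0, 4), (4, 8)]
```

A scan like this only enumerates motif occurrences; the abstract reports that roughly 2.4% of such sites showed GSK-3-dependent phosphorylation, which is why experimental data is needed to filter the candidates.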
|
Journal Article |
8 |
41 |
20
|
Zheng S, Gillespie E, Naqvi AS, Hayer KE, Ang Z, Torres-Diz M, Quesnel-Vallières M, Hottman DA, Bagashev A, Chukinas J, Schmidt C, Asnani M, Shraim R, Taylor DM, Rheingold SR, O'Brien MM, Singh N, Lynch KW, Ruella M, Barash Y, Tasian SK, Thomas-Tikhonenko A. Modulation of CD22 Protein Expression in Childhood Leukemia by Pervasive Splicing Aberrations: Implications for CD22-Directed Immunotherapies. Blood Cancer Discov 2022; 3:103-115. [PMID: 35015683 PMCID: PMC9780083 DOI: 10.1158/2643-3230.bcd-21-0087] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 05/31/2021] [Revised: 09/30/2021] [Accepted: 11/10/2021] [Indexed: 11/16/2022]
Abstract
Downregulation of surface epitopes causes postimmunotherapy relapses in B-lymphoblastic leukemia (B-ALL). Here we demonstrate that mRNA encoding CD22 undergoes aberrant splicing in B-ALL. We describe the plasma membrane-bound CD22 Δex5-6 splice isoform, which is resistant to chimeric antigen receptor (CAR) T cells targeting the third immunoglobulin-like domain of CD22. We also describe splice variants skipping the AUG-containing exon 2 and failing to produce any identifiable protein, thereby defining an event that is rate limiting for epitope presentation. Indeed, forcing exon 2 skipping with morpholino oligonucleotides reduced CD22 protein expression and conferred resistance to the CD22-directed antibody-drug conjugate inotuzumab ozogamicin in vitro. Furthermore, among inotuzumab-treated pediatric patients with B-ALL, we identified one nonresponder in whose leukemic blasts Δex2 isoforms comprised the majority of CD22 transcripts. In a second patient, a sharp reduction in CD22 protein levels during relapse was driven entirely by increased CD22 exon 2 skipping. Thus, dysregulated CD22 splicing is a major mechanism of epitope downregulation and ensuing resistance to immunotherapy. SIGNIFICANCE The mechanism(s) underlying downregulation of surface CD22 following CD22-directed immunotherapy remains underexplored. Our biochemical and correlative studies demonstrate that in B-ALL, CD22 expression levels are controlled by inclusion/skipping of CD22 exon 2. Thus, aberrant splicing of CD22 is an important driver/biomarker of de novo and acquired resistance to CD22-directed immunotherapies. See related commentary by Bourcier and Abdel-Wahab, p. 87. This article is highlighted in the In This Issue feature, p. 85.
|
Editorial |
3 |
39 |
21
|
Brady LK, Wang H, Radens CM, Bi Y, Radovich M, Maity A, Ivan C, Ivan M, Barash Y, Koumenis C. Transcriptome analysis of hypoxic cancer cells uncovers intron retention in EIF2B5 as a mechanism to inhibit translation. PLoS Biol 2017; 15:e2002623. [PMID: 28961236 PMCID: PMC5636171 DOI: 10.1371/journal.pbio.2002623] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 03/31/2017] [Revised: 10/11/2017] [Accepted: 09/07/2017] [Indexed: 01/09/2023]
Abstract
Cells adjust to hypoxic stress within the tumor microenvironment by downregulating energy-consuming processes including translation. To delineate mechanisms of cellular adaptation to hypoxia, we performed RNA-Seq of normoxic and hypoxic head and neck cancer cells. These data revealed a significant downregulation of genes known to regulate RNA processing and splicing. Exon-level analyses classified >1,000 mRNAs as alternatively spliced under hypoxia and uncovered a unique retained intron (RI) in the master regulator of translation initiation, EIF2B5. Notably, this intron was expressed in solid tumors in a stage-dependent manner. We investigated the biological consequence of this RI and demonstrate that its inclusion creates a premature termination codon (PTC) that leads to a 65 kDa truncated protein isoform that opposes full-length eIF2Bε to inhibit global translation. Furthermore, expression of 65 kDa eIF2Bε led to increased survival of head and neck cancer cells under hypoxia, providing evidence that this isoform enables cells to adapt to conditions of low oxygen. Additional work to uncover cis and trans regulators of EIF2B5 splicing identified several factors that influence intron retention in EIF2B5: a weak splicing potential at the RI, hypoxia-induced expression and binding of the splicing factor SRSF3, and increased binding of total and phospho-Ser2 RNA polymerase II specifically at the intron retained under hypoxia. Altogether, these data reveal differential splicing as a previously uncharacterized mode of translational control under hypoxia and are supported by a model in which hypoxia-induced changes to cotranscriptional processing lead to selective retention of a PTC-containing intron in EIF2B5.
|
research-article |
8 |
36 |
22
|
Norton SS, Vaquero-Garcia J, Lahens NF, Grant GR, Barash Y. Outlier detection for improved differential splicing quantification from RNA-Seq experiments with replicates. Bioinformatics 2018; 34:1488-1497. [PMID: 29236961 PMCID: PMC6454425 DOI: 10.1093/bioinformatics/btx790] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 04/13/2017] [Revised: 11/17/2017] [Accepted: 12/07/2017] [Indexed: 01/20/2023]
Abstract
Motivation: A key component in many RNA-Seq-based studies is contrasting multiple replicates from different experimental conditions. In this setup, replicates play a key role because they capture the underlying biological variability inherent to the compared conditions, as well as experimental variability. However, what constitutes a 'bad' replicate is not necessarily well defined. Consequently, researchers might discard valuable data, or downstream analysis may be hampered by failed experiments. Results: Here we develop a probability model to weigh a given RNA-Seq sample as a representative of an experimental condition when performing alternative splicing analysis. We demonstrate that this model detects outlier samples which are consistently and significantly different compared with other samples from the same condition. Moreover, we show that instead of discarding such samples, the proposed weighting scheme can be used to downweight samples and specific splicing variations suspected as outliers, gaining statistical power. These weights can then be used for differential splicing (DS) analysis, where the resulting algorithm offers a generalization of the MAJIQ algorithm. Using both synthetic and real-life data, we perform an extensive evaluation of the improved MAJIQ algorithm in different scenarios involving perturbed samples, mislabeled samples, same condition groups, and different levels of coverage, showing it compares favorably to other tools. Overall, this work offers an outlier detection algorithm that can be combined with any splicing pipeline, a generalized and improved version of MAJIQ for DS detection, and evaluation metrics with matching code and data for DS algorithms. Availability and implementation: Software and data are accessible via majiq.biociphers.org/norton_et_al_2017/. Contact: yosephb@upenn.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
|
Research Support, N.I.H., Extramural |
7 |
29 |
23
|
Barash Y, Vaquero-Garcia J, González-Vallinas J, Xiong HY, Gao W, Lee LJ, Frey BJ. AVISPA: a web tool for the prediction and analysis of alternative splicing. Genome Biol 2014; 14:R114. [PMID: 24156756 PMCID: PMC4014802 DOI: 10.1186/gb-2013-14-10-r114] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 07/09/2013] [Accepted: 10/11/2013] [Indexed: 11/10/2022]
Abstract
Transcriptome complexity and its relation to numerous diseases underpin the need to predict in silico splice variants and the regulatory elements that affect them. Building upon our recently described splicing code, we developed AVISPA, a Galaxy-based web tool for splicing prediction and analysis. Given an exon and its proximal sequence, the tool predicts whether the exon is alternatively spliced, displays tissue-dependent splicing patterns, and indicates whether it has associated regulatory elements. We assess AVISPA's accuracy on an independent dataset of tissue-dependent exons, and illustrate how the tool can be applied to analyze a gene of interest. AVISPA is available at http://avispa.biociphers.org.
|
Research Support, Non-U.S. Gov't |
11 |
28 |
24
|
Jha A, K Aicher J, R Gazzara M, Singh D, Barash Y. Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study. Genome Biol 2020; 21:149. [PMID: 32560708 PMCID: PMC7305616 DOI: 10.1186/s13059-020-02055-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Key Words] [Received: 10/25/2019] [Accepted: 05/22/2020] [Indexed: 01/03/2023]
Abstract
Despite the success and fast adoption of deep learning models in biomedical domains, their lack of interpretability remains an issue. Here, we introduce Enhanced Integrated Gradients (EIG), a method to identify significant features associated with a specific prediction task. Using RNA splicing prediction as well as digit classification as case studies, we demonstrate that EIG improves upon the original Integrated Gradients method and produces sets of informative features. We then apply EIG to identify A1CF as a key regulator of liver-specific alternative splicing, supporting this finding with subsequent analysis of relevant A1CF functional (RNA-seq) and binding data (PAR-CLIP).
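For orientation, the original Integrated Gradients method that EIG builds on can be sketched in a few lines. The example below is not EIG and not the authors' code: it assumes a toy logistic model with hand-picked weights (`w`, `b`, and all function names are hypothetical), a straight-line path from a baseline to the input, and a midpoint Riemann sum for the path integral.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy logistic model F(x) = sigmoid(w . x + b); stands in for a trained network."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate IG_i = (x_i - x'_i) * integral over alpha in [0, 1] of dF/dx_i
    evaluated along the straight line from baseline x' to input x."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, b)):
            avg_grad[i] += g / steps
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]

w, b = [1.0, -2.0, 0.5], 0.0
x, baseline = [1.0, 0.5, 2.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions should sum to roughly F(x) - F(baseline).
print(sum(attributions), model(x, w, b) - model(baseline, w, b))
```

The completeness property checked in the last line is a useful sanity test for any implementation. Per the abstract, EIG improves on this original method; the details of how it does so are not given in the abstract and are not reproduced here.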
|
Research Support, N.I.H., Extramural |
5 |
27 |
25
|
Cortés-López M, Schulz L, Enculescu M, Paret C, Spiekermann B, Quesnel-Vallières M, Torres-Diz M, Unic S, Busch A, Orekhova A, Kuban M, Mesitov M, Mulorz MM, Shraim R, Kielisch F, Faber J, Barash Y, Thomas-Tikhonenko A, Zarnack K, Legewie S, König J. High-throughput mutagenesis identifies mutations and RNA-binding proteins controlling CD19 splicing and CART-19 therapy resistance. Nat Commun 2022; 13:5570. [PMID: 36138008 PMCID: PMC9500061 DOI: 10.1038/s41467-022-31818-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 10/07/2021] [Accepted: 07/05/2022] [Indexed: 11/29/2022]
Abstract
Following CART-19 immunotherapy for B-cell acute lymphoblastic leukaemia (B-ALL), many patients relapse due to loss of the cognate CD19 epitope. Since epitope loss can be caused by aberrant CD19 exon 2 processing, we herein investigate the regulatory code that controls CD19 splicing. We combine high-throughput mutagenesis with mathematical modelling to quantitatively disentangle the effects of all mutations in the region comprising CD19 exons 1-3. Thereupon, we identify ~200 single point mutations that alter CD19 splicing and thus could predispose B-ALL patients to developing CART-19 resistance. Furthermore, we report almost 100 previously unknown splice isoforms that emerge from cryptic splice sites and likely encode non-functional CD19 proteins. We further identify cis-regulatory elements and trans-acting RNA-binding proteins that control CD19 splicing (e.g., PTBP1 and SF3B4) and validate that loss of these factors leads to pervasive CD19 mis-splicing. Our dataset represents a comprehensive resource for identifying predictive biomarkers for CART-19 therapy. Multiple alternative splicing events in CD19 mRNA have been associated with resistance/relapse to CD19 CAR-T therapy in patients with B cell malignancies. Here, by combining patient data and a high-throughput mutagenesis screen, the authors identify single point mutations and RNA-binding proteins that can control CD19 splicing and be associated with CD19 CAR-T therapy resistance.
|
|
3 |
25 |