1
On the fast computation of the Dirichlet-multinomial log-likelihood function. Comput Stat 2022. [DOI: 10.1007/s00180-022-01311-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
2
Wickramasinghe L, Leblanc A, Muthukumarana S. Model-based estimation of baseball batting metrics. J Appl Stat 2021; 48:1775-1797. [DOI: 10.1080/02664763.2020.1775792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Alexandre Leblanc
  - Department of Statistics, University of Manitoba, Winnipeg, Manitoba, Canada
- Saman Muthukumarana
  - Department of Statistics, University of Manitoba, Winnipeg, Manitoba, Canada
3
4
Yu P, Li J, Deng SP, Zhang F, Grozdanov PN, Chin EWM, Martin SD, Vergnes L, Islam MS, Sun D, LaSalle JM, McGee SL, Goh E, MacDonald CC, Jin P. Integrated analysis of a compendium of RNA-Seq datasets for splicing factors. Sci Data 2020; 7:178. [PMID: 32546682 PMCID: PMC7297722 DOI: 10.1038/s41597-020-0514-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/17/2019] [Accepted: 03/13/2020] [Indexed: 02/05/2023] Open
Abstract
A vast amount of public RNA-sequencing datasets have been generated and used widely to study transcriptome mechanisms. These data offer precious opportunity for advancing biological research in transcriptome studies such as alternative splicing. We report the first large-scale integrated analysis of RNA-Seq data of splicing factors for systematically identifying key factors in diseases and biological processes. We analyzed 1,321 RNA-Seq libraries of various mouse tissues and cell lines, comprising more than 6.6 TB sequences from 75 independent studies that experimentally manipulated 56 splicing factors. Using these data, RNA splicing signatures and gene expression signatures were computed, and signature comparison analysis identified a list of key splicing factors in Rett syndrome and cold-induced thermogenesis. We show that cold-induced RNA-binding proteins rescue the neurite outgrowth defects in Rett syndrome using neuronal morphology analysis, and we also reveal that SRSF1 and PTBP1 are required for energy expenditure in adipocytes using metabolic flux analysis. Our study provides an integrated analysis for identifying key factors in diseases and biological processes and highlights the importance of public data resources for identifying hypotheses for experimental testing.
Affiliation(s)
- Peng Yu
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China.
  - Medical Big Data Center, Sichuan University, Chengdu, China.
- Jin Li
  - Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, College of Medicine, Texas A&M University, Houston, TX, 77030, USA
- Su-Ping Deng
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu, 215009, China
- Feiran Zhang
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, USA
- Petar N Grozdanov
  - Department of Cell Biology & Biochemistry, Texas Tech University Health Sciences Center, Lubbock, Texas, 79430, USA
- Eunice W M Chin
  - Neuroscience Academic Clinical Programme, Duke-NUS Medical School, NA, Singapore
- Sheree D Martin
  - Metabolic Reprogramming Laboratory, Metabolic Research Unit, School of Medicine and Centre for Molecular and Medical Research, Deakin University, Geelong, Victoria, Australia
- Laurent Vergnes
  - Department of Human Genetics, David Geffen School of Medicine, University of California-Los Angeles, Los Angeles, CA, USA
- M Saharul Islam
  - Department of Medical Microbiology and Immunology, Genome Center, and MIND Institute, University of California Davis, Davis, CA, USA
- Deqiang Sun
  - Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, College of Medicine, Texas A&M University, Houston, TX, 77030, USA
- Janine M LaSalle
  - Department of Medical Microbiology and Immunology, Genome Center, and MIND Institute, University of California Davis, Davis, CA, USA
- Sean L McGee
  - Metabolic Reprogramming Laboratory, Metabolic Research Unit, School of Medicine and Centre for Molecular and Medical Research, Deakin University, Geelong, Victoria, Australia
- Eyleen Goh
  - Neuroscience Academic Clinical Programme, Duke-NUS Medical School, NA, Singapore
- Clinton C MacDonald
  - Department of Cell Biology & Biochemistry, Texas Tech University Health Sciences Center, Lubbock, Texas, 79430, USA
- Peng Jin
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, USA
5
Li J, Deng SP, Vieira J, Thomas J, Costa V, Tseng CS, Ivankovic F, Ciccodicola A, Yu P. RBPMetaDB: a comprehensive annotation of mouse RNA-Seq datasets with perturbations of RNA-binding proteins. Database: The Journal of Biological Databases and Curation 2018; 2018:5040291. [PMID: 29931156 PMCID: PMC6009576 DOI: 10.1093/database/bay054] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/05/2018] [Accepted: 05/17/2018] [Indexed: 01/03/2023]
Abstract
RNA-binding proteins (RBPs) may play a critical role in gene regulation in various diseases or biological processes by controlling post-transcriptional events such as polyadenylation, splicing and mRNA stabilization via binding activities to RNA molecules. Owing to the importance of RBPs in gene regulation, a great number of studies have been conducted, resulting in a large amount of RNA-Seq datasets. However, these datasets usually do not have structured organization of metadata, which limits their potentially wide use. To bridge this gap, the metadata of a comprehensive set of publicly available mouse RNA-Seq datasets with perturbed RBPs were collected and integrated into a database called RBPMetaDB. This database contains 292 mouse RNA-Seq datasets for a comprehensive list of 187 RBPs. These RBPs account for only ∼10% of all known RBPs annotated in Gene Ontology, indicating that most are still unexplored using high-throughput sequencing. This negative information provides a great pool of candidate RBPs for biologists to conduct future experimental studies. In addition, we found that DNA-binding activities are significantly enriched among RBPs in RBPMetaDB, suggesting that prior studies of these DNA- and RNA-binding factors focus more on DNA-binding activities instead of RNA-binding activities. This result reveals the opportunity to efficiently reuse these data for investigation of the roles of their RNA-binding activities. A web application has also been implemented to enable easy access and wide use of RBPMetaDB. It is expected that RBPMetaDB will be a great resource for improving understanding of the biological roles of RBPs. Database URL: http://rbpmetadb.yubiolab.org
Affiliation(s)
- Jin Li
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
- Su-Ping Deng
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
- Jacob Vieira
  - The Department of Microbiology, University of Massachusetts Amherst, Amherst, MA, USA
- James Thomas
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, College of Medicine, University of Florida, Gainesville, FL, USA
- Valerio Costa
  - Institute of Genetics and Biophysics "Adriano Buzzati-Traverso", Consiglio Nazionale delle Ricerche, Via P. Castellino 111, 80131 Naples, Italy
- Ching-San Tseng
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Franjo Ivankovic
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, College of Medicine, University of Florida, Gainesville, FL, USA
- Alfredo Ciccodicola
  - Institute of Genetics and Biophysics "Adriano Buzzati-Traverso", Consiglio Nazionale delle Ricerche, Via P. Castellino 111, 80131 Naples, Italy; Department of Science and Technology, University Parthenope of Naples, 80131 Naples, Italy
- Peng Yu
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
6
Belanger K, Nutter CA, Li J, Tasnim S, Liu P, Yu P, Kuyumcu-Martinez MN. CELF1 contributes to aberrant alternative splicing patterns in the type 1 diabetic heart. Biochem Biophys Res Commun 2018; 503:3205-3211. [PMID: 30158053 DOI: 10.1016/j.bbrc.2018.08.126] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/11/2018] [Accepted: 08/20/2018] [Indexed: 12/20/2022]
Abstract
Dysregulated alternative splicing (AS) that contributes to diabetes pathogenesis has been identified, but little is known about the RNA binding proteins (RBPs) involved. We have previously found that the RBP CELF1 is upregulated in the diabetic heart; however, it is unclear if CELF1 contributes to diabetes-induced AS changes. Utilizing genome wide approaches, we identified extensive changes in AS patterns in Type 1 diabetic (T1D) mouse hearts. We discovered that many aberrantly spliced genes in T1D hearts have CELF1 binding sites. CELF1-regulated AS affects key genes within signaling pathways relevant to diabetes pathogenesis. Disruption of CELF1 binding sites impairs AS regulation by CELF1. In sum, our results indicate that CELF1 target RNAs are aberrantly spliced in the T1D heart leading to abnormal gene expression. These discoveries pave the way for targeting RBPs and their RNA networks as novel therapies for cardiac complications of diabetes.
Affiliation(s)
- KarryAnne Belanger
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA
- Curtis A Nutter
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA
- Jin Li
  - Department of Electrical and Computer Engineering & TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
- Sadia Tasnim
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA
- Peiru Liu
  - Ball High School, Galveston, TX, 77555, USA
- Peng Yu
  - Department of Electrical and Computer Engineering & TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA.
- Muge N Kuyumcu-Martinez
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA; Department of Neuroscience, Cell Biology and Anatomy, University of Texas Medical Branch, Galveston, TX, 77555, USA; Institute for Translational Sciences, University of Texas Medical Branch, Galveston, TX, 77555, USA.
7
A simple computer vision pipeline reveals the effects of isolation on social interaction dynamics in Drosophila. PLoS Comput Biol 2018; 14:e1006410. [PMID: 30161262 PMCID: PMC6135522 DOI: 10.1371/journal.pcbi.1006410] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/16/2017] [Revised: 09/12/2018] [Accepted: 07/31/2018] [Indexed: 12/14/2022] Open
Abstract
Isolation profoundly influences social behavior in all animals. In humans, isolation has serious effects on health. Drosophila melanogaster is a powerful model to study small-scale, temporally-transient social behavior. However, longer-term analysis of large groups of flies is hampered by the lack of effective and reliable tools. We built a new imaging arena and improved the existing tracking algorithm to reliably follow a large number of flies simultaneously. Next, based on the automatic classification of touch and graph-based social network analysis, we designed an algorithm to quantify changes in the social network in response to prior social isolation. We observed that isolation significantly and swiftly enhanced individual and local social network parameters depicting near-neighbor relationships. We explored the genome-wide molecular correlates of these behavioral changes and found that whereas behavior changed throughout the six days of isolation, gene expression alterations occurred largely on day one. These changes occurred mostly in metabolic genes, and we verified the metabolic changes by showing an increase of lipid content in isolated flies. In summary, we describe a highly reliable tracking and analysis pipeline for large groups of flies that we use to unravel the behavioral, molecular and physiological impact of isolation on social network dynamics in Drosophila.
Social isolation severely affects the behavior and physiology of social animals, including humans. The fruit fly is a powerful model for studying the mechanisms of development, health and disease and is also used to study social behaviors such as mating and aggression. However, these studies are limited to examining few individuals for short amounts of time, due to the lack of effective computational tools for the analysis of large groups over prolonged time. To overcome this hurdle, we built a new behavioral arena and developed new software that accurately tracks many flies simultaneously over long time periods. The arena is cheap and easy to build and the software works with low resolution videos. Using these improved tools, we studied social isolation in groups of male flies. We found that isolation caused flies to form stronger interactions with neighboring flies in their social network. These behavioral changes were preceded by transient changes in the expression of metabolism genes and eventually resulted in isolated flies accumulating fat, as has been previously observed in studies in mice and humans. Our study opens the door for the use of fruit flies in future studies of social isolation.
8
Monedero Cobeta I, Stadler CB, Li J, Yu P, Thor S, Benito-Sipos J. Specification of Drosophila neuropeptidergic neurons by the splicing component brr2. PLoS Genet 2018; 14:e1007496. [PMID: 30133436 PMCID: PMC6122834 DOI: 10.1371/journal.pgen.1007496] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/26/2018] [Revised: 09/04/2018] [Accepted: 06/18/2018] [Indexed: 02/07/2023] Open
Abstract
During embryonic development, a number of genetic cues act to generate neuronal diversity. While intrinsic transcriptional cascades are well-known to control neuronal sub-type cell fate, the target cells can also provide critical input to specific neuronal cell fates. Such signals, denoted retrograde signals, are known to provide critical survival cues for neurons, but have also been found to trigger terminal differentiation of neurons. One salient example of such target-derived instructive signals pertains to the specification of the Drosophila FMRFamide neuropeptide neurons, the Tv4 neurons of the ventral nerve cord. Tv4 neurons receive a BMP signal from their target cells, which acts as the final trigger to activate the FMRFa gene. A recent FMRFa-eGFP genetic screen identified several genes involved in Tv4 specification, two of which encode components of the U5 subunit of the spliceosome: brr2 (l(3)72Ab) and Prp8. In this study, we focus on the role of RNA processing during target-derived signaling. We found that brr2 and Prp8 play crucial roles in controlling the expression of the FMRFa neuropeptide specifically in six neurons of the VNC (Tv4 neurons). Detailed analysis of brr2 revealed that this control is executed by two independent mechanisms, both of which are required for the activation of the BMP retrograde signaling pathway in Tv4 neurons: (1) Proper axonal pathfinding to the target tissue in order to receive the BMP ligand. (2) Proper RNA splicing of two genes in the BMP pathway: the thickveins (tkv) gene, encoding a BMP receptor subunit, and the Medea gene, encoding a co-Smad. These results reveal involvement of specific RNA processing in diversifying neuronal identity within the central nervous system. The nervous system displays daunting cellular diversity, largely generated through complex regulatory input operating on stem cells and their neural lineages during development. 
Most of the reported mechanisms acting to generate neural diversity pertain to transcriptional regulation. In contrast, little is known regarding the post-transcriptional mechanisms involved. Here, we use a specific group of neurons, Apterous neurons, in the ventral nerve cord of Drosophila melanogaster as our model, to analyze the function of two essential components of the spliceosome: Brr2 and Prp8. Apterous neurons require a BMP retrograde signal for terminal differentiation, and we find that brr2 and Prp8 play crucial roles during this process. brr2 is critical for two independent events, axon pathfinding and BMP signaling, both of which are required for the activation of the retrograde signaling pathway necessary for Apterous neurons. These results identify a post-transcriptional mechanism as key for specifying neuronal identity, by ensuring the execution of a retrograde signal.
Affiliation(s)
- Ignacio Monedero Cobeta
  - Dept. of Biología, Facultad de Ciencias, Universidad Autónoma de Madrid, Madrid, Spain
  - Dept. of Clinical and Experimental Medicine, Linkoping University, Linkoping, Sweden
- Jin Li
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, United States of America
  - TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, Texas, United States of America
- Peng Yu
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, United States of America
- Stefan Thor
  - Dept. of Clinical and Experimental Medicine, Linkoping University, Linkoping, Sweden
- Jonathan Benito-Sipos
  - Dept. of Biología, Facultad de Ciencias, Universidad Autónoma de Madrid, Madrid, Spain
9
Activity-dependent aberrations in gene expression and alternative splicing in a mouse model of Rett syndrome. Proc Natl Acad Sci U S A 2018; 115:E5363-E5372. [PMID: 29769330 DOI: 10.1073/pnas.1722546115] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 01/11/2023] Open
Abstract
Rett syndrome (RTT) is a severe neurodevelopmental disorder that affects about 1 in 10,000 female live births. The underlying cause of RTT is mutations in the X-linked gene, methyl-CpG-binding protein 2 (MECP2); however, the molecular mechanism by which these mutations mediate the RTT neuropathology remains enigmatic. Specifically, although MeCP2 is known to act as a transcriptional repressor, analyses of the RTT brain at steady-state conditions detected numerous differentially expressed genes, while the changes in transcript levels were mostly subtle. Here we reveal an aberrant global pattern of gene expression, characterized predominantly by higher levels of expression of activity-dependent genes, and anomalous alternative splicing events, specifically in response to neuronal activity in a mouse model for RTT. Notably, the specific splicing modalities of intron retention and exon skipping displayed a significant bias toward increased retained introns and skipped exons, respectively, in the RTT brain compared with the WT brain. Furthermore, these aberrations occur in conjunction with higher seizure susceptibility in response to neuronal activity in RTT mice. Our findings advance the concept that normal MeCP2 functioning is required for fine-tuning the robust and immediate changes in gene transcription and for proper regulation of alternative splicing induced in response to neuronal stimulation.
10
Genome-wide transcriptome analysis identifies alternative splicing regulatory network and key splicing factors in mouse and human psoriasis. Sci Rep 2018. [PMID: 29515135 PMCID: PMC5841439 DOI: 10.1038/s41598-018-22284-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 02/08/2023] Open
Abstract
Psoriasis is a chronic inflammatory disease that affects the skin, nails, and joints. For understanding the mechanism of psoriasis, though, alternative splicing analysis has received relatively little attention in the field. Here, we developed and applied several computational analysis methods to study psoriasis. Using psoriasis mouse and human datasets, our differential alternative splicing analyses detected hundreds of differential alternative splicing changes. Our analysis of conservation revealed many exon-skipping events conserved between mice and humans. In addition, our splicing signature comparison analysis using the psoriasis datasets and our curated splicing factor perturbation RNA-Seq database, SFMetaDB, identified nine candidate splicing factors that may be important in regulating splicing in the psoriasis mouse model dataset. Three of the nine splicing factors were confirmed upon analyzing the human data. Our computational methods have generated predictions for the potential role of splicing in psoriasis. Future experiments on the novel candidates predicted by our computational analysis are expected to provide a better understanding of the molecular mechanism of psoriasis and to pave the way for new therapeutic treatments.
11
A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions. mSphere 2017; 2:mSphere00536-17. [PMID: 29242838 PMCID: PMC5729222 DOI: 10.1128/mspheredirect.00536-17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/10/2017] [Accepted: 11/29/2017] [Indexed: 11/20/2022] Open
Abstract
By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis. Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets based on maximum likelihood. The posterior probability of a microbiome sample belonging to a disease or healthy category is calculated based on Bayes’ theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distribution, as well as a prior probability estimated from the training microbiome data set or previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than the only existing Bayesian microbiome classifier based on a Dirichlet-multinomial mixture model and the popular random forest method. The advantage of DMBC is its built-in automatic feature selection, capable of identifying a subset of microbial taxa with the best classification accuracy between different classes of samples based on cross-validation. This unique ability enables DMBC to maintain and even improve its accuracy at modeling species-level taxa. 
The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC.
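The classification step described in this abstract (a Dirichlet-multinomial likelihood per class, combined with a prior through Bayes' theorem) can be illustrated with a minimal Python sketch. This is not the DMBC package's code: the function names `dm_log_likelihood` and `posterior`, the class labels, and all parameter values are hypothetical, and the parameters are assumed to have been estimated already (DMBC does this by maximum likelihood, which is not shown here).

```python
from math import exp, lgamma, log


def dm_log_likelihood(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per taxon.
    alpha:  positive Dirichlet-multinomial parameters, same length.
    """
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient numerator and the normalizing Gamma ratio.
    ll = lgamma(n + 1) + lgamma(a) - lgamma(n + a)
    for x, al in zip(counts, alpha):
        ll += lgamma(x + al) - lgamma(al) - lgamma(x + 1)
    return ll


def posterior(counts, alphas, priors):
    """Posterior class probabilities via Bayes' theorem.

    alphas: dict mapping class label -> parameter vector (assumed pre-fitted).
    priors: dict mapping class label -> prior probability (e.g. prevalence).
    """
    log_joint = {
        c: log(priors[c]) + dm_log_likelihood(counts, alphas[c])
        for c in alphas
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_joint.values())
    z = sum(exp(v - m) for v in log_joint.values())
    return {c: exp(v - m) / z for c, v in log_joint.items()}


if __name__ == "__main__":
    # Hypothetical parameters for two classes over three taxa.
    alphas = {"disease": [5.0, 1.0, 1.0], "healthy": [1.0, 1.0, 5.0]}
    priors = {"disease": 0.1, "healthy": 0.9}
    print(posterior([30, 5, 2], alphas, priors))
```

Working in log space with `lgamma` keeps the computation stable for the large read counts typical of microbiome samples. The feature selection by cross-validation mentioned in the abstract is outside the scope of this sketch.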