1
|
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Karishma Chhugani
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Yutong Chang
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Aaron Karlsberg
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Caitlin Loeffler
  - Department of Computer Science, University of California, Los Angeles, CA, United States
- Jinyang Zhang
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Agata Muszyńska
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
- Viorel Munteanu
  - Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
- Harry Yang
  - Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
- Jeremy Rotman
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Laura Tao
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Brunilda Balliu
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Eleazar Eskin
  - Department of Computer Science, University of California, Los Angeles, CA, United States
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
- Fangqing Zhao
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
  - Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
- Pejman Mohammadi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
- Paweł P. Łabaj
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Department of Biotechnology, Boku University Vienna, Vienna, Austria
- Serghei Mangul
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
  - Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
  - *Correspondence: Serghei Mangul,
|
2
|
Silicone Breast Implant Surface Texture Impacts Gene Expression in Periprosthetic Fibrous Capsules. Plast Reconstr Surg 2023; 151:85-95. [PMID: 36205692 DOI: 10.1097/prs.0000000000009800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
BACKGROUND Silicone breast implants with smooth outer shells are associated with higher rates of capsular contracture, whereas textured implants have been linked to the development of breast implant-associated anaplastic large cell lymphoma. By assessing the gene expression profile of fibrous capsules formed in response to smooth and textured implants, insight into the development of breast implant-associated abnormalities can be gained. METHODS Miniature smooth or textured silicone implants were surgically inserted into female rats (n = 10), and the surrounding capsules were harvested at postoperative week 6. RNA sequencing and quantitative polymerase chain reaction were performed to identify genes differentially expressed between smooth and textured capsules. For clinical correlation, the expression of candidate genes was assayed in implant capsules harvested from human patients with and without capsular contracture. RESULTS Of 18,555 differentially expressed transcripts identified, three candidate genes were selected: matrix metalloproteinase-3 (MMP3), troponin-T3 (TNNT3), and neuregulin-1 (NRG1). In textured capsules, relative gene expression and immunostaining of MMP3 and TNNT3 were up-regulated, whereas NRG1 was down-regulated compared to smooth capsules [mean relative fold change, 8.79 (P = 0.0059), 4.81 (P = 0.0056), and 0.40 (P < 0.0001), respectively]. Immunostaining of human specimens with capsular contracture revealed similar gene expression patterns to those of animal-derived smooth capsules. CONCLUSIONS An expression pattern of low MMP3/low TNNT3/high NRG1 is specifically associated with smooth implant capsules and human implant capsules with capsular contracture. The authors' clinically relevant breast implant rat model provides a strong foundation to further explore the molecular genetics of implant texture and its effect on breast implant-associated abnormalities.
CLINICAL RELEVANCE STATEMENT The authors have demonstrated that there are distinct gene expression profiles in response to smooth versus textured breast implants. Since surface texture may be linked to implant-related pathology, further molecular analysis of periprosthetic capsules may yield strategies to mitigate implant-related complications.
|
3
|
Mehmood A, Laiho A, Elo LL. Exon-level estimates improve the detection of differentially expressed genes in RNA-seq studies. RNA Biol 2021; 18:1739-1746. [PMID: 33522408 PMCID: PMC8582999 DOI: 10.1080/15476286.2020.1868151] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022] Open
Abstract
Detection of differentially expressed genes (DEGs) between different biological conditions is a key data analysis step of most RNA-sequencing studies. Conventionally, computational tools have used gene-level read counts as input to test for differential gene expression between sample condition groups. Recently, it has been suggested that statistical testing could be performed with increased power at a lower feature level prior to aggregating the results to the gene level. In this study, we systematically compared the performance of calling the DEGs when using read count data at different levels (gene, transcript, and exon) as input, in the context of two publicly available data sets. Additionally, we tested two different methods for aggregating the lower feature-level p-values to gene-level: Lancaster and empirical Brown’s method. Our results show that detection of DEGs is improved compared to the conventional gene-level approach regardless of the lower feature-level used for statistical testing. The overall best balance between accuracy and false discovery rate was obtained using the exon-level approach with empirical Brown’s aggregation method, which we provide as a freely available Bioconductor package EBSEA (https://bioconductor.org/packages/release/bioc/html/EBSEA.html).
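As a rough illustration of the aggregation step this abstract describes, the sketch below combines feature-level p-values into one gene-level p-value with Lancaster's weighted method. It is not the EBSEA implementation. The function names are hypothetical, weights are assumed to be positive even integers so that the chi-square tail has a closed form in plain Python, and the p-values are assumed independent. The empirical Brown's method, which the authors found to perform best, is designed for correlated p-values and is not shown here.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with even df (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_isf_even(p, df, tol=1e-12):
    """Invert chi2_sf_even by bisection: find x whose upper-tail probability is p."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > p:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lancaster_pvalue(pvalues, weights):
    """Combine independent p-values; each weight is used as a chi-square df."""
    if len(pvalues) != len(weights) or not pvalues:
        raise ValueError("need one weight per p-value")
    for p, w in zip(pvalues, weights):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        if not isinstance(w, int) or w <= 0 or w % 2:
            raise ValueError("this sketch assumes positive even integer weights")
    # Each p-value maps to a chi-square quantile; under the null the sum is
    # chi-square distributed with df equal to the sum of the weights.
    statistic = sum(chi2_isf_even(p, w) for p, w in zip(pvalues, weights))
    return chi2_sf_even(statistic, sum(weights))


# Hypothetical exon-level p-values for one gene, with illustrative weights.
gene_p = lancaster_pvalue([0.01, 0.2, 0.5], [2, 4, 2])
```

With every weight set to 2 the calculation reduces to Fisher's method, which is a convenient sanity check for an implementation like this one.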
Affiliation(s)
- Arfa Mehmood
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
  - Institute of Biomedicine, University of Turku, Turku, Finland
- Asta Laiho
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- Laura L Elo
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
  - Institute of Biomedicine, University of Turku, Turku, Finland
|
4
|
Chen J, Mi X, Ning J, He X, Hu J. A tail-based test to detect differential expression in RNA-sequencing data. Stat Methods Med Res 2020; 30:261-276. [PMID: 32867604 DOI: 10.1177/0962280220951907] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and other studies. Such data at the exon level are usually heavily tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for within-sample dependence among the exons through a specified correlation structure. Through Monte Carlo simulation studies, we show that the proposed test is generally more powerful and robust in detecting differential expression than commonly used tests based on the mean or a single quantile. An application to TCGA lung adenocarcinoma data demonstrates the promise of the proposed method in terms of biomarker discovery.
Affiliation(s)
- Jiong Chen
  - Data Science, LinkedIn, Mountain View, CA, USA
- Xinlei Mi
  - Department of Biostatistics, Columbia University, New York, NY, USA
- Jing Ning
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Xuming He
  - Department of Statistics, University of Michigan at Ann Arbor, Ann Arbor, MI, USA
- Jianhua Hu
  - Department of Biostatistics, Columbia University, New York, NY, USA
|
5
|
Surlis C, Earley B, McGee M, Keogh K, Cormican P, Blackshields G, Tiernan K, Dunn A, Morrison S, Arguello A, Waters SM. Blood immune transcriptome analysis of artificially fed dairy calves and naturally suckled beef calves from birth to 7 days of age. Sci Rep 2018; 8:15461. [PMID: 30337646 PMCID: PMC6194081 DOI: 10.1038/s41598-018-33627-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/10/2018] [Accepted: 09/03/2018] [Indexed: 01/03/2023] Open
Abstract
Neonatal calves possess a very immature and naïve immune system and are reliant on the intake of maternal colostrum for passive transfer of immunoglobulins. Variation in colostrum management of beef and dairy calves is thought to affect early immune development. Therefore, the objective of this study was to examine changes in gene expression and investigate molecular pathways involved in the immune-competence development of neonatal Holstein dairy calves and naturally suckled beef calves using next generation RNA-sequencing during the first week of life. Jugular whole blood samples were collected from Holstein (H) dairy calves (n = 8) artificially fed 5% B.W. colostrum, and from beef calves which were the progenies of Charolais-Limousin (CL; n = 7) and Limousin-Friesian beef suckler cows (LF; n = 7), for subsequent RNA isolation. In dairy calves, there was a surge in pro-inflammatory cytokine gene expression, possibly due to the stress of separation from the dam. LF calves exhibited early signs of humoral immune development, with observed increases in the expression of genes coding for Ig receptors, which was not evident in the other breeds by 7 days of age. Immune and health related DEGs identified as upregulated in beef calves are prospective contender genes for the classification of biomarkers for immune-competence development, and will contribute towards a greater understanding of the development of an immune response in neonatal calves.
Affiliation(s)
- C Surlis
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland.
- B Earley
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- M McGee
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- K Keogh
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- P Cormican
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- G Blackshields
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- K Tiernan
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- A Dunn
  - Sustainable Livestock, Agri-food and Bio-sciences Institute, BT26 6DR, Hillsborough, United Kingdom
- S Morrison
  - Sustainable Livestock, Agri-food and Bio-sciences Institute, BT26 6DR, Hillsborough, United Kingdom
- A Arguello
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland
- S M Waters
  - Teagasc Animal and Bioscience Research Department, Grange, Dunsany, Meath, Ireland.
|
6
|
Keogh K, Waters SM, Cormican P, Kelly AK, O’Shea E, Kenny DA. Effect of dietary restriction and subsequent re-alimentation on the transcriptional profile of bovine ruminal epithelium. PLoS One 2017; 12:e0177852. [PMID: 28545102 PMCID: PMC5435337 DOI: 10.1371/journal.pone.0177852] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/14/2016] [Accepted: 05/04/2017] [Indexed: 11/19/2022] Open
Abstract
Compensatory growth (CG) is utilised worldwide in beef production systems as a management approach to reduce feed costs. However, the underlying biology regulating the expression of CG remains to be fully elucidated. The objective of this study was to examine the effect of dietary restriction and subsequent re-alimentation induced CG on the global gene expression profile of ruminal epithelial papillae. Holstein Friesian bulls (n = 60) were assigned to one of two groups: (i) restricted feed allowance (RES; n = 30) for 125 days (Period 1) followed by ad libitum access to feed for 55 days (Period 2) or (ii) ad libitum access to feed throughout (ADLIB; n = 30). At the end of each period, 15 animals from each treatment were slaughtered and rumen papillae harvested. mRNA was isolated from all papillae samples collected. cDNA libraries were then prepared and sequenced. Resultant reads were subsequently analysed bioinformatically, and differentially expressed genes (DEGs) were defined as having a Benjamini-Hochberg P value of <0.05. During re-alimentation in Period 2, RES animals displayed CG, growing at 1.8 times the rate of their ADLIB contemporary animals (P < 0.001). At the end of Period 1, 64 DEGs were identified between RES and ADLIB, with only one DEG identified at the end of Period 2. When analysed within RES treatment (RES, Period 2 v Period 1), 411 DEGs were evident. Genes identified as differentially expressed in response to both dietary restriction and subsequent CG included those involved in processes such as cellular interactions and transport, protein folding and gene expression, as well as immune response. This study provides an insight into the molecular mechanisms underlying the expression of CG in rumen papillae of cattle; however, the results suggest that the role of the ruminal epithelium in supporting overall animal CG may have declined by day 55 of re-alimentation.
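The DEG threshold in this abstract rests on Benjamini-Hochberg adjusted P values. The following is a minimal sketch of that adjustment in plain Python, included only to show what the cut-off means. The function name and the example p-values are hypothetical, and the authors' actual software is not stated in the abstract.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
# Genes called differentially expressed at the abstract's cut-off of 0.05.
degs = [i for i, q in enumerate(adj) if q < 0.05]
```

In this toy example only the first gene passes the 0.05 cut-off after adjustment, even though three of the four raw p-values are below 0.05.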
Affiliation(s)
- Kate Keogh
  - Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, Co. Meath, Ireland
- Sinead M. Waters
  - Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, Co. Meath, Ireland
- Paul Cormican
  - Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, Co. Meath, Ireland
- Alan K. Kelly
  - School of Agriculture and Food Science, University College Dublin, Belfield, Dublin, Ireland
- Emma O’Shea
  - Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, Co. Meath, Ireland
- David A. Kenny
  - Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, Co. Meath, Ireland
  - * E-mail:
|
7
|
The centrosomal OFD1 protein interacts with the translation machinery and regulates the synthesis of specific targets. Sci Rep 2017; 7:1224. [PMID: 28450740 PMCID: PMC5430665 DOI: 10.1038/s41598-017-01156-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 09/21/2016] [Accepted: 03/08/2017] [Indexed: 01/03/2023] Open
Abstract
Protein synthesis is traditionally associated with specific cytoplasmic compartments. We now show that OFD1, a centrosomal/basal body protein, interacts with components of the Preinitiation complex of translation (PIC) and of the eukaryotic Initiation Factor (eIF)4F complex and modulates the translation of specific mRNA targets in the kidney. We demonstrate that OFD1 cooperates with the mRNA binding protein Bicc1 to functionally control the protein synthesis machinery at the centrosome where also the PIC and eIF4F components were shown to localize in mammalian cells. Interestingly, Ofd1 and Bicc1 are both involved in renal cystogenesis and selected targets were shown to accumulate in two models of inherited renal cystic disease. Our results suggest a possible role for the centrosome as a specialized station to modulate translation for specific functions of the nearby ciliary structures and may provide functional clues for the understanding of renal cystic disease.
|
8
|
Accurate Detection of Differential Expression and Splicing Using Low-Level Features. Methods in Molecular Biology (Clifton, N.J.) 2016; 1507:141-151. [PMID: 27832538 DOI: 10.1007/978-1-4939-6518-2_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Abstract
Gene expression can be quantified in high throughput using microarray technology. Here we describe how to accurately detect differential expression and splicing using a probe-level expression change averaging (PECA) method. PECA is available as an R package from Bioconductor ( https://www.bioconductor.org ), and it supports multiple operating systems.
|
9
|
Mohamed TMA, Abou-Leisa R, Stafford N, Maqsood A, Zi M, Prehar S, Baudoin-Stanley F, Wang X, Neyses L, Cartwright EJ, Oceandy D. The plasma membrane calcium ATPase 4 signalling in cardiac fibroblasts mediates cardiomyocyte hypertrophy. Nat Commun 2016; 7:11074. [PMID: 27020607 PMCID: PMC4820544 DOI: 10.1038/ncomms11074] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 09/29/2015] [Accepted: 02/17/2016] [Indexed: 12/26/2022] Open
Abstract
The heart responds to pathological overload through myocyte hypertrophy. Here we show that this response is regulated by cardiac fibroblasts via a paracrine mechanism involving plasma membrane calcium ATPase 4 (PMCA4). Pmca4 deletion in mice, both systemically and specifically in fibroblasts, reduces the hypertrophic response to pressure overload; however, knocking out Pmca4 specifically in cardiomyocytes does not produce this effect. Mechanistically, cardiac fibroblasts lacking PMCA4 produce higher levels of secreted frizzled related protein 2 (sFRP2), which inhibits the hypertrophic response in neighbouring cardiomyocytes. Furthermore, we show that treatment with the PMCA4 inhibitor aurintricarboxylic acid (ATA) inhibits and reverses cardiac hypertrophy induced by pressure overload in mice. Our results reveal that PMCA4 regulates the development of cardiac hypertrophy and provide proof of principle for a therapeutic approach to treat this condition.
Affiliation(s)
- Tamer M A Mohamed
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
  - J David Gladstone Research Institutes, San Francisco, California 94158, USA
  - Faculty of Pharmacy, Zagazig University, Zagazig 44519, Egypt
- Riham Abou-Leisa
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Nicholas Stafford
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Arfa Maqsood
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Min Zi
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Sukhpal Prehar
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Florence Baudoin-Stanley
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Xin Wang
  - Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Ludwig Neyses
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Elizabeth J Cartwright
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
- Delvac Oceandy
  - Institute of Cardiovascular Sciences, University of Manchester, AV Hill Building, Manchester M13 9PT, UK
|
10
|
Effect of dietary restriction and subsequent re-alimentation on the transcriptional profile of hepatic tissue in cattle. BMC Genomics 2016; 17:244. [PMID: 26984536 PMCID: PMC4794862 DOI: 10.1186/s12864-016-2578-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 11/12/2015] [Accepted: 03/08/2016] [Indexed: 12/12/2022] Open
Abstract
Background Compensatory growth (CG) is an accelerated growth phenomenon observed in animals upon re-alimentation following a period of dietary restriction. It is typically utilised in livestock systems to reduce feed costs during periods of reduced feed availability. The biochemical mechanisms controlling this phenomenon, however, are yet to be elucidated. This study aimed to uncover the molecular mechanisms regulating the hepatic expression of CG in cattle, utilising RNAseq. RNAseq was performed on hepatic tissue of bulls following 125 days of dietary restriction (RES) and again following 55 days of subsequent re-alimentation during which the animals exhibited significant CG. The data were compared with those of control animals offered the same diet on an ad libitum basis throughout (ADLIB). Elucidation of the molecular control of CG may yield critical information on genes and pathways which could be targeted as putative molecular biomarkers for the selection of animals with improved CG potential. Results Following a period of differential feeding, body-weight and liver weight were 161 and 4 kg higher, respectively, for ADLIB compared with RES animals. At this time RNAseq analysis of liver tissue revealed 1352 significantly differentially expressed genes (DEG) between the two treatments. DEGs indicated down-regulation of processes including nutrient transport, cell division and proliferation in RES. In addition, protein synthesis genes were up-regulated in RES following a period of restricted feeding. The subsequent 55 days of ad libitum feeding for both groups resulted in the body-weight difference reduced to 84 kg, with no difference in liver weight between treatment groups. At the end of 55 days of unrestricted feeding, 49 genes were differentially expressed between animals undergoing CG and their continuously fed counterparts. In particular, hepatic expression of cell proliferation and growth genes were greater in animals undergoing CG. 
Conclusions Greater expression of cell cycle and cell proliferation genes during CG was associated with a 100% recovery of liver weight during re-alimentation. Additionally, an apparent up-regulation in capacity for cellular protein synthesis during restricted feeding may contribute to and sustain CG during re-alimentation. DEGs identified are potential candidate genes for the identification of biomarkers for CG, which may be incorporated into future breeding programmes.
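The body-weight figures reported in the Results (a 161 kg gap after restriction, narrowing to 84 kg after re-alimentation) can be turned into a degree of compensation. The sketch below uses a commonly cited compensation index, the percentage of the restriction-induced weight gap that is recovered. That formula is an assumption here, since the abstract does not state how compensation was quantified, and the function name is hypothetical.

```python
def compensation_index(gap_after_restriction, gap_after_realimentation):
    """Percentage of the weight gap recovered during re-alimentation.

    Assumed definition: 100 * (gap_before - gap_after) / gap_before.
    """
    if gap_after_restriction <= 0:
        raise ValueError("the initial weight gap must be positive")
    recovered = gap_after_restriction - gap_after_realimentation
    return 100.0 * recovered / gap_after_restriction


# Weight gaps (kg) between ADLIB and RES animals, taken from the abstract.
index = compensation_index(161, 84)  # about 47.8 percent
```

Under this assumed definition, roughly half of the body-weight deficit had been recovered after 55 days, while liver weight had recovered completely.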
|
11
|
Effect of Dietary Restriction and Subsequent Re-Alimentation on the Transcriptional Profile of Bovine Skeletal Muscle. PLoS One 2016; 11:e0149373. [PMID: 26871690 PMCID: PMC4752344 DOI: 10.1371/journal.pone.0149373] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/19/2014] [Accepted: 02/01/2016] [Indexed: 11/19/2022] Open
Abstract
Compensatory growth (CG), an accelerated growth phenomenon which occurs following a period of dietary restriction, is exploited worldwide in animal production systems as a method to lower feed costs. However, the molecular mechanisms regulating CG expression remain to be elucidated fully. This study aimed to uncover the underlying biology regulating CG in cattle, through an examination of skeletal muscle transcriptional profiles utilising next generation mRNA sequencing technology. Twenty Holstein Friesian bulls were fed either a restricted diet for 125 days, with a target growth rate of 0.6 kg/day (Period 1), following which they were allowed feed ad libitum for a further 55 days (Period 2), or fed ad libitum for the entirety of the trial. M. longissimus dorsi biopsies were harvested from all bulls on days 120 and 15 of periods 1 and 2, respectively, and RNAseq analysis was performed. During re-alimentation in Period 2, previously restricted animals displayed CG, growing at 1.8 times the rate of the ad libitum control animals. Compensating animals were also more feed efficient during re-alimentation and compensated for 48% of their previous dietary restriction. 1,430 and 940 genes were identified as significantly differentially expressed (Benjamini Hochberg adjusted P < 0.1) in periods 1 and 2, respectively. Additionally, 2,237 genes were differentially expressed in animals undergoing CG relative to dietary restriction. Dietary restriction in Period 1 was associated with altered expression of genes involved in lipid metabolism and energy production. CG expression in Period 2 occurred in association with greater expression of genes involved in cellular function and organisation. This study highlights some of the molecular mechanisms regulating CG in cattle. Differentially expressed genes identified are potential candidate genes for the identification of biomarkers for CG and feed efficiency, which may be incorporated into future breeding programmes.
|
12
|
Zytynska SE, Jourdie V, Naseeb S, Delneri D, Preziosi RF. Induced expression of defence-related genes in barley is specific to aphid genotype. Biol J Linn Soc Lond 2015. [DOI: 10.1111/bij.12715] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Sharon E. Zytynska
  - Terrestrial Ecology Research Group; Department of Ecology and Ecosystem Management; School of Life Sciences Weihenstephan; Technische Universität München; Hans-Carl-von-Carlowitz-Platz 2 85354 Freising Germany
  - Faculty of Life Sciences; The University of Manchester; Oxford Road M13 9PT Manchester UK
- Violaine Jourdie
  - Faculty of Life Sciences; The University of Manchester; Oxford Road M13 9PT Manchester UK
- Samina Naseeb
  - Faculty of Life Sciences; The University of Manchester; Oxford Road M13 9PT Manchester UK
- Daniela Delneri
  - Faculty of Life Sciences; The University of Manchester; Oxford Road M13 9PT Manchester UK
- Richard F. Preziosi
  - Faculty of Life Sciences; The University of Manchester; Oxford Road M13 9PT Manchester UK
|
13
|
Liu X, Shi X, Chen C, Zhang L. Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate. BMC Bioinformatics 2015; 16:332. [PMID: 26475308 PMCID: PMC4609108 DOI: 10.1186/s12859-015-0750-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/24/2014] [Accepted: 09/24/2015] [Indexed: 12/05/2022] Open
Abstract
Background The high-throughput sequencing technology RNA-Seq has been widely used in recent years to quantify gene and isoform expression in studies of the transcriptome. Accurate expression measurement from the millions or billions of short reads generated is obstructed by two main difficulties. One is ambiguous mapping of reads to the reference transcriptome caused by alternative splicing, which increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of bias. This violates the assumption of uniform read distribution made by many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods base their bias correction on the surrounding sequence content and share one bias model across all genes. They therefore cannot estimate the gene- and isoform-specific biases revealed by recent studies. Results We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms from which reads originate and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives.
Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in the downstream analysis. Conclusions The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as freely available software, which can be found at https://github.com/PUGEA/NLDMseq.
Affiliation(s)
- Xuejun Liu
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Jiangjun Rd., Nanjing, 211106, China.
- Xinxin Shi
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Jiangjun Rd., Nanjing, 211106, China.
- Chunlin Chen
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Jiangjun Rd., Nanjing, 211106, China.
- Li Zhang
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Jiangjun Rd., Nanjing, 211106, China.
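The NLDMseq abstract names the direct RPKM calculation as a baseline that assumes reads are spread uniformly along a transcript. As a minimal sketch of that baseline only (it is not NLDMseq), the following computes reads per kilobase per million mapped reads. The function name and the example numbers are illustrative assumptions.

```python
def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length_bp and total_mapped_reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
```

Because the formula divides by transcript length alone, any positional or sequence-specific bias in where reads fall passes straight into the estimate, which is the limitation the abstract sets out to address.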
|
14
|
Liu X, Zhang L, Chen S. Modeling Exon-Specific Bias Distribution Improves the Analysis of RNA-Seq Data. PLoS One 2015; 10:e0140032. [PMID: 26448625 PMCID: PMC4598124 DOI: 10.1371/journal.pone.0140032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2015] [Accepted: 09/21/2015] [Indexed: 11/29/2022]
Abstract
RNA-seq technology has become an important tool for quantifying gene and transcript expression in transcriptome studies. The two major difficulties in quantifying gene and transcript expression are read mapping ambiguity and the overdispersion of the read distribution along the reference sequence. Many approaches have been proposed to deal with these difficulties. A number of existing methods use the Poisson distribution to model the read counts, which makes it easy to split the counts into contributions from multiple transcripts. Meanwhile, various solutions have been put forward to account for the overdispersion in the Poisson models. By checking the similarities among the variation patterns of read counts for individual genes, we found that the count variation is exon-specific and has a conserved pattern across samples for each individual gene. We introduce Gamma-distributed latent variables to model the read sequencing preference for each exon. These variables are embedded in the rate parameter of a Poisson model to account for the overdispersion of the read distribution. The model is tractable since the Gamma priors can be integrated out in the maximum likelihood estimation. We evaluate the proposed approach, PGseq, using four real datasets and one simulated dataset, and compare its performance with other popular methods. Results show that PGseq performs competitively against the alternatives in terms of accuracy in gene and transcript expression calculation and in downstream differential expression analysis. In particular, we show the advantage of our method in the analysis of low expression.
Affiliation(s)
- Xuejun Liu
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Li Zhang
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Songcan Chen
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
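The PGseq abstract states that Gamma-distributed latent variables are placed in the Poisson rate and can be integrated out. As a hedged illustration of why that step is tractable, the sketch below evaluates the closed-form marginal of a count y ~ Poisson(w * mu) with w ~ Gamma(shape, rate), which is a negative binomial distribution. The parameterization, the function name and the example values are assumptions. The abstract does not give PGseq's actual likelihood.

```python
import math


def gamma_poisson_logpmf(y: int, mu: float, shape: float, rate: float) -> float:
    """log P(y) after integrating w ~ Gamma(shape, rate) out of Poisson(w * mu)."""
    return (
        math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
        + shape * math.log(rate / (rate + mu))
        + y * math.log(mu / (rate + mu))
    )


# Hypothetical settings: E[w] = shape / rate = 1, so the marginal mean equals mu.
mu, shape, rate = 10.0, 2.0, 2.0
pmf = [math.exp(gamma_poisson_logpmf(y, mu, shape, rate)) for y in range(2000)]
total_mass = sum(pmf)
marginal_mean = sum(y * p for y, p in enumerate(pmf))
marginal_var = sum((y - marginal_mean) ** 2 * p for y, p in enumerate(pmf))
```

With these settings the marginal variance exceeds the mean, so the overdispersion arises from the latent variable rather than from a separate correction term.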
|
15
|
Keogh K, Kenny DA, Kelly AK, Waters SM. Insulin secretion and signaling in response to dietary restriction and subsequent re-alimentation in cattle. Physiol Genomics 2015; 47:344-54. [PMID: 26015430 DOI: 10.1152/physiolgenomics.00002.2015] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 01/08/2015] [Accepted: 05/22/2015] [Indexed: 01/04/2023]
Abstract
The objectives of this study were to examine systemic insulin response to a glucose tolerance test (GTT) and transcript abundance of genes of the insulin signaling pathway in skeletal muscle, during both dietary restriction and re-alimentation-induced compensatory growth. Holstein Friesian bulls were blocked to one of two groups: 1) restricted feed allowance for 125 days (period 1) (RES, n = 15) followed by ad libitum feeding for 55 days (period 2) or 2) ad libitum access to feed throughout (periods 1 and 2) (ADLIB, n = 15). On days 90 and 36 of periods 1 and 2, respectively, a GTT was performed. M. longissimus dorsi biopsies were harvested from all bulls on days 120 and 15 of periods 1 and 2, respectively, and RNA-Seq analysis was performed. RES displayed a lower growth rate during period 1 (RES: 0.6 kg/day, ADLIB: 1.9 kg/day; P < 0.001), subsequently gaining more during re-alimentation (RES: 2.5 kg/day, ADLIB: 1.4 kg/day; P < 0.001). Systemic insulin response to glucose administration was lower in RES in period 1 (P < 0.001) with no difference observed during period 2. The insulin signaling pathway in M. longissimus dorsi was enriched (P < 0.05) in response to dietary restriction but not during re-alimentation (P > 0.05). Genes differentially expressed in the insulin signaling pathway suggested a greater sensitivity to insulin in skeletal muscle, with pleiotropic effects of insulin signaling interrupted during dietary restriction. Collectively, these results indicate increased sensitivity to glucose clearance and skeletal muscle insulin signaling during dietary restriction; however, no overall role for insulin was apparent in expressing compensatory growth.
Affiliation(s)
- Kate Keogh
- Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Dunsany, County Meath, Ireland; and UCD School of Agriculture and Food Science, Belfield, Dublin, Ireland
- David A Kenny
- Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Dunsany, County Meath, Ireland
- Alan K Kelly
- UCD School of Agriculture and Food Science, Belfield, Dublin, Ireland
- Sinéad M Waters
- Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Dunsany, County Meath, Ireland
|
16
|
Glaab E, Schneider R. RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis. Bioinformatics 2015; 31:2235-7. [PMID: 25717197 PMCID: PMC4481852 DOI: 10.1093/bioinformatics/btv127] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/26/2014] [Accepted: 02/22/2015] [Indexed: 11/13/2022]
Abstract
Summary: High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses. We introduce RepExplore, a web service dedicated to exploiting the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing fully automated data processing and interactive ranking tables, whisker plot, heat map and principal component analysis visualizations to interpret omics data and derived statistics. Availability and implementation: Freely available at http://www.repexplore.tk Contact: enrico.glaab@uni.lu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Enrico Glaab
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
|
17
|
Walsh MJ, Cooper-Knock J, Dodd JE, Stopford MJ, Mihaylov SR, Kirby J, Shaw PJ, Hautbergue GM. Invited review: decoding the pathophysiological mechanisms that underlie RNA dysregulation in neurodegenerative disorders: a review of the current state of the art. Neuropathol Appl Neurobiol 2015; 41:109-34. [PMID: 25319671 PMCID: PMC4329338 DOI: 10.1111/nan.12187] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 06/06/2014] [Accepted: 10/07/2014] [Indexed: 12/12/2022]
Abstract
Altered RNA metabolism is a key pathophysiological component causing several neurodegenerative diseases. Genetic mutations causing neurodegeneration occur in coding and noncoding regions of seemingly unrelated genes whose products do not always contribute to the gene expression process. Several pathogenic mechanisms may coexist within a single neuronal cell, including RNA/protein toxic gain-of-function and/or protein loss-of-function. Genetic mutations that cause neurodegenerative disorders disrupt healthy gene expression at diverse levels, from chromatin remodelling, transcription, splicing, through to axonal transport and repeat-associated non-ATG (RAN) translation. We address neurodegeneration in repeat expansion disorders [Huntington's disease, spinocerebellar ataxias, C9ORF72-related amyotrophic lateral sclerosis (ALS)] and in diseases caused by deletions or point mutations (spinal muscular atrophy, most subtypes of familial ALS). Some neurodegenerative disorders exhibit broad dysregulation of gene expression with the synthesis of hundreds to thousands of abnormal messenger RNA (mRNA) molecules. However, the number and identity of aberrant mRNAs that are translated into proteins - and how these lead to neurodegeneration - remain unknown. The field of RNA biology research faces the challenge of identifying pathophysiological events of dysregulated gene expression. In conclusion, we discuss current research limitations and future directions to improve our characterization of pathological mechanisms that trigger disease onset and progression.
Affiliation(s)
- M J Walsh
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- J Cooper-Knock
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- J E Dodd
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- M J Stopford
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- S R Mihaylov
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- J Kirby
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- P J Shaw
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
- G M Hautbergue
- Sheffield Institute for Translational Neuroscience (SITraN), Department of Neuroscience, University of Sheffield, Sheffield, UK
|
18
|
Laiho A, Elo LL. A note on an exon-based strategy to identify differentially expressed genes in RNA-seq experiments. PLoS One 2014; 9:e115964. [PMID: 25541961 PMCID: PMC4277429 DOI: 10.1371/journal.pone.0115964] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/15/2014] [Accepted: 12/03/2014] [Indexed: 11/25/2022]
Abstract
RNA-sequencing (RNA-seq) has rapidly become the method of choice in many genome-wide transcriptomic studies. To meet the high expectations posed by this technology, powerful computational techniques are needed to translate the measurements into biological and biomedical understanding. A number of statistical procedures have already been developed to identify differentially expressed genes between distinct sample groups. With these methods statistical testing is typically performed after the data has been summarized at the gene level. As an alternative strategy, developed with the aim to improve the results, we demonstrate a method in which statistical testing at the exon level is performed prior to the summary of the results at the gene level. Using publicly available RNA-seq datasets as case studies, we illustrate how this exon-based strategy can improve the performance of the widely used differential expression software packages as compared to the conventional gene-based strategy. In particular, we show how it enables robust detection of moderate but systematic changes that are missed when relying on single gene-level summary counts only.
Affiliation(s)
- Asta Laiho
- Turku Centre for Biotechnology, University of Turku and Åbo Akademi University, Turku, Finland
- Laura L. Elo
- Turku Centre for Biotechnology, University of Turku and Åbo Akademi University, Turku, Finland
- Department of Mathematics and Statistics, University of Turku, Turku, Finland
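The Laiho and Elo abstract describes testing at the exon level before summarizing at the gene level. The sketch below illustrates that ordering under stated assumptions: a Welch t-statistic is computed per exon, and the gene score is the median of the exon statistics. Both choices, along with the function names and the toy values, are assumptions made for illustration. The abstract does not specify the test or the summary used.

```python
import math
from statistics import mean, median, variance


def welch_t(group_a, group_b):
    """Welch t-statistic for one exon (assumes non-zero within-group variance)."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def gene_score(exons):
    """exons: list of (group_a_values, group_b_values) pairs, one per exon."""
    # Test each exon first, then summarize the exon-level statistics per gene.
    return median(welch_t(a, b) for a, b in exons)


# Hypothetical log2 expression values for a gene with three exons.
toy_gene = [
    ([2.0, 2.2, 2.4], [1.0, 1.2, 1.4]),
    ([3.0, 3.2, 3.4], [2.0, 2.2, 2.4]),
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
]
toy_score = gene_score(toy_gene)
```

A moderate shift shared by most exons yields a large median statistic even when a single pooled gene-level count would dilute it, which is the kind of systematic change the abstract says the exon-based strategy can detect.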
|
19
|
Papastamoulis P, Hensman J, Glaus P, Rattray M. Improved variational Bayes inference for transcript expression estimation. Stat Appl Genet Mol Biol 2014; 13:203-16. [PMID: 24413218 DOI: 10.1515/sagmb-2013-0054] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
Abstract
RNA-seq studies allow for the quantification of transcript expression by aligning millions of short reads to a reference genome. However, transcripts share much of their sequence, so that many reads map to more than one place and their origin remains uncertain. This problem can be dealt with using mixtures of distributions, so that transcript expression estimation reduces to estimating the weights of the mixture. In this paper, variational Bayesian (VB) techniques are used to approximate the posterior distribution of transcript expression. VB has previously been shown to be more computationally efficient for this problem than Markov chain Monte Carlo. VB methodology can precisely estimate the posterior means, but leads to variance underestimation. For this reason, a novel approach is introduced which integrates the latent allocation variables out of the VB approximation. It is shown that this modification leads to a better marginal likelihood bound and an improved estimate of the posterior variance. A set of simulation studies and an application to real RNA-seq datasets highlight the improved performance of the proposed method.
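The Papastamoulis et al. abstract frames transcript expression as estimating the weights of a mixture when reads map to several transcripts. The sketch below shows that framing with plain maximum-likelihood EM, which is a simpler stand-in for the variational Bayes inference the paper develops and gives point estimates only. The function name, the compatibility-matrix input and the toy reads are assumptions.

```python
def em_mixture_weights(compat, n_iter=200):
    """Estimate mixture weights; compat[i][k] is P(read i | transcript k).

    Assumes every read is compatible with at least one transcript.
    """
    n_reads, n_tx = len(compat), len(compat[0])
    theta = [1.0 / n_tx] * n_tx  # start from uniform weights
    for _ in range(n_iter):
        expected = [0.0] * n_tx
        for row in compat:
            # E-step: posterior probability that this read came from each transcript
            joint = [t * p for t, p in zip(theta, row)]
            total = sum(joint)
            for k, j in enumerate(joint):
                expected[k] += j / total
        # M-step: weights are the expected fraction of reads per transcript
        theta = [e / n_reads for e in expected]
    return theta


# Hypothetical reads: 6 unique to transcript A, 2 unique to B, 4 shared by both.
toy_reads = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 2 + [[1.0, 1.0]] * 4
toy_weights = em_mixture_weights(toy_reads)
```

In this toy case the shared reads end up split in the same 3:1 ratio as the unique reads. EM returns no measure of uncertainty around the weights, and estimating that uncertainty is the gap the abstract's posterior variance work targets.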
|
20
|
Stephen GL, Lui S, Hamilton SA, Tower CL, Harris LK, Stevens A, Jones RL. Transcriptomic profiling of human choriodecidua during term labor: inflammation as a key driver of labor. Am J Reprod Immunol 2014; 73:36-55. [PMID: 25283845 DOI: 10.1111/aji.12328] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 07/07/2014] [Accepted: 09/05/2014] [Indexed: 01/10/2023]
Abstract
PROBLEM Inflammation is a driver of labor in myometrium and cervix; however, the involvement of decidua is poorly defined. We have reported decidual leukocyte infiltration prior to and during labor; the regulators of these inflammatory processes are unknown. METHOD OF STUDY Choriodecidua RNA obtained after term labor or elective cesarean delivery was applied to Affymetrix GeneChips. Pathway analysis and gene validation were performed. RESULTS Extensive inflammatory activation was identified in choriodecidua following labor, predominantly upregulation of genes regulating leukocyte trafficking and cytokine signalling. Genes governing cell fate, tissue remodelling, and translation were also altered. Upregulation of candidate genes (ICAM1, CXCR4, CD44, TLR4, SOCS3, BCL2A, and IDO) was confirmed. NFκB, STAT1&3, HMGB1, and miRNA-21, miRNA-46, miRNA-141, and miRNA-200 were predicted upstream regulators. CONCLUSION This study confirms inflammatory processes are major players in labor events in choriodecidua, as in other gestational tissues. Suppressing uterine inflammation is likely to be critical for arresting premature labor.
Affiliation(s)
- Gillian L Stephen
- Maternal and Fetal Health Research Centre, University of Manchester, Manchester, UK; St. Mary's Hospital, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
|
21
|
Aung HH, Tsoukalas A, Rutledge JC, Tagkopoulos I. A systems biology analysis of brain microvascular endothelial cell lipotoxicity. BMC Syst Biol 2014; 8:80. [PMID: 24993133 PMCID: PMC4112729 DOI: 10.1186/1752-0509-8-80] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 03/27/2014] [Accepted: 06/23/2014] [Indexed: 02/08/2023]
Abstract
Background: Neurovascular inflammation is associated with a number of neurological diseases including vascular dementia and Alzheimer’s disease, which are increasingly important causes of morbidity and mortality around the world. Lipotoxicity is a metabolic disorder that results from accumulation of lipids, particularly fatty acids, in non-adipose tissue leading to cellular dysfunction, lipid droplet formation, and cell death. Results: Our studies indicate for the first time that the neurovascular circulation also can manifest lipotoxicity, which could have major effects on cognitive function. The penetration of integrative systems biology approaches is limited in this area of research, which reduces our capacity to gain an objective insight into the signal transduction and regulation dynamics at a systems level. To address this question, we treated human microvascular endothelial cells with triglyceride-rich lipoprotein (TGRL) lipolysis products and then used genome-wide transcriptional profiling to obtain transcript abundances over four conditions. We then identified regulatory genes and their targets that have been differentially expressed through analysis of the datasets with various statistical methods. We created a functional gene network by exploiting co-expression observations through a guilt-by-association assumption. Concomitantly, we used various network inference algorithms to identify putative regulatory interactions and we integrated all predictions to construct a consensus gene regulatory network that is TGRL lipolysis product specific. Conclusion: Systems biology analysis has led to the validation of putative lipid-related targets and the discovery of several genes that may be implicated in lipotoxic-related brain microvascular endothelial cell responses. Here, we report that activating transcription factor 3 (ATF3) is a principal regulator of TGRL lipolysis products-induced gene expression in human brain microvascular endothelial cells.
Affiliation(s)
- Ilias Tagkopoulos
- UC Davis Genome Center, University of California, Davis, CA 95616, USA.
|
22
|
Green NH, Nicholls Z, Heath PR, Cooper-Knock J, Corfe BM, MacNeil S, Bury JP. Pulsatile exposure to simulated reflux leads to changes in gene expression in a 3D model of oesophageal mucosa. Int J Exp Pathol 2014; 95:216-28. [PMID: 24713057 DOI: 10.1111/iep.12083] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/07/2013] [Accepted: 03/07/2014] [Indexed: 01/11/2023]
Abstract
Oesophageal exposure to duodenogastroesophageal refluxate is implicated in the development of Barrett's metaplasia (BM), with increased risk of progression to oesophageal adenocarcinoma. The literature proposes that reflux exposure activates NF-κB, driving the aberrant expression of intestine-specific caudal-related homeobox (CDX) genes. However, early events in the pathogenesis of BM from normal epithelium are poorly understood. To investigate this, our study subjected a 3D model of the normal human oesophageal mucosa to repeated, pulsatile exposure to specific bile components and examined changes in gene expression. Initial 2D experiments with a range of bile salts observed that taurochenodeoxycholate (TCDC) impacted upon NF-κB activation without causing cell death. Informed by this, the 3D oesophageal model was repeatedly exposed to TCDC in the presence and absence of acid, and the epithelial cells underwent gene expression profiling. We identified ~300 differentially expressed genes following each treatment, with a large and significant overlap between treatments. Enrichment analysis (Broad GSEA, DAVID and Metacore™; GeneGo Inc) identified multiple gene sets related to cell signalling, inflammation, proliferation, differentiation and cell adhesion. Specifically NF-κB activation, Wnt signalling, cell adhesion and targets for the transcription factors PTF1A and HNF4α were highlighted. Our data suggest that HNF4α isoform switching may be an early event in Barrett's pathogenesis. CDX1/2 targets were, however, not enriched, suggesting that although CDX1/2 activation reportedly plays a role in BM development, it may not be an initial event. Our findings highlight new areas for investigation in the earliest stages of BM pathogenesis of oesophageal diseases and new potential therapeutic targets.
Affiliation(s)
- Nicola H Green
- Kroto Research Institute, North Campus, University of Sheffield, Sheffield, UK
|
23
|
Killeen AP, Morris DG, Kenny DA, Mullen MP, Diskin MG, Waters SM. Global gene expression in endometrium of high and low fertility heifers during the mid-luteal phase of the estrous cycle. BMC Genomics 2014; 15:234. [PMID: 24669966 PMCID: PMC3986929 DOI: 10.1186/1471-2164-15-234] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 02/07/2013] [Accepted: 03/14/2014] [Indexed: 01/01/2023]
Abstract
BACKGROUND In both beef and dairy cattle, the majority of early embryo loss occurs within the first 14 days following insemination. During this time period, embryos are completely dependent on their maternal uterine environment for development, growth and ultimately survival; an optimum uterine environment is therefore critical to their survival. The objective of this study was to investigate whether differences in endometrial gene expression during the mid-luteal phase of the estrous cycle exist between crossbred beef heifers ranked as either high (HF) or low fertility (LF) (following four rounds of artificial insemination (AI)) using the Affymetrix® 23 K Bovine Gene Chip. RESULTS Conception rates for each of the four rounds of AI were within a normal range: 70-73.3%. Microarray analysis of endometrial tissue collected on day 7 of the estrous cycle detected 419 differentially expressed genes (DEG) between HF (n = 6) and LF (n = 6) animals. The main gene pathways affected were cellular growth and proliferation, angiogenesis, lipid metabolism, cellular and tissue morphology and development, inflammation and metabolic exchange. DEG included FST, SLC45A2, MMP19, FADS1 and GALNT6. CONCLUSIONS This study highlights some of the molecular mechanisms potentially controlling uterine endometrial function during the mid-luteal phase of the estrous cycle, which may contribute to uterine endometrial mediated impaired fertility in cattle. Differentially expressed genes are potential candidate genes for the identification of genetic variation influencing cow fertility, which may be incorporated into future breeding programmes.
Affiliation(s)
- Sinéad M Waters
- Teagasc, Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Grange, Dunsany, County Meath, Ireland.
|
24
|
Waters SM, Coyne GS, Kenny DA, Morris DG. Effect of dietary n-3 polyunsaturated fatty acids on transcription factor regulation in the bovine endometrium. Mol Biol Rep 2014; 41:2745-55. [PMID: 24449365 DOI: 10.1007/s11033-014-3129-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 04/03/2013] [Accepted: 01/11/2014] [Indexed: 11/25/2022]
Abstract
Dietary n-3 polyunsaturated fatty acid (n-3 PUFA) supplementation is postulated to have positive effects on fertility. The impact of dietary n-3 PUFA supplementation on physiological and biochemical processes involved in reproduction is likely to be associated with significant alterations in gene expression in key reproductive tissues which is in turn regulated by transcription factors. Beef heifers were supplemented with a rumen protected source of either a saturated fatty acid or high n-3 PUFA diet per animal per day for 45 days and uterine endometrial tissue was harvested post slaughter. A microarray analysis was conducted and bioinformatic tools were employed to evaluate the effect of n-3 PUFA supplementation on gene expression in the bovine endometrium. Clustering of microarray gene expression data was performed to identify co-expressed genes. Functional annotation of each cluster of genes was carried out using Ingenuity Pathway Analysis. Furthermore, oPOSSUM was employed to identify transcription factors involved in gene expression changes due to supplementary PUFA. Gene functions which showed a significant response to n-3 PUFA supplementation included tissue development, immune function and reproductive function. Numerous transcription factors such as FOXD1, FOXD3, NFKB1, ESR1, PGR, FOXA2, NKX3-1 and PPARα were identified as potential regulators of gene expression in the endometrium of cattle supplemented with n-3 PUFA. This study demonstrates the complex nature of the alterations in the transcriptional regulation process in the uterine endometrium of cattle following dietary supplementation which may positively influence the uterine environment.
Affiliation(s)
- Sinéad M Waters
- Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, Co. Meath, Ireland
|
25
|
Sun J, Keates S. Canonical correlation analysis on data with censoring and error information. IEEE Trans Neural Netw Learn Syst 2013; 24:1909-1919. [PMID: 24805211 DOI: 10.1109/tnnls.2013.2262949] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 06/03/2023]
Abstract
We developed a probabilistic model for canonical correlation analysis in the case when the associated datasets are incomplete. This case can arise where data entries either contain measurement errors or are censored (i.e., nonignorable missing) due to uncertainties in instrument calibration and physical limitations of devices and experimental conditions. The aim of our model is to estimate the true correlation coefficients by eliminating the effects of measurement errors and extracting helpful information from censored data. As exact inference is not possible for the proposed model, a modified variational Expectation-Maximization (EM) algorithm was developed. In this algorithm, we approximated the posteriors of the latent variables as normal distributions. In the experiments, the accuracy of the modified E-step approximation is first demonstrated empirically by comparison with hybrid Monte Carlo (HMC) sampling. Further experiments were carried out on synthetic datasets with different numbers of censored data and different correlation coefficient settings to compare the proposed algorithm with a maximum a posteriori (MAP) solution and a Markov Chain-EM solution. Experimental results showed that the variational EM solution compares favorably against the MAP solution, approaching the accuracy of the Markov Chain-EM, while maintaining computational simplicity. We finally applied the proposed algorithm to finding the properties of galaxy groups most correlated with X-ray luminosity.
|
26
|
Turro E, Astle WJ, Tavaré S. Flexible analysis of RNA-seq data using mixed effects models. Bioinformatics 2013; 30:180-8. [PMID: 24281695 DOI: 10.1093/bioinformatics/btt624] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 11/12/2022]
Abstract
MOTIVATION Most methods for estimating differential expression from RNA-seq are based on statistics that compare normalized read counts between treatment classes. Unfortunately, reads are in general too short to be mapped unambiguously to features of interest, such as genes, isoforms or haplotype-specific isoforms. There are methods for estimating expression levels that account for this source of ambiguity. However, the uncertainty is not generally accounted for in downstream analysis of gene expression experiments. Moreover, at the individual transcript level, it can sometimes be too large to allow useful comparisons between treatment groups. RESULTS In this article we make two proposals that improve the power, specificity and versatility of expression analysis using RNA-seq data. First, we present a Bayesian method for model selection that accounts for read mapping ambiguities using random effects. This polytomous model selection approach can be used to identify many interesting patterns of gene expression and is not confined to detecting differential expression between two groups. For illustration, we use our method to detect imprinting, different types of regulatory divergence in cis and in trans and differential isoform usage, but many other applications are possible. Second, we present a novel collapsing algorithm for grouping transcripts into inferential units that exploits the posterior correlation between transcript expression levels. The aggregate expression levels of these units can be estimated with useful levels of uncertainty. Our algorithm can improve the precision of expression estimates when uncertainty is large with only a small reduction in biological resolution. AVAILABILITY AND IMPLEMENTATION We have implemented our software in the mmdiff and mmcollapse multithreaded C++ programs as part of the open-source MMSEQ package, available on https://github.com/eturro/mmseq.
Affiliation(s)
- Ernest Turro
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge CB2 0RE, UK, Department of Haematology, University of Cambridge, NHS Blood and Transplant, Long Road, Cambridge CB2 0PT, UK and Department of Epidemiology, Biostatistics and Occupational Health, McGill University, 1020 Pine Avenue West, Montreal QC H3A 1A2, Canada
27
Nardo G, Iennaco R, Fusi N, Heath PR, Marino M, Trolese MC, Ferraiuolo L, Lawrence N, Shaw PJ, Bendotti C. Transcriptomic indices of fast and slow disease progression in two mouse models of amyotrophic lateral sclerosis. Brain 2013; 136:3305-32. [PMID: 24065725 DOI: 10.1093/brain/awt250] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 12/20/2022]
Abstract
Amyotrophic lateral sclerosis is heterogeneous with high variability in the speed of progression even in cases with a defined genetic cause such as superoxide dismutase 1 (SOD1) mutations. We reported that SOD1(G93A) mice on distinct genetic backgrounds (C57 and 129Sv) show consistent phenotypic differences in speed of disease progression and life-span that are not explained by differences in human SOD1 transgene copy number or the burden of mutant SOD1 protein within the nervous system. We aimed to compare the gene expression profiles of motor neurons from these two SOD1(G93A) mouse strains to discover the molecular mechanisms contributing to the distinct phenotypes and to identify factors underlying fast and slow disease progression. Lumbar spinal motor neurons from the two SOD1(G93A) mouse strains were isolated by laser capture microdissection and transcriptome analysis was conducted at four stages of disease. We identified marked differences in the motor neuron transcriptome between the two mouse strains at disease onset, with a dramatic reduction of gene expression in the rapidly progressive (129Sv-SOD1(G93A)) compared with the slowly progressing mutant SOD1 mice (C57-SOD1(G93A)) (1276 versus 346; Q-value ≤ 0.01). Gene ontology pathway analysis of the transcriptional profile from 129Sv-SOD1(G93A) mice showed marked downregulation of specific pathways involved in mitochondrial function, as well as predicted deficiencies in protein degradation and axonal transport mechanisms. In contrast, the transcriptional profile from C57-SOD1(G93A) mice, which have the more benign disease course, revealed strong gene enrichment relating to immune system processes compared with 129Sv-SOD1(G93A) mice. Motor neurons from the more benign mutant strain demonstrated striking complement activation, over-expressing genes normally involved in immune cell function. We validated through immunohistochemistry increased expression of the C3 complement subunit and major histocompatibility complex I within motor neurons. In addition, we demonstrated that motor neurons from the slowly progressing mice activate a series of genes with neuroprotective properties such as angiogenin and the nuclear factor (erythroid-derived 2)-like 2 transcriptional regulator. In contrast, the faster progressing mice show dramatically reduced expression at disease onset of cell pathways involved in neuroprotection. This study highlights a set of key gene and molecular pathway indices of fast or slow disease progression which may prove useful in identifying potential disease modifiers responsible for the heterogeneity of human amyotrophic lateral sclerosis and which may represent valid therapeutic targets for ameliorating the disease course in humans.
Affiliation(s)
- Giovanni Nardo
- Laboratory of Molecular Neurobiology, Department of Neuroscience, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa, 19, 20156 Milan, Italy
28
Liu X, Gao Z, Zhang L, Rattray M. puma 3.0: improved uncertainty propagation methods for gene and transcript expression analysis. BMC Bioinformatics 2013; 14:39. [PMID: 23379655 PMCID: PMC3626802 DOI: 10.1186/1471-2105-14-39] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/13/2012] [Accepted: 01/18/2013] [Indexed: 11/10/2022]
Abstract
Background: Microarrays have been a popular tool for gene expression profiling at genome scale for over a decade due to their low cost, short turn-around time, excellent quantitative accuracy and ease of data generation. The Bioconductor package puma incorporates a suite of analysis methods for determining uncertainties from Affymetrix GeneChip data and propagating these uncertainties to downstream analysis. As isoform-level expression profiling has received increasing interest within genomics in recent years, exon microarray technology offers an important tool to quantify the expression level of the majority of exons and makes it possible to measure isoform-level expression. However, puma does not include methods for the analysis of exon array data. Moreover, the current expression summarisation method for Affymetrix 3’ GeneChip data suffers from instability for low expression genes. For the downstream analysis, the method for differential expression detection is computationally intensive and the original expression clustering method does not consider the variance across the replicated technical and biological measurements. It is therefore necessary to develop improved uncertainty propagation methods for gene and transcript expression analysis. Results: We extend the previously developed Bioconductor package puma with a new method especially designed for GeneChip Exon arrays and a set of improved downstream approaches. The improvements include: (i) a new gamma model for exon arrays which calculates isoform and gene expression measurements and a level of uncertainty associated with the estimates, using the multi-mappings between probes, isoforms and genes, (ii) a variant of the existing approach for the probe-level analysis of Affymetrix 3’ GeneChip data to produce more stable gene expression estimates, (iii) an improved method for detecting differential expression which is computationally more efficient than the existing approach in the package and (iv) an improved method for robust model-based clustering of gene expression, which takes technical and biological replicate information into consideration. Conclusions: With the extensions and improvements, the puma package is now applicable to the analysis of both Affymetrix 3’ GeneChips and Exon arrays for gene and isoform expression estimation. It propagates the uncertainty of expression measurements into more efficient and comprehensive downstream analysis at both gene and isoform level. Downstream methods are also applicable to other expression quantification platforms, such as RNA-Seq, when uncertainty information is available from expression measurements. puma is available through Bioconductor and can be found at http://www.bioconductor.org.
Affiliation(s)
- Xuejun Liu
- College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Yudao St., Nanjing 210016, China.
29
Brockington A, Ning K, Heath PR, Wood E, Kirby J, Fusi N, Lawrence N, Wharton SB, Ince PG, Shaw PJ. Unravelling the enigma of selective vulnerability in neurodegeneration: motor neurons resistant to degeneration in ALS show distinct gene expression characteristics and decreased susceptibility to excitotoxicity. Acta Neuropathol 2013; 125:95-109. [PMID: 23143228 PMCID: PMC3535376 DOI: 10.1007/s00401-012-1058-5] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.8] [Received: 06/25/2012] [Revised: 10/16/2012] [Accepted: 10/19/2012] [Indexed: 12/11/2022]
Abstract
A consistent clinical feature of amyotrophic lateral sclerosis (ALS) is the sparing of eye movements and the function of external sphincters, with corresponding preservation of motor neurons in the brainstem oculomotor nuclei, and of Onuf’s nucleus in the sacral spinal cord. Studying the differences in properties of neurons that are vulnerable and resistant to the disease process in ALS may provide insights into the mechanisms of neuronal degeneration, and identify targets for therapeutic manipulation. We used microarray analysis to determine the differences in gene expression between oculomotor and spinal motor neurons, isolated by laser capture microdissection from the midbrain and spinal cord of neurologically normal human controls. We compared these to transcriptional profiles of oculomotor nuclei and spinal cord from rat and mouse, obtained from the GEO omnibus database. We show that oculomotor neurons have a distinct transcriptional profile, with significant differential expression of 1,757 named genes (q < 0.001). Differentially expressed genes are enriched for the functional categories of synaptic transmission, ubiquitin-dependent proteolysis, mitochondrial function, transcriptional regulation, immune system functions, and the extracellular matrix. Marked differences are seen, across the three species, in genes with a function in synaptic transmission, including several glutamate and GABA receptor subunits. Using patch clamp recording in acute spinal and brainstem slices, we show that resistant oculomotor neurons show a reduced AMPA-mediated inward calcium current, and a higher GABA-mediated chloride current, than vulnerable spinal motor neurons. The findings suggest that reduced susceptibility to excitotoxicity, mediated in part through enhanced GABAergic transmission, is an important determinant of the relative resistance of oculomotor neurons to degeneration in ALS.
Affiliation(s)
- Alice Brockington
- Academic Neurology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Ke Ning
- Academic Neurology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Paul R. Heath
- Academic Neurology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Elizabeth Wood
- Academic Neurology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Janine Kirby
- Academic Neurology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Nicolò Fusi
- Computational Biology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Neil Lawrence
- Computational Biology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Stephen B. Wharton
- Academic Neuropathology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Paul G. Ince
- Academic Neuropathology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
- Pamela J. Shaw
- Academic Neurology Unit, Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, 385A Glossop Road, Sheffield, S10 2HQ UK
30
Jacklin N, Ding Z, Chen W, Chang C. Noniterative convex optimization methods for network component analysis. IEEE/ACM Trans Comput Biol Bioinform 2012; 9:1472-1481. [PMID: 22641712 DOI: 10.1109/tcbb.2012.81] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
This work studies the reconstruction of gene regulatory networks by means of network component analysis (NCA). We will expound a family of convex optimization-based methods for estimating the transcription factor control strengths and the transcription factor activities (TFAs). The approach taken in this work is to decompose the problem into a network connectivity strength estimation phase and a transcription factor activity estimation phase. In the control strength estimation phase, we formulate a new subspace-based method incorporating a choice of multiple error metrics. For the source estimation phase, we propose a total least squares (TLS) formulation that generalizes many existing methods. Both estimation procedures are noniterative and yield the optimal estimates according to various proposed error metrics. We test the performance of the proposed algorithms on simulated data and experimental gene expression data for the yeast Saccharomyces cerevisiae and demonstrate that the proposed algorithms have superior effectiveness in comparison with both Bayesian Decomposition (BD) and our previous FastNCA approach, while the computational complexity is still orders of magnitude less than BD.
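To illustrate what a total least squares fit means in the simplest possible setting, the sketch below fits a single slope through the origin while allowing errors in both variables. It is not the matrix formulation proposed in the paper; the function name `tls_slope` and the toy data are assumptions made for illustration only.

```python
import math

def tls_slope(xs, ys):
    """Slope a of the line y = a*x (through the origin) that minimizes the
    sum of squared orthogonal distances, i.e. scalar total least squares."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope is undefined when the cross-product sum is zero")
    d = syy - sxx
    # Root of sxy*a^2 + (sxx - syy)*a - sxy = 0 whose sign matches sxy.
    return (d + math.sqrt(d * d + 4 * sxy * sxy)) / (2 * sxy)

# Hypothetical noise-free data on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
slope = tls_slope(xs, ys)
print(slope)
```

Unlike ordinary least squares, which attributes all error to `ys`, this criterion treats both coordinates symmetrically, which is the property that motivates TLS when both sides of the model are measured with noise.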
Affiliation(s)
- Neil Jacklin
- Department of Electrical and Computer Engineering, University of California, Davis, CA 95616, USA.
31
Cooper-Knock J, Kirby J, Ferraiuolo L, Heath PR, Rattray M, Shaw PJ. Gene expression profiling in human neurodegenerative disease. Nat Rev Neurol 2012; 8:518-30. [PMID: 22890216 DOI: 10.1038/nrneurol.2012.156] [Citation(s) in RCA: 152] [Impact Index Per Article: 12.7] [Indexed: 12/12/2022]
Abstract
Transcriptome study in neurodegenerative disease has advanced considerably in the past 5 years. Increasing scientific rigour and improved analytical tools have led to more-reproducible data. Many transcriptome analysis platforms assay the expression of the entire genome, enabling a complete biological context to be captured. Gene expression profiling (GEP) is, therefore, uniquely placed to discover pathways of disease pathogenesis, potential therapeutic targets, and biomarkers. This Review summarizes microarray human GEP studies in the common neurodegenerative diseases amyotrophic lateral sclerosis (ALS), Parkinson disease (PD) and Alzheimer disease (AD). Several interesting reports have compared pathological gene expression in different patient groups, disease stages and anatomical areas. In all three diseases, GEP has revealed dysregulation of genes related to neuroinflammation. In ALS and PD, gene expression related to RNA splicing and protein turnover is disrupted, and several studies in ALS support involvement of the cytoskeleton. GEP studies have implicated the ubiquitin-proteasome system in PD pathogenesis, and have provided evidence of mitochondrial dysfunction in PD and AD. Lastly, in AD, a possible role for dysregulation of intracellular signalling pathways, including calcium signalling, has been highlighted. This Review also provides a discussion of methodological considerations in microarray sample preparation and data analysis.
Affiliation(s)
- Johnathan Cooper-Knock
- Academic Unit of Neurology, Sheffield Institute for Translational Neuroscience, University of Sheffield, 385A Glossop Road, Sheffield S10 2HQ, UK
32
Assessing numerical dependence in gene expression summaries with the jackknife expression difference. PLoS One 2012; 7:e39570. [PMID: 22876276 PMCID: PMC3411624 DOI: 10.1371/journal.pone.0039570] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/11/2011] [Accepted: 05/27/2012] [Indexed: 11/19/2022]
Abstract
Statistical methods to test for differential expression traditionally assume that each gene's expression summaries are independent across arrays. When certain preprocessing methods are used to obtain those summaries, this assumption is not necessarily true. In general, the erroneous assumption of dependence results in a loss of statistical power. We introduce a diagnostic measure of numerical dependence for gene expression summaries from any preprocessing method and discuss the relative performance of several common preprocessing methods with respect to this measure. Some common preprocessing methods introduce non-trivial levels of numerical dependence. The issue of (between-array) dependence has received little if any attention in the literature, and researchers working with gene expression data should not take such properties for granted, or they risk unnecessarily losing statistical power.
33
Waters SM, Coyne GS, Kenny DA, MacHugh DE, Morris DG. Dietary n-3 polyunsaturated fatty acid supplementation alters the expression of genes involved in the control of fertility in the bovine uterine endometrium. Physiol Genomics 2012; 44:878-88. [PMID: 22851761 DOI: 10.1152/physiolgenomics.00065.2011] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/21/2022]
Abstract
The potential for dietary supplementation with n-3 polyunsaturated fatty acids (n-3 PUFA) to improve reproductive efficiency in cattle has received much interest. The mechanisms by which n-3 PUFA may affect physiological and biochemical processes in key reproductive tissues are likely to be mediated by significant alterations in gene expression. The objective of this study was to examine the effects of dietary n-3 PUFA supplementation on global uterine endometrial gene expression in cattle. Beef heifers were supplemented with a rumen protected source of either a saturated fatty acid (CON; palmitic acid) or high n-3 PUFA (n-3 PUFA; 275 g) diet per animal per day for 45 days and global gene expression was determined in uterine endometrial tissue using an Affymetrix oligonucleotide bovine array. A total of 1,807 (946 up- and 861 downregulated) genes were differentially expressed following n-3 PUFA supplementation. Dietary n-3 PUFA altered numerous cellular processes potentially important in the control of reproduction in cattle. These included prostaglandin biosynthesis, steroidogenesis and transcriptional regulation, while effects on genes involved in maternal immune response and tissue remodeling were also observed. This study provides new insights into the effects of n-3 PUFA supplementation on the regulation of gene expression in the bovine uterus.
Affiliation(s)
- Sinéad M Waters
- Teagasc, Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Grange, Dunsany, Co. Meath, Ireland.
34
Glaus P, Honkela A, Rattray M. Identifying differentially expressed transcripts from RNA-seq data with biological variation. Bioinformatics 2012; 28:1721-8. [PMID: 22563066 PMCID: PMC3381971 DOI: 10.1093/bioinformatics/bts260] [Citation(s) in RCA: 152] [Impact Index Per Article: 12.7] [Received: 09/06/2011] [Revised: 04/21/2012] [Accepted: 04/27/2012] [Indexed: 11/12/2022]
Abstract
MOTIVATION: High-throughput sequencing enables expression analysis at the level of individual transcripts. The analysis of transcriptome expression levels and differential expression (DE) estimation requires a probabilistic approach to properly account for ambiguity caused by shared exons and finite read sampling as well as the intrinsic biological variance of transcript expression. RESULTS: We present Bayesian inference of transcripts from sequencing data (BitSeq), a Bayesian approach for estimation of transcript expression level from RNA-seq experiments. Inferred relative expression is represented by Markov chain Monte Carlo samples from the posterior probability distribution of a generative model of the read data. We propose a novel method for DE analysis across replicates which propagates uncertainty from the sample-level model while modelling biological variance using an expression-level-dependent prior. We demonstrate the advantages of our method using simulated data as well as an RNA-seq dataset with technical and biological replication for both studied conditions. AVAILABILITY: The implementation of the transcriptome expression estimation and differential expression analysis, BitSeq, has been written in C++ and Python. The software is available online from http://code.google.com/p/bitseq/; version 0.4 was used for generating the results presented in this article.
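As a rough illustration of how posterior samples can feed a differential expression call, the sketch below estimates the probability that a transcript is more highly expressed in one condition than another by comparing paired draws. This is a simplified stand-in, not BitSeq's actual procedure: it ignores replicates and the expression-level-dependent prior, and the function name `prob_higher` and the simulated draws are assumptions.

```python
import random

def prob_higher(samples_a, samples_b):
    """Fraction of paired posterior draws in which condition B exceeds condition A."""
    if len(samples_a) != len(samples_b):
        raise ValueError("need the same number of draws for both conditions")
    wins = sum(1 for a, b in zip(samples_a, samples_b) if b > a)
    return wins / len(samples_a)

rng = random.Random(0)
# Hypothetical stand-ins for MCMC draws of one transcript's log expression.
cond_a = [rng.gauss(2.0, 0.3) for _ in range(5000)]
cond_b = [rng.gauss(3.0, 0.3) for _ in range(5000)]

p = prob_higher(cond_a, cond_b)
print(round(p, 3))
```

A value near 1 or near 0 would indicate strong evidence of up- or downregulation, while a value near 0.5 would indicate that the posterior uncertainty swamps any difference.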
Affiliation(s)
- Peter Glaus
- School of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK.
35
Buler M, Aatsinki SM, Skoumal R, Komka Z, Tóth M, Kerkelä R, Georgiadi A, Kersten S, Hakkola J. Energy-sensing factors coactivator peroxisome proliferator-activated receptor γ coactivator 1-α (PGC-1α) and AMP-activated protein kinase control expression of inflammatory mediators in liver: induction of interleukin 1 receptor antagonist. J Biol Chem 2011; 287:1847-60. [PMID: 22117073 DOI: 10.1074/jbc.m111.302356] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 01/17/2023]
Abstract
Obesity and insulin resistance are associated with chronic, low grade inflammation. Moreover, regulation of energy metabolism and immunity are highly integrated. We hypothesized that energy-sensitive coactivator peroxisome proliferator-activated receptor γ coactivator 1-α (PGC-1α) and AMP-activated protein kinase (AMPK) may modulate inflammatory gene expression in liver. Microarray analysis revealed that PGC-1α up-regulated expression of several cytokines and cytokine receptors, including interleukin 15 receptor α (IL15Rα) and, even more importantly, anti-inflammatory interleukin 1 receptor antagonist (IL1Rn). Overexpression of PGC-1α and induction of PGC-1α by fasting, physical exercise, glucagon, or cAMP was associated with increased IL1Rn mRNA and protein expression in hepatocytes. Knockdown of PGC-1α by siRNA down-regulated cAMP-induced expression of IL1Rn in mouse hepatocytes. Furthermore, knockdown of peroxisome proliferator-activated receptor α (PPARα) attenuated IL1Rn induction by PGC-1α. Overexpression of PGC-1α, at least partially through IL1Rn, suppressed interleukin 1β-induced expression of acute phase proteins, C-reactive protein, and haptoglobin. Fasting and exercise also induced IL15Rα expression, whereas glucagon and cAMP resulted in reduction in IL15Rα mRNA levels. Finally, AMPK activator metformin and adenoviral overexpression of AMPK up-regulated IL1Rn and down-regulated IL15Rα in primary hepatocytes. We conclude that PGC-1α and AMPK alter inflammatory gene expression in liver and thus integrate energy homeostasis and inflammation. Induction of IL1Rn by PGC-1α and AMPK may be involved in the beneficial effects of exercise and caloric restriction and putative anti-inflammatory effects of metformin.
Affiliation(s)
- Marcin Buler
- Department of Pharmacology and Toxicology, Institute of Biomedicine, University of Oulu, POB 5000 (Aapistie 5B), 90014 Oulu, Finland
36
Posekany A, Felsenstein K, Sykacek P. Biological assessment of robust noise models in microarray data analysis. Bioinformatics 2011; 27:807-14. [PMID: 21252077 PMCID: PMC3051324 DOI: 10.1093/bioinformatics/btr018] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
Motivation: Although several recently proposed analysis packages for microarray data can cope with heavy-tailed noise, many applications rely on Gaussian assumptions. Gaussian noise models foster computational efficiency. This comes, however, at the expense of increased sensitivity to outlying observations. Assessing potential insufficiencies of Gaussian noise in microarray data analysis is thus important and of general interest. Results: To this end, we propose assessing different noise models on a large number of microarray experiments. The goodness of fit of noise models is quantified by a hierarchical Bayesian analysis of variance model, which predicts normalized expression values as a mixture of a Gaussian density and t-distributions with adjustable degrees of freedom. Inference of differentially expressed genes is taken into consideration at a second mixing level. To attain far-reaching validity, our investigations cover a wide range of analysis platforms and experimental settings. As the most striking result, we find, irrespective of the chosen preprocessing and normalization method, that in all experiments a heavy-tailed noise model is a better fit than a simple Gaussian. Further investigations revealed that an appropriate choice of noise model has a considerable influence on biological interpretations drawn at the level of inferred genes and gene ontology terms. We conclude from our investigation that neglecting the overdispersed noise in microarray data can mislead scientific discovery and suggest that the convenience of Gaussian-based modelling should be replaced by non-parametric approaches or other methods that account for heavy-tailed noise. Contact: peter.sykacek@boku.ac.at Availability: http://bioinf.boku.ac.at/alexp/robmca.html.
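The sensitivity of a Gaussian noise model to outliers can be seen in a small numerical sketch that compares the log-likelihood of a Gaussian and a Student t density on the same residuals. This is not the hierarchical Bayesian mixture model described in the abstract; the residual values, the plug-in location and scale estimates, and the choice of 3 degrees of freedom are all assumptions made for illustration.

```python
import math
import statistics

def gaussian_loglik(xs, mu, sigma):
    """Log-likelihood of xs under Normal(mu, sigma^2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    )

def student_t_loglik(xs, mu, scale, nu):
    """Log-likelihood of xs under a location-scale Student t with nu degrees of freedom."""
    const = (
        math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi) - math.log(scale)
    )
    return sum(
        const - (nu + 1) / 2 * math.log1p(((x - mu) / scale) ** 2 / nu)
        for x in xs
    )

# Hypothetical residuals: eight well-behaved values and one outlier.
residuals = [-0.2, 0.1, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 6.0]

# Gaussian fitted by maximum likelihood (mean and population standard deviation).
gauss_ll = gaussian_loglik(
    residuals, statistics.fmean(residuals), statistics.pstdev(residuals)
)

# Student t with robust plug-in location and scale; nu = 3 is an arbitrary choice.
center = statistics.median(residuals)
mad = statistics.median([abs(x - center) for x in residuals])
t_ll = student_t_loglik(residuals, center, 1.4826 * mad, nu=3)

print(t_ll > gauss_ll)
```

The single outlier inflates the fitted Gaussian variance and lowers the likelihood of every other point, whereas the heavier tails of the t density absorb it at a modest cost.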
Affiliation(s)
- A Posekany
- Department of Biotechnology, University of Natural Resources and Life Sciences, Vienna, Austria
37
Wang J, Jia M, Zhu L, Yuan Z, Li P, Chang C, Luo J, Liu M, Shi T. Systematical detection of significant genes in microarray data by incorporating gene interaction relationship in biological systems. PLoS One 2010; 5:e13721. [PMID: 21060778 PMCID: PMC2966410 DOI: 10.1371/journal.pone.0013721] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/18/2010] [Accepted: 10/05/2010] [Indexed: 02/02/2023]
Abstract
Many methods, including parametric, nonparametric, and Bayesian methods, have been used for detecting differentially expressed genes based on the assumption that biological systems are linear, which ignores the nonlinear characteristics of most biological systems. More importantly, those methods do not simultaneously consider means, variances, and higher moments, resulting in a relatively high false positive rate. To overcome these limitations, the SWang test is proposed to determine differentially expressed genes according to the equality of distributions between case and control. Our method not only latently incorporates functional relationships among genes to account for nonlinear biological systems but also considers the mean, variance, skewness, and kurtosis of expression profiles simultaneously. To illustrate the biological significance of higher moments, we construct a nonlinear gene interaction model, demonstrating that skewness and kurtosis could contain useful information about functional association among genes in microarrays. Simulations and real microarray results show that the false positive rate of SWang is lower than that of currently popular methods (T-test, F-test, SAM, and Fold-change), with much higher statistical power. Additionally, SWang can uniquely detect significant genes in real microarray data with imperceptible differential expression but greater variation in kurtosis and skewness. Those identified genes were confirmed with previously published literature or RT-PCR experiments performed in our lab.
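The four quantities named in the abstract (mean, variance, skewness, and kurtosis) can be computed from an expression profile as sketched below. This only shows the summary statistics themselves, not the SWang test statistic, whose definition is not given here; the function name `moments`, the use of population (biased) estimators, and the toy values are assumptions.

```python
def moments(xs):
    """Return (mean, variance, skewness, excess kurtosis) using population formulas."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return mean, m2, skewness, excess_kurtosis

# Two hypothetical profiles with the same mean but different shapes.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
case = [2.0, 2.0, 2.0, 2.0, 7.0]

control_stats = moments(control)
case_stats = moments(case)
print(control_stats)
print(case_stats)
```

Both profiles have mean 3, so a test based on means alone would see no difference, while the skewness separates them clearly (zero for the symmetric profile, positive for the one with a single high value).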
Affiliation(s)
- Junwei Wang
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Meiwen Jia
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Liping Zhu
- The College of Financial and Statistics, East China Normal University, Shanghai, China
- Zengjin Yuan
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Peng Li
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Chang Chang
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Jian Luo
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Mingyao Liu
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
- Tieliu Shi
- The Center for Bioinformatics and Computational Biology, The Institute of Biomedical Sciences, The School of Life Sciences, East China Normal University, Shanghai, China
38
McCarthy SD, Waters SM, Kenny DA, Diskin MG, Fitzpatrick R, Patton J, Wathes DC, Morris DG. Negative energy balance and hepatic gene expression patterns in high-yielding dairy cows during the early postpartum period: a global approach. Physiol Genomics 2010; 42A:188-99. [PMID: 20716645 PMCID: PMC3008362 DOI: 10.1152/physiolgenomics.00118.2010] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Indexed: 11/22/2022]
Abstract
In high-yielding dairy cows the liver undergoes extensive physiological and biochemical changes during the early postpartum period in an effort to re-establish metabolic homeostasis and to counteract the adverse effects of negative energy balance (NEB). These adaptations are likely to be mediated by significant alterations in hepatic gene expression. To gain new insights into these events an energy balance model was created using differential feeding and milking regimes to produce two groups of cows with either a mild (MNEB) or severe NEB (SNEB) status. Cows were slaughtered and liver tissues collected on days 6–7 of the first follicular wave postpartum. Using an Affymetrix 23k oligonucleotide bovine array to determine global gene expression in hepatic tissue of these cows, we found a total of 416 genes (189 up- and 227 downregulated) to be altered by SNEB. Network analysis using Ingenuity Pathway Analysis revealed that SNEB was associated with widespread changes in gene expression classified into 36 gene networks including those associated with lipid metabolism, connective tissue development and function, cell signaling, cell cycle, and metabolic diseases, the three most significant of which are discussed in detail. SNEB cows displayed reduced expression of transcription activators and signal transducers that regulate the expression of genes and gene networks associated with cell signaling and tissue repair. These alterations are linked with increased expression of abnormal cell cycle and cellular proliferation associated pathways. This study provides new information and insights on the effect of SNEB on gene expression in high-yielding Holstein Friesian dairy cows in the early postpartum period.
Affiliation(s)
- S D McCarthy
- Teagasc, Animal and Bioscience Research Department, Animal and Grassland Research and Innovation Centre, Mellows Campus, Athenry, County Galway, Ireland
|
39
|
Noyes HA, Agaba M, Anderson S, Archibald AL, Brass A, Gibson J, Hall L, Hulme H, Oh SJ, Kemp S. Genotype and expression analysis of two inbred mouse strains and two derived congenic strains suggest that most gene expression is trans regulated and sensitive to genetic background. BMC Genomics 2010; 11:361. [PMID: 20529291 PMCID: PMC2896378 DOI: 10.1186/1471-2164-11-361] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 07/20/2009] [Accepted: 06/07/2010] [Indexed: 11/24/2022]
Abstract
Background: Differences in gene expression may be caused by nearby DNA polymorphisms (cis regulation) or by interactions of gene control regions with polymorphic transcription factors (trans regulation). Trans acting loci are much harder to detect than cis acting loci and their effects are much more sensitive to genetic background. Results: To quantify cis and trans regulation we correlated haplotype data with gene expression in two inbred mouse strains and two derived congenic lines. Upstream haplotype differences between the parental strains suggested that 30-43% of differentially expressed genes were differentially expressed because of cis haplotype differences. These cis regulated genes displayed consistent and relatively tissue-independent differential expression. We independently estimated from the congenic mice that 71-85% of genes were trans regulated. Cis regulated genes were associated with low p values (p < 0.005) for differential expression, whereas trans regulated genes were associated with values 0.005 < p < 0.05. The genes differentially expressed between congenics and controls were not a subset of those that were differentially expressed between the founder lines, showing that these were dependent on genetic background. For example, the cholesterol synthesis pathway was strongly differentially expressed in the congenic mice by indirect trans regulation but this was not observable in the parental mice. Conclusions: The evidence that most gene regulation is trans and strongly influenced by genetic background suggests that pathways modified by an allelic variant may only exhibit differential expression in the specific genetic backgrounds in which they were identified. This has significant implications for the interpretation of any QTL mapping study.
Affiliation(s)
- Harry A Noyes
- School of Biological Sciences, University of Liverpool, Liverpool, UK.
|
40
|
Stevens JR, Bell JL, Aston KI, White KL. A comparison of probe-level and probeset models for small-sample gene expression data. BMC Bioinformatics 2010; 11:281. [PMID: 20504334 PMCID: PMC2901368 DOI: 10.1186/1471-2105-11-281] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/18/2009] [Accepted: 05/26/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Statistical methods to tentatively identify differentially expressed genes in microarray studies typically assume larger sample sizes than are practical or even possible in some settings. RESULTS: The performance of several probe-level and probeset models was assessed graphically and numerically using three spike-in datasets. Based on the Affymetrix GeneChip, a novel nested factorial model was developed and found to perform competitively on small-sample spike-in experiments. CONCLUSIONS: Statistical methods with test statistics related to the estimated log fold change tend to be more consistent in their performance on small-sample gene expression data. For such small-sample experiments, the nested factorial model can be a useful statistical tool. This method is implemented in freely-available R code (affyNFM), available with a tutorial document at http://www.stat.usu.edu/~jrstevens.
Affiliation(s)
- John R Stevens
- Department of Mathematics and Statistics, Utah State University, Logan, UT 84322, USA.
|
41
|
Sanchez-Calderon H, Rodriguez-de la Rosa L, Milo M, Pichel JG, Holley M, Varela-Nieto I. RNA microarray analysis in prenatal mouse cochlea reveals novel IGF-I target genes: implication of MEF2 and FOXM1 transcription factors. PLoS One 2010; 5:e8699. [PMID: 20111592 PMCID: PMC2810322 DOI: 10.1371/journal.pone.0008699] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Received: 06/02/2009] [Accepted: 12/18/2009] [Indexed: 01/19/2023]
Abstract
Background: Insulin-like growth factor-I (IGF-I) provides pivotal cell survival and differentiation signals during inner ear development throughout evolution. Homozygous mutations of human IGF1 cause syndromic sensorineural deafness, decreased intrauterine and postnatal growth rates, and mental retardation. In the mouse, deficits in IGF-I result in profound hearing loss associated with reduced survival, differentiation and maturation of auditory neurons. Nevertheless, little is known about the molecular basis of IGF-I activity in hearing and deafness. Methodology/Principal Findings: A combination of quantitative RT-PCR, subcellular fractionation and Western blotting, along with in situ hybridization studies, shows IGF-I and its high affinity receptor to be strongly expressed in the embryonic and postnatal mouse cochlea. The expression of both proteins decreases after birth and in the cochlea of E18.5 embryonic Igf1−/− null mice, the balance of the main IGF related signalling pathways is altered, with lower activation of Akt and ERK1/2 and stronger activation of p38 kinase. By comparing the Igf1−/− and Igf1+/+ transcriptomes in E18.5 mouse cochleae using RNA microchips and validating their results, we demonstrate the up-regulation of the FoxM1 transcription factor and the misexpression of the neural progenitor transcription factors Six6 and Mash1 associated with the loss of IGF-I. In parallel, in silico promoter analysis of the genes modulated in conjunction with the loss of IGF-I revealed the possible involvement of MEF2 in cochlear development. E18.5 Igf1+/+ mouse auditory ganglion neurons showed intense MEF2A and MEF2D nuclear staining and MEF2A was also evident in the organ of Corti. At P15, MEF2A and MEF2D expression were shown in neurons and sensory cells. In the absence of IGF-I, nuclear levels of MEF2 were diminished, indicating less transcriptional MEF2 activity. By contrast, there was an increase in the nuclear accumulation of FoxM1 and a corresponding decrease in the nuclear cyclin-dependent kinase inhibitor p27Kip1. Conclusions/Significance: We have defined the spatiotemporal expression of elements involved in IGF signalling during inner ear development and reveal novel regulatory mechanisms that are modulated by IGF-I in promoting sensory cell and neural survival and differentiation. These data will help us to understand the molecular bases of human sensorineural deafness associated with deficits in IGF-I.
|
42
|
Fujita A, Patriota AG, Sato JR, Miyano S. The impact of measurement errors in the identification of regulatory networks. BMC Bioinformatics 2009; 10:412. [PMID: 20003382 PMCID: PMC2811120 DOI: 10.1186/1471-2105-10-412] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 07/16/2009] [Accepted: 12/13/2009] [Indexed: 11/21/2022]
Abstract
BACKGROUND: Several studies in the literature describe measurement error in gene expression data, and several others describe regulatory network models. However, only a small fraction combines measurement error with mathematical regulatory networks and shows how to identify these networks under different rates of noise. RESULTS: This article investigates the effects of measurement error on the estimation of the parameters in regulatory networks. Simulation studies indicate that, in both time series (dependent) and non-time series (independent) data, the measurement error strongly affects the estimated parameters of the regulatory network models, biasing them as predicted by the theory. Moreover, when testing the parameters of the regulatory network models, p-values computed by ignoring the measurement error are not reliable, since the rate of false positives is not controlled under the null hypothesis. In order to overcome these problems, we present an improved version of the Ordinary Least Squares estimator for independent (regression models) and dependent (autoregressive models) data when the variables are subject to noise. Moreover, measurement error estimation procedures for microarrays are also described. Simulation results also show that both corrected methods perform better than the standard ones (i.e., ignoring measurement error). The proposed methodologies are illustrated using microarray data from lung cancer patients and mouse liver time series data. CONCLUSIONS: Measurement error dangerously affects the identification of regulatory network models; thus, it must be reduced or taken into account in order to avoid erroneous conclusions. This could be one of the reasons for the high biological false positive rates identified in actual regulatory network models.
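The bias described in this abstract can be illustrated with the textbook errors-in-variables case. The sketch below is not the authors' estimator. It assumes a single predictor, additive noise on that predictor with a known variance (`noise_var`), and the standard method-of-moments correction, which subtracts the noise variance from the observed predictor variance. All function names and simulation settings are illustrative assumptions.

```python
import random


def _moments(x, y):
    """Return the sample variance of x and the sample covariance of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return var_x, cov_xy


def naive_slope(x_obs, y):
    """Ordinary least squares slope that ignores measurement error in x_obs."""
    var_x, cov_xy = _moments(x_obs, y)
    return cov_xy / var_x


def corrected_slope(x_obs, y, noise_var):
    """Method-of-moments slope: remove the (assumed known) noise variance."""
    var_x, cov_xy = _moments(x_obs, y)
    signal_var = var_x - noise_var
    if signal_var <= 0:
        raise ValueError("noise_var is not smaller than the observed variance")
    return cov_xy / signal_var


def simulate(n=20000, true_slope=2.0, noise_var=1.0, seed=0):
    """Simulate y = true_slope * x_true + e, observing x_true with added noise."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_obs = [x + rng.gauss(0.0, noise_var ** 0.5) for x in x_true]
    y = [true_slope * x + rng.gauss(0.0, 0.5) for x in x_true]
    return x_obs, y


if __name__ == "__main__":
    x_obs, y = simulate()
    print("naive slope:    ", round(naive_slope(x_obs, y), 3))
    print("corrected slope:", round(corrected_slope(x_obs, y, 1.0), 3))
```

With equal signal and noise variance, as in this simulation, the naive slope is pulled toward roughly half of the true value, while the corrected slope stays close to it. In practice the noise variance would have to be estimated, for example from technical replicates.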
Affiliation(s)
- André Fujita
  - Computational Science Research Program, RIKEN, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Alexandre G Patriota
  - Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010 - São Paulo, 05508-090, Brazil
- João R Sato
  - Center of Mathematics, Computation and Cognition, Universidade Federal do ABC, Rua Santa Adélia, 166 - Santo André, 09210-170, Brazil
- Satoru Miyano
  - Computational Science Research Program, RIKEN, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
  - Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
|
43
|
Sontrop HMJ, Moerland PD, van den Ham R, Reinders MJT, Verhaegh WFJ. A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability. BMC Bioinformatics 2009; 10:389. [PMID: 19941644 PMCID: PMC2789744 DOI: 10.1186/1471-2105-10-389] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 07/30/2009] [Accepted: 11/26/2009] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear. RESULTS: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state of the art preprocessing methods, using a compendium consisting of eight different datasets, involving 1131 hybridizations, containing data from both one and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical.
CONCLUSION: Feature variability can have a strong impact on breast cancer signature composition, as well as the classification of individual patient samples. We therefore strongly recommend that feature variability is considered in analyzing data from microarray breast cancer expression profiling experiments.
|
44
|
D'Elia R, DeSchoolmeester ML, Zeef LAH, Wright SH, Pemberton AD, Else KJ. Expulsion of Trichuris muris is associated with increased expression of angiogenin 4 in the gut and increased acidity of mucins within the goblet cell. BMC Genomics 2009; 10:492. [PMID: 19852835 PMCID: PMC2774869 DOI: 10.1186/1471-2164-10-492] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 10/13/2008] [Accepted: 10/24/2009] [Indexed: 01/23/2023]
Abstract
Background: Trichuris muris in the mouse is an invaluable model for infection of man with the gastrointestinal nematode Trichuris trichiura. Three T. muris isolates have been studied, the Edinburgh (E), the Japan (J) and the Sobreda (S) isolates. The S isolate survives to chronicity within the C57BL/6 host whereas E and J are expelled prior to reaching fecundity. How the S isolate survives so successfully in its host is unclear. Results: Microarray analysis was used as a tool to identify genes whose expression could determine the differences in expulsion kinetics between the E and S T. muris isolates. Clear differences in gene expression profiles were evident as early as day 7 post-infection (p.i.). 43 probe sets associated with immune and defence responses were up-regulated in gut tissue from an E isolate-infected C57BL/6 mouse compared to tissue from an S isolate infection, including the message for the anti-microbial protein, angiogenin 4 (Ang4). This led to the identification of distinct differences in the goblet cell phenotype post-infection with the two isolates. Conclusion: Differences in gene expression levels identified between the S and E-infected mice early during infection have furthered our knowledge of how the S isolate persists for longer than the E isolate in the C57BL/6 mouse. Potential new targets for manipulation in order to aid expulsion have been identified. Further, we provide evidence for a potential new marker involving the acidity of the mucins within the goblet cell, which may predict outcome of infection within days of parasite exposure.
Affiliation(s)
- Riccardo D'Elia
- Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT, UK.
|
45
|
Laajala E, Aittokallio T, Lahesmaa R, Elo LL. Probe-level estimation improves the detection of differential splicing in Affymetrix exon array studies. Genome Biol 2009; 10:R77. [PMID: 19607685 PMCID: PMC2728531 DOI: 10.1186/gb-2009-10-7-r77] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 03/16/2009] [Revised: 06/05/2009] [Accepted: 07/16/2009] [Indexed: 12/22/2022]
Abstract
A novel statistical procedure is presented that uses probe-level information on Affymetrix exon arrays to detect differential splicing. The recent advent of exon microarrays has made it possible to reveal differences in alternative splicing events on a global scale. We introduce a novel statistical procedure that takes full advantage of the probe-level information on Affymetrix exon arrays when detecting differential splicing between sample groups. In comparison to existing ranking methods, the procedure shows superior reproducibility and accuracy in distinguishing true biological findings from background noise in high agreement with experimental validations.
Affiliation(s)
- Essi Laajala
- Turku Centre for Biotechnology, University of Turku and Åbo Akademi University, Turku, FI-20521, Finland
|
46
|
Pearson RD, Liu X, Sanguinetti G, Milo M, Lawrence ND, Rattray M. puma: a Bioconductor package for propagating uncertainty in microarray analysis. BMC Bioinformatics 2009; 10:211. [PMID: 19589155 PMCID: PMC2714555 DOI: 10.1186/1471-2105-10-211] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Received: 04/15/2009] [Accepted: 07/09/2009] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Most analyses of microarray data are based on point estimates of expression levels and ignore the uncertainty of such estimates. By determining uncertainties from Affymetrix GeneChip data and propagating these uncertainties to downstream analyses it has been shown that we can improve results of differential expression detection, principal component analysis and clustering. Previously, implementations of these uncertainty propagation methods have only been available as separate packages, written in different languages. Previous implementations have also suffered from being very costly to compute, and in the case of differential expression detection, have been limited in the experimental designs to which they can be applied. RESULTS: puma is a Bioconductor package incorporating a suite of analysis methods for use on Affymetrix GeneChip data. puma extends the differential expression detection methods of previous work from the 2-class case to the multi-factorial case. puma can be used to automatically create design and contrast matrices for typical experimental designs, which can be used both within the package itself but also in other Bioconductor packages. The implementation of differential expression detection methods has been parallelised leading to significant decreases in processing time on a range of computer architectures. puma incorporates the first R implementation of an uncertainty propagation version of principal component analysis, and an implementation of a clustering method based on uncertainty propagation. All of these techniques are brought together in a single, easy-to-use package with clear, task-based documentation. CONCLUSION: For the first time, the puma package makes a suite of uncertainty propagation methods available to a general audience. These methods can be used to improve results from more traditional analyses of microarray data. puma also offers improvements in terms of scope and speed of execution over previously available methods. puma is recommended for anyone working with the Affymetrix GeneChip platform for gene expression analysis and can also be applied more generally.
Affiliation(s)
- Richard D Pearson
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, OX3 7BN, UK
- Xuejun Liu
  - College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Yudao Street, Nanjing 210016, PR China
- Guido Sanguinetti
  - Department of Computer Science, University of Sheffield, Regent Court 211 Portobello Street, Sheffield, S1 4DP, UK
  - ChELSI Institute, Department of Chemical and Process Engineering, University of Sheffield, Mappin Street, Sheffield, S1 3JD, UK
- Marta Milo
  - NIHR Cardiovascular Biomedical Research Unit, Sheffield Teaching Hospitals NHS Trust, Beech Hill Road, Sheffield, S10 2RX, UK
- Neil D Lawrence
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
- Magnus Rattray
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
|
47
|
Morris DG, Waters SM, McCarthy SD, Patton J, Earley B, Fitzpatrick R, Murphy JJ, Diskin MG, Kenny DA, Brass A, Wathes DC. Pleiotropic effects of negative energy balance in the postpartum dairy cow on splenic gene expression: repercussions for innate and adaptive immunity. Physiol Genomics 2009; 39:28-37. [PMID: 19567785 PMCID: PMC2747343 DOI: 10.1152/physiolgenomics.90394.2008] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Abstract
Increased energy demands to support lactation, coupled with lowered feed intake capacity, result in negative energy balance (NEB), which is typically characterized by extensive mobilization of body energy reserves in the early postpartum dairy cow. The catabolism of stored lipid leads to an increase in the systemic concentrations of nonesterified fatty acids (NEFA) and β-hydroxy butyrate (BHB). Oxidation of NEFA in the liver results in the increased production of reactive oxygen species and the onset of oxidative stress and can lead to disruption of normal metabolism and physiology. The immune system is depressed in the peripartum period and early lactation, and dairy cows are therefore more vulnerable to bacterial infections causing mastitis and/or endometritis at this time. A bovine Affymetrix oligonucleotide array was used to determine global gene expression in the spleen of dairy cows in the early postpartum period. Spleen tissue was removed post mortem from five severe NEB (SNEB) and five medium NEB (MNEB) cows 15 days postpartum. SNEB increased systemic concentrations of NEFA and BHB, and white blood cell and lymphocyte numbers were decreased in SNEB animals. A total of 545 genes were altered by SNEB. Network analysis using Ingenuity Pathway Analysis revealed that SNEB was associated with NRF2-mediated oxidative stress, mitochondrial dysfunction, endoplasmic reticulum stress, natural killer cell signaling, p53 signaling, downregulation of IL-15, BCL-2, and IFN-γ; upregulation of BAX and CHOP and increased apoptosis with a potential negative impact on innate and adaptive immunity.
Affiliation(s)
- D G Morris
- Teagasc, Mellows Campus, Athenry, County Galway, Ireland.
|
48
|
Packham IM, Gray C, Heath PR, Hellewell PG, Ingham PW, Crossman DC, Milo M, Chico TJA. Microarray profiling reveals CXCR4a is downregulated by blood flow in vivo and mediates collateral formation in zebrafish embryos. Physiol Genomics 2009; 38:319-27. [PMID: 19509081 DOI: 10.1152/physiolgenomics.00049.2009] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022]
Abstract
The response to hemodynamic force is implicated in a number of pathologies including collateral vessel development. However, the transcriptional effect of hemodynamic force is extremely challenging to examine in vivo in mammals without also detecting confounding processes such as hypoxia and ischemia. We therefore serially examined the transcriptional effect of preventing cardiac contraction in zebrafish embryos, which can be deprived of circulation without experiencing hypoxia since they obtain sufficient oxygenation by diffusion. Morpholino antisense knock-down of cardiac troponin T2 (tnnt2) prevented cardiac contraction without affecting vascular development. Gene expression in whole embryo RNA from tnnt2 or control morphants at 36, 48, and 60 h postfertilization (hpf) was assessed using Affymetrix GeneChip Zebrafish Genome Arrays (>14,900 transcripts). We identified 308 differentially expressed genes between tnnt2 and control morphants. One such gene (CXCR4a) was significantly more highly expressed in tnnt2 morphants at 48 and 60 hpf than in controls. In situ hybridization localized CXCR4a upregulation to endothelium of both tnnt2 morphants and gridlock mutants (which have an occluded aorta preventing distal blood flow). This upregulation appears to be of functional significance as either CXCR4a knock-down or pharmacologic inhibition impaired the ability of gridlock mutants to recover blood flow via collateral vessels. We conclude that the absence of hemodynamic force induces endothelial CXCR4a upregulation that promotes recovery of blood flow.
Affiliation(s)
- Ian M Packham
- Medical Research Council Centre for Developmental and Biomedical Genetics, United Kingdom
|
49
|
Kadota K, Nakai Y, Shimizu K. Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity. Algorithms Mol Biol 2009; 4:7. [PMID: 19386098 PMCID: PMC2679019 DOI: 10.1186/1748-7188-4-7] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Received: 11/14/2008] [Accepted: 04/22/2009] [Indexed: 12/20/2022]
Abstract
Background: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility. Results: We compared eight conventional methods for ranking genes: weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT) with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets was evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure for both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: The percentages of overlapping genes (POGs) across different sites for the FC-based methods were higher overall than those for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods irrespective of the choice of preprocessing algorithm.
Conclusion: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
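The reproducibility measure mentioned in this abstract, the percentage of overlapping genes (POG) between ranked lists, can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: it assumes genes are ranked by the absolute difference of mean log2 expression between two conditions (a simple log fold change; the paper's exact definitions of FC, AD and WAD may differ), and that POG is the share of genes common to the top `top_k` of two rankings. The function names and the toy values in the usage note are hypothetical.

```python
def rank_by_fold_change(log_expr_a, log_expr_b):
    """Rank genes by absolute difference of mean log2 expression (largest first).

    Each argument maps a gene name to a list of log2 expression values
    for one condition. Ties are broken alphabetically by gene name.
    """
    scores = {}
    for gene, values_a in log_expr_a.items():
        values_b = log_expr_b[gene]
        mean_a = sum(values_a) / len(values_a)
        mean_b = sum(values_b) / len(values_b)
        scores[gene] = abs(mean_b - mean_a)
    return sorted(scores, key=lambda g: (-scores[g], g))


def percent_overlapping_genes(ranking_1, ranking_2, top_k):
    """Percentage of genes shared by the top_k entries of two rankings."""
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    shared = set(ranking_1[:top_k]) & set(ranking_2[:top_k])
    return 100.0 * len(shared) / top_k
```

For example, if one site ranks four genes as `["g1", "g3", "g4", "g2"]` and another as `["g1", "g4", "g3", "g2"]`, the top-2 lists share only `g1`, so `percent_overlapping_genes(..., top_k=2)` gives 50.0, while the top-3 lists overlap completely.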
|
50
|
Geeleher P, Morris D, Hinde JP, Golden A. BioconductorBuntu: a Linux distribution that implements a web-based DNA microarray analysis server. Bioinformatics 2009; 25:1438-9. [DOI: 10.1093/bioinformatics/btp165] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
|