1
Zhou B, Guo Y, Xue Y, Ji X, Huang Y. Comprehensive insights into the mechanism of keratin degradation and exploitation of keratinase to enhance the bioaccessibility of soybean protein. Biotechnology for Biofuels and Bioproducts 2023; 16:177. [PMID: 37978558 PMCID: PMC10655438 DOI: 10.1186/s13068-023-02426-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023]
Abstract
Keratin is a recalcitrant protein, yet it is decomposed in nature. However, the mechanism of keratin degradation is still not well understood. In this study, Bacillus sp. 8A6 completely degraded feathers in 20 h, which makes it an efficient keratin degrader relative to those reported so far. Comprehensive transcriptome analysis continuously tracked the metabolism of Bacillus sp. 8A6 throughout its growth in feather medium. It reveals for the first time how the strain acquires nutrients and energy for proliferation in an oligotrophic feather medium at the early stage. Degradation of the outer lipid layer of the feather then exposes the internal keratin structure, whose disulfide bonds are reduced by sulfite from the newly identified sulfite metabolic pathway, by disulfide reductases and by iron uptake. The weakened keratin is then proposed to be de-assembled by the S9 protease and hydrolyzed by the synergistic action of endo-, exo- and oligo-proteases from the S1, S8, M3, M14, M20, M24, M42, M84 and T3 families. Finally, bioaccessible peptides and amino acids are generated and transported for strain growth. The keratinase was applied to soybean hydrolysis, generating 2234 peptides and 559.93 mg/L of 17 amino acids. Therefore, keratinases induced by poultry waste have great potential for producing bioaccessible peptides and amino acids for the feed industry.
Affiliation(s)
- Beiya Zhou
- College of Mathematical Sciences, Bohai University, Jinzhou, 121013, Liaoning, China
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
- Huizhou Institute of Green Energy and Advanced Materials, Huizhou, 516000, Guangdong, China
| | - Yandong Guo
- College of Mathematical Sciences, Bohai University, Jinzhou, 121013, Liaoning, China.
| | - Yaju Xue
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
| | - Xiuling Ji
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
| | - Yuhong Huang
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China.
2
Fenn A, Tsoy O, Faro T, Rößler FM, Dietrich A, Kersting J, Louadi Z, Lio CT, Völker U, Baumbach J, Kacprowski T, List M. Alternative splicing analysis benchmark with DICAST. NAR Genom Bioinform 2023; 5:lqad044. [PMID: 37260511 PMCID: PMC10227362 DOI: 10.1093/nargab/lqad044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Revised: 04/13/2023] [Accepted: 05/05/2023] [Indexed: 06/02/2023]
Abstract
Alternative splicing is a major contributor to transcriptome and proteome diversity in health and disease. A plethora of tools have been developed for studying alternative splicing in RNA-seq data. Previous benchmarks focused on isoform quantification and mapping. They neglected event detection tools, which arguably provide the most detailed insights into the alternative splicing process. DICAST offers a modular and extensible framework for analysing alternative splicing integrating eleven splice-aware mapping and eight event detection tools. We benchmark all tools extensively on simulated as well as whole blood RNA-seq data. STAR and HISAT2 demonstrated the best balance between performance and run time. The performance of event detection tools varies widely with no tool outperforming all others. DICAST allows researchers to employ a consensus approach to consider the most successful tools jointly for robust event detection. Furthermore, we propose the first reporting standard to unify existing formats and to guide future tool development.
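The consensus idea mentioned in the abstract can be illustrated with a simple vote across tools. This is only a minimal sketch: the abstract does not say how DICAST combines tool outputs, so the `min_tools` threshold, the tool names and the event tuples below are all hypothetical.

```python
from collections import Counter


def consensus_events(calls_by_tool, min_tools=2):
    """Return the events reported by at least `min_tools` tools.

    calls_by_tool maps a tool name to the list of events it reported.
    """
    counts = Counter()
    for events in calls_by_tool.values():
        counts.update(set(events))  # each tool votes at most once per event
    return {event for event, n in counts.items() if n >= min_tools}


# Hypothetical event identifiers: (event type, chromosome, start, end)
calls = {
    "tool_a": [("ES", "chr1", 100, 200), ("IR", "chr2", 50, 80)],
    "tool_b": [("ES", "chr1", 100, 200)],
    "tool_c": [("ES", "chr1", 100, 200), ("IR", "chr2", 50, 80), ("A3", "chr3", 10, 20)],
}
robust = consensus_events(calls, min_tools=2)
```

Raising `min_tools` trades sensitivity for precision; requiring all three hypothetical tools here would keep only the exon-skipping event.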
Affiliation(s)
| | | | - Tim Faro
- Chair of Experimental Bioinformatics, Technical University of Munich, 85354 Freising, Germany
| | - Fanny L M Rößler
- Chair of Experimental Bioinformatics, Technical University of Munich, 85354 Freising, Germany
| | - Alexander Dietrich
- Chair of Experimental Bioinformatics, Technical University of Munich, 85354 Freising, Germany
| | - Johannes Kersting
- Chair of Experimental Bioinformatics, Technical University of Munich, 85354 Freising, Germany
| | - Zakaria Louadi
- Chair of Experimental Bioinformatics, Technical University of Munich, 85354 Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607 Hamburg, Germany
| | - Chit Tong Lio
- Chair of Experimental Bioinformatics, Technical University of Munich, 85354 Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607 Hamburg, Germany
| | - Uwe Völker
- Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Felix-Hausdorff-Straße 8, D-17475 Greifswald, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Greifswald, Greifswald, Germany
| | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607 Hamburg, Germany
- Institute of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5000 Odense, Denmark
| | | | - Markus List
- To whom correspondence should be addressed. Tel: +49 8161 71 2761;
3
Miller B, Kim SJ, Mehta HH, Cao K, Kumagai H, Thumaty N, Leelaprachakul N, Braniff RG, Jiao H, Vaughan J, Diedrich J, Saghatelian A, Arpawong TE, Crimmins EM, Ertekin-Taner N, Tubi MA, Hare ET, Braskie MN, Décarie-Spain L, Kanoski SE, Grodstein F, Bennett DA, Zhao L, Toga AW, Wan J, Yen K, Cohen P. Mitochondrial DNA variation in Alzheimer's disease reveals a unique microprotein called SHMOOSE. Mol Psychiatry 2023; 28:1813-1826. [PMID: 36127429 PMCID: PMC10027624 DOI: 10.1038/s41380-022-01769-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 12/05/2021] [Revised: 08/15/2022] [Accepted: 08/26/2022] [Indexed: 01/22/2023]
Abstract
Mitochondrial DNA variants have previously been associated with disease, but the underlying mechanisms have been largely elusive. Here, we report that the mitochondrial SNP rs2853499 is associated with Alzheimer's disease (AD), neuroimaging, and transcriptomics. We mapped rs2853499 to a novel mitochondrial small open reading frame called SHMOOSE with microprotein-encoding potential. Indeed, we detected two unique SHMOOSE-derived peptide fragments in mitochondria by using mass spectrometry, the first unique mass spectrometry-based detection of a mitochondrial-encoded microprotein to date. Furthermore, cerebrospinal fluid (CSF) SHMOOSE levels in humans correlated with age, CSF tau, and brain white matter volume. We followed up on these genetic and biochemical findings by carrying out a series of functional experiments. SHMOOSE acted on the brain following intracerebroventricular administration, differentiated mitochondrial gene expression in multiple models, localized to mitochondria, bound the inner mitochondrial membrane protein mitofilin, and boosted mitochondrial oxygen consumption. Altogether, SHMOOSE has vast implications for the fields of neurobiology, Alzheimer's disease, and microproteins.
Affiliation(s)
- Brendan Miller
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
- Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA
| | - Su-Jeong Kim
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Hemal H Mehta
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Kevin Cao
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Hiroshi Kumagai
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Neehar Thumaty
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Naphada Leelaprachakul
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Regina Gonzalez Braniff
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Henry Jiao
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Joan Vaughan
- Clayton Foundation Laboratories for Peptide Biology, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Jolene Diedrich
- Clayton Foundation Laboratories for Peptide Biology, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Alan Saghatelian
- Clayton Foundation Laboratories for Peptide Biology, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Thalida E Arpawong
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Eileen M Crimmins
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | | | - Meral A Tubi
- Imaging Genetics Center, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA
- Department of Neurology, University of Southern California, Los Angeles, CA, USA
| | - Evan T Hare
- Imaging Genetics Center, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA
- Department of Neurology, University of Southern California, Los Angeles, CA, USA
| | - Meredith N Braskie
- Imaging Genetics Center, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA
- Department of Neurology, University of Southern California, Los Angeles, CA, USA
| | - Léa Décarie-Spain
- Human and Evolutionary Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, CA, USA
| | - Scott E Kanoski
- Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA
- Human and Evolutionary Biology Section, Department of Biological Sciences, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, CA, USA
| | - Francine Grodstein
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - Lu Zhao
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of University of Southern California, Los Angeles, CA, USA
| | - Arthur W Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of University of Southern California, Los Angeles, CA, USA
| | - Junxiang Wan
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Kelvin Yen
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
| | - Pinchas Cohen
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA.
4
Gallardo VJ, Gómez-Galván JB, Asskour L, Torres-Ferrús M, Alpuente A, Caronna E, Pozo-Rosich P. A study of differential microRNA expression profile in migraine: the microMIG exploratory study. J Headache Pain 2023; 24:11. [PMID: 36797674 PMCID: PMC9936672 DOI: 10.1186/s10194-023-01542-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 01/27/2023] [Indexed: 02/18/2023]
Abstract
BACKGROUND Several studies have described potential microRNA (miRNA) biomarkers associated with migraine, but studies are scarcely reproducible primarily due to the heterogeneous variability of participants. Increasing evidence shows that disease-related intrinsic factors together with lifestyle (environmental factors), influence epigenetic mechanisms and in turn, diseases. Hence, the main objective of this exploratory study was to find differentially expressed miRNAs (DE miRNA) in peripheral blood mononuclear cells (PBMC) of patients with migraine compared to healthy controls in a well-controlled homogeneous cohort of non-menopausal women. METHODS Patients diagnosed with migraine according to the International Classification of Headache Disorders (ICHD-3) and healthy controls without familial history of headache disorders were recruited. All participants completed a very thorough questionnaire and structured-interview in order to control for environmental factors. RNA was extracted from PBMC and a microarray system (GeneChip miRNA 4.1 Array chip, Affymetrix) was used to determine the miRNA profiles between study groups. Principal components analysis and hierarchical clustering analysis were performed to study samples distribution and random forest (RF) algorithms were computed for the classification task. To evaluate the stability of the results and the prediction error rate, a bootstrap (.632 + rule) was run through all the procedure. Finally, a functional enrichment analysis of selected targets was computed through protein-protein interaction networks. RESULTS After RF classification, three DE miRNA distinguished study groups in a very homogeneous female cohort, controlled by factors such as demographics (age and BMI), life-habits (physical activity, caffeine and alcohol consumptions), comorbidities and clinical features associated to the disease: miR-342-3p, miR-532-3p and miR-758-5p. 
Sixty-eight target genes were predicted which were linked mainly to enriched ion channels and signaling pathways, neurotransmitter and hormone homeostasis, infectious diseases and circadian entrainment. CONCLUSIONS A 3-miRNA (miR-342-3p, miR-532-3p and miR-758-5p) novel signature has been found differentially expressed between controls and patients with migraine. Enrichment analysis showed that these pathways are closely associated with known migraine pathophysiology, which could lead to the first reliable epigenetic biomarker set. Further studies should be performed to validate these findings in a larger and more heterogeneous sample.
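The ".632+ rule" named in the methods combines an optimistic training-set error with a pessimistic out-of-bag error. The sketch below follows the commonly cited Efron and Tibshirani form of the estimator. It is not the study's own code, and the function and argument names are illustrative assumptions.

```python
def bootstrap_632_plus(err_resub, err_oob, gamma):
    """Combine resubstitution and out-of-bag error rates with the .632+ rule.

    err_resub: error rate when predicting the training data itself
    err_oob:   mean error rate on samples left out of each bootstrap draw
    gamma:     no-information error rate (expected error if labels were
               unrelated to the predictors)
    """
    err_oob = min(err_oob, gamma)  # cap the out-of-bag error at gamma
    if err_oob > err_resub and gamma > err_resub:
        # Relative overfitting rate, between 0 (none) and 1 (maximal)
        overfit = (err_oob - err_resub) / (gamma - err_resub)
    else:
        overfit = 0.0
    weight = 0.632 / (1.0 - 0.368 * overfit)
    return (1.0 - weight) * err_resub + weight * err_oob


# Hypothetical error rates for a two-class classifier
estimate = bootstrap_632_plus(err_resub=0.1, err_oob=0.3, gamma=0.5)
```

With no overfitting the weight stays at 0.632 (the plain .632 estimator); as overfitting grows, the weight moves toward 1 and the estimate approaches the out-of-bag error.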
Affiliation(s)
- V. J. Gallardo
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287)
| | - J. B. Gómez-Galván
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287)
| | - L. Asskour
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287)
| | - M. Torres-Ferrús
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287); Neurology Department, Headache Unit, Vall d’Hebron University Hospital, Barcelona, Spain (grid.411083.f; ISNI 0000 0001 0675 8654)
| | - A. Alpuente
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287); Neurology Department, Headache Unit, Vall d’Hebron University Hospital, Barcelona, Spain (grid.411083.f; ISNI 0000 0001 0675 8654)
| | - E. Caronna
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287); Neurology Department, Headache Unit, Vall d’Hebron University Hospital, Barcelona, Spain (grid.411083.f; ISNI 0000 0001 0675 8654)
| | - P. Pozo-Rosich
- Headache and Neurological Pain Research Group, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain (grid.430994.3; ISNI 0000 0004 1763 0287); Neurology Department, Headache Unit, Vall d’Hebron University Hospital, Barcelona, Spain (grid.411083.f; ISNI 0000 0001 0675 8654)
5
Isgut M, Gloster L, Choi K, Venugopalan J, Wang MD. Systematic Review of Advanced AI Methods for Improving Healthcare Data Quality in Post COVID-19 Era. IEEE Rev Biomed Eng 2023; 16:53-69. [PMID: 36269930 DOI: 10.1109/rbme.2022.3216531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
At the beginning of the COVID-19 pandemic, there was significant hype about the potential impact of artificial intelligence (AI) tools in combatting COVID-19 through diagnosis, prognosis, or surveillance. However, AI tools have not yet been widely successful. One key reason is that the COVID-19 pandemic demanded faster real-time development of AI-driven clinical and health support tools, including rapid data collection, algorithm development, validation, and deployment, which left too little time for proper data quality control. Learning from the hard lessons of COVID-19, we summarize the important health data quality challenges during the COVID-19 pandemic, such as lack of data standardization, missing data, tabulation errors, and noise and artifacts. Then we conduct a systematic investigation of computational methods that address these issues, including emerging advanced AI data quality control methods that achieve better data quality outcomes and, in some cases, simplify or automate the data cleaning process. We hope this article can help the healthcare community improve health data quality going forward with novel AI development.
6
Sarantopoulou D, Brooks TG, Nayak S, Mrčela A, Lahens NF, Grant GR. Comparative evaluation of full-length isoform quantification from RNA-Seq. BMC Bioinformatics 2021; 22:266. [PMID: 34034652 PMCID: PMC8145802 DOI: 10.1186/s12859-021-04198-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/29/2020] [Accepted: 05/16/2021] [Indexed: 11/18/2022]
Abstract
Background Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses and has been an area of active development since the beginning. The fundamental difficulty stems from the fact that RNA transcripts are long, while RNA-Seq reads are short. Results Here we use simulated benchmarking data that reflects many properties of real data, including polymorphisms, intron signal and non-uniform coverage, allowing for systematic comparative analyses of isoform quantification accuracy and its impact on differential expression analysis. Genome, transcriptome and pseudo alignment-based methods are included; and a simple approach is included as a baseline control. Conclusions Salmon, kallisto, RSEM, and Cufflinks exhibit the highest accuracy on idealized data, while on more realistic data they do not perform dramatically better than the simple approach. We determine the structural parameters with the greatest impact on quantification accuracy to be length and sequence compression complexity and not so much the number of isoforms. The effect of incomplete annotation on performance is also investigated. Overall, the tested methods show sufficient divergence from the truth to suggest that full-length isoform quantification and isoform level DE should still be employed selectively. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04198-1.
Affiliation(s)
- Dimitra Sarantopoulou
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA; National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Thomas G Brooks
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Soumyashant Nayak
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Antonijo Mrčela
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Nicholas F Lahens
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Gregory R Grant
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA; Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
7
Naraine R, Abaffy P, Sidova M, Tomankova S, Pocherniaieva K, Smolik O, Kubista M, Psenicka M, Sindelka R. NormQ: RNASeq normalization based on RT-qPCR derived size factors. Comput Struct Biotechnol J 2020; 18:1173-1181. [PMID: 32514328 PMCID: PMC7264052 DOI: 10.1016/j.csbj.2020.05.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2019] [Revised: 05/07/2020] [Accepted: 05/07/2020] [Indexed: 02/04/2023]
Abstract
The merit of RNASeq data relies heavily on correct normalization. However, most methods assume that the majority of transcripts show no differential expression between conditions. This assumption may not always be correct, especially when one condition results in overexpression. We present a new method (NormQ) to normalize the RNASeq library size, using the relative proportion observed from RT-qPCR of selected marker genes. The method was compared against the popular median-of-ratios method, using simulated and real datasets. NormQ produced more matches to differentially expressed genes in the simulated dataset and more distribution profile matches for both simulated and real datasets.
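For context, the median-of-ratios method used as the comparator can be sketched in a few lines. This is a minimal standard-library illustration of the general technique, not the NormQ implementation; the function name and the toy count matrix are hypothetical.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Estimate one size factor per sample.

    counts: list of genes, each a list of raw read counts per sample.
    """
    n_samples = len(counts[0])
    # Keep only genes with nonzero counts in every sample so logs are defined
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log geometric mean across samples acts as a pseudo-reference
    log_ref = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - ref for row, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data: the second sample was sequenced exactly twice as deeply
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 5],  # dropped: zero count in the first sample
]
size_factors = median_of_ratios_size_factors(counts)
```

Taking the median is what encodes the assumption questioned in the abstract: it is only a sound estimate of sequencing depth if most genes are not differentially expressed.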
Affiliation(s)
- Ravindra Naraine
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
| | - Pavel Abaffy
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
| | - Monika Sidova
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
| | - Silvie Tomankova
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
| | - Kseniia Pocherniaieva
- University of South Bohemia in Ceske Budejovice, Faculty of Fisheries and Protection of Waters, South Bohemian Research Center of Aquaculture and Biodiversity of Hydrocenoses, Research Institute of Fish Culture and Hydrobiology, Vodnany, Czech Republic
| | - Ondrej Smolik
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Mikael Kubista
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
| | - Martin Psenicka
- University of South Bohemia in Ceske Budejovice, Faculty of Fisheries and Protection of Waters, South Bohemian Research Center of Aquaculture and Biodiversity of Hydrocenoses, Research Institute of Fish Culture and Hydrobiology, Vodnany, Czech Republic
| | - Radek Sindelka
- Laboratory of Gene Expression, Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Prumyslova 595, Vestec 252 50, Czech Republic
8
Kubota A, Kawai YK, Yamashita N, Lee JS, Kondoh D, Zhang S, Nishi Y, Suzuki K, Kitazawa T, Teraoka H. Transcriptional profiling of cytochrome P450 genes in the liver of adult zebrafish, Danio rerio. J Toxicol Sci 2019; 44:347-356. [PMID: 31068540 DOI: 10.2131/jts.44.347] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/02/2022]
Abstract
Increasing use of zebrafish in biomedical, toxicological and developmental studies requires explicit knowledge of cytochrome P450 (CYP), given the central role of CYP in oxidative biotransformation of xenobiotics and many regulatory molecules. A full complement of CYP genes in zebrafish and their transcript expression during early development have already been examined. Here we established a comprehensive picture of CYP gene expression in the adult zebrafish liver using an RNA-seq technique. Transcriptional profiling of a full complement of CYP genes revealed that CYP2AD2, CYP3A65, CYP1A, CYP2P9 and CYP2Y3 are major CYP genes expressed in the adult zebrafish liver in both sexes. Quantitative real-time RT-PCR analysis for selected CYP genes further supported our RNA-seq data. There were significant sex differences in the transcript levels for CYP1A, CYP1B1, CYP1D1 and CYP2N13, with males having higher expression levels than females in all cases. A similar pattern of sex-specific expression was observed for CYP2AD2 and CYP2P9, suggesting sex-specific regulation of constitutive expression of some CYP genes in the adult zebrafish liver. The present study revealed several "orphan" CYP genes as dominant isozymes at the transcript level in the adult zebrafish liver, implying crucial roles of these CYP genes in liver physiology and drug metabolism. The current results establish a foundation for studies with zebrafish in drug discovery and toxicology.
Affiliation(s)
- Akira Kubota
- Laboratory of Toxicology, Department of Veterinary Medicine, Obihiro University of Agriculture and Veterinary Medicine
| | - Yusuke K Kawai
- Laboratory of Toxicology, Department of Veterinary Medicine, Obihiro University of Agriculture and Veterinary Medicine
| | - Natsumi Yamashita
- Laboratory of Veterinary Pharmacology, School of Veterinary Medicine, Rakuno Gakuen University
| | - Jae Seung Lee
- Laboratory of Toxicology, Department of Veterinary Medicine, Obihiro University of Agriculture and Veterinary Medicine
| | - Daisuke Kondoh
- Laboratory of Veterinary Anatomy, Department of Veterinary Medicine, Obihiro University of Agriculture and Veterinary Medicine
| | - Shuangyi Zhang
- Laboratory of Veterinary Pharmacology, School of Veterinary Medicine, Rakuno Gakuen University
| | - Yasunobu Nishi
- Department of Large Animal Clinical Sciences, School of Veterinary Medicine, Rakuno Gakuen University
| | - Kazuyuki Suzuki
- Department of Large Animal Clinical Sciences, School of Veterinary Medicine, Rakuno Gakuen University
| | - Takio Kitazawa
- Laboratory of Veterinary Pharmacology, School of Veterinary Medicine, Rakuno Gakuen University
| | - Hiroki Teraoka
- Laboratory of Veterinary Pharmacology, School of Veterinary Medicine, Rakuno Gakuen University
9
Rao MS, Van Vleet TR, Ciurlionis R, Buck WR, Mittelstadt SW, Blomme EAG, Liguori MJ. Comparison of RNA-Seq and Microarray Gene Expression Platforms for the Toxicogenomic Evaluation of Liver From Short-Term Rat Toxicity Studies. Front Genet 2019; 9:636. [PMID: 30723492 PMCID: PMC6349826 DOI: 10.3389/fgene.2018.00636] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Received: 09/15/2018] [Accepted: 11/27/2018] [Indexed: 12/12/2022]
Abstract
Gene expression profiling is a useful tool to predict and interrogate mechanisms of toxicity. RNA-Seq technology has emerged as an attractive alternative to traditional microarray platforms for conducting transcriptional profiling. The objective of this work was to compare both transcriptomic platforms to determine whether RNA-Seq offered significant advantages over microarrays for toxicogenomic studies. RNA samples from the livers of rats treated for 5 days with five tool hepatotoxicants (α-naphthylisothiocyanate/ANIT, carbon tetrachloride/CCl4, methylenedianiline/MDA, acetaminophen/APAP, and diclofenac/DCLF) were analyzed with both gene expression platforms (RNA-Seq and microarray). Data were compared to determine any potential added scientific (i.e., better biological or toxicological insight) value offered by RNA-Seq compared to microarrays. RNA-Seq identified more differentially expressed protein-coding genes and provided a wider quantitative range of expression level changes when compared to microarrays. Both platforms identified a larger number of differentially expressed genes (DEGs) in livers of rats treated with ANIT, MDA, and CCl4 compared to APAP and DCLF, in agreement with the severity of histopathological findings. Approximately 78% of DEGs identified with microarrays overlapped with RNA-Seq data, with a Spearman’s correlation of 0.7 to 0.83. Consistent with the mechanisms of toxicity of ANIT, APAP, MDA and CCl4, both platforms identified dysregulation of liver relevant pathways such as Nrf2, cholesterol biosynthesis, eiF2, hepatic cholestasis, glutathione and LPS/IL-1 mediated RXR inhibition. RNA-Seq data showed additional DEGs that not only significantly enriched these pathways, but also suggested modulation of additional liver relevant pathways. In addition, RNA-Seq enabled the identification of non-coding DEGs that offer a potential for improved mechanistic clarity. 
Overall, these results indicate that RNA-Seq is an acceptable alternative platform to microarrays for rat toxicogenomic studies with several advantages. Because of its wider dynamic range as well as its ability to identify a larger number of DEGs, RNA-Seq may generate more insight into mechanisms of toxicity. However, more extensive reference data will be necessary to fully leverage these additional RNA-Seq data, especially for non-coding sequences.
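The two cross-platform summary statistics quoted in the abstract, the share of microarray DEGs also found by RNA-Seq and the Spearman correlation, can be computed as sketched below. The gene names and fold-change values are hypothetical, and the code is a standard-library illustration rather than the authors' pipeline.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def overlap_fraction(deg_a, deg_b):
    """Fraction of the genes in deg_a that also appear in deg_b."""
    deg_a = set(deg_a)
    return len(deg_a & set(deg_b)) / len(deg_a)


# Hypothetical DEG lists and log2 fold changes for the shared genes
microarray_degs = ["g1", "g2", "g3", "g4"]
rnaseq_degs = ["g1", "g2", "g3", "g5", "g6"]
shared_fraction = overlap_fraction(microarray_degs, rnaseq_degs)
rho = spearman([1.0, 2.0, 3.0, 4.0], [1.5, 1.2, 2.9, 4.1])
```

Note that the overlap is asymmetric: it is expressed relative to the microarray list, matching the abstract's phrasing, and would differ if computed relative to the larger RNA-Seq list.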
Affiliation(s)
- Mohan S Rao
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| | - Terry R Van Vleet
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| | - Rita Ciurlionis
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| | - Wayne R Buck
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| | - Scott W Mittelstadt
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| | - Eric A G Blomme
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| | - Michael J Liguori
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
| |
Collapse
|
10
Gardner M, Dhroso A, Johnson N, Davis EL, Baum TJ, Korkin D, Mitchum MG. Novel global effector mining from the transcriptome of early life stages of the soybean cyst nematode Heterodera glycines. Sci Rep 2018; 8:2505. [PMID: 29410430 PMCID: PMC5802810 DOI: 10.1038/s41598-018-20536-5] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 11/04/2017] [Accepted: 01/12/2018] [Indexed: 11/08/2022]
Abstract
Soybean cyst nematode (SCN) Heterodera glycines is an obligate parasite that relies on the secretion of effector proteins to manipulate host cellular processes that favor the formation of a feeding site within host roots to ensure its survival. The sequence complexity and co-evolutionary forces acting upon these effectors remain unknown. Here we generated a de novo transcriptome assembly representing the early life stages of SCN in both a compatible and an incompatible host interaction to facilitate global effector mining efforts in the absence of an available annotated SCN genome. We then employed a dual effector prediction strategy coupling a newly developed nematode effector prediction tool, N-Preffector, with a traditional secreted protein prediction pipeline to uncover a suite of novel effector candidates. Our analysis distinguished between effectors that co-evolve with the host genotype and those conserved by the pathogen to maintain a core function in parasitism and demonstrated that alternative splicing is one mechanism used to diversify the effector pool. In addition, we confirmed the presence of viral and microbial inhabitants with molecular sequence information. This transcriptome represents the most comprehensive whole-nematode sequence currently available for SCN and can be used as a tool for annotation of expected genome assemblies.
Affiliation(s)
- Michael Gardner
  - Division of Plant Sciences and Bond Life Sciences Center, University of Missouri, Columbia, USA
- Andi Dhroso
  - Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, USA
- Nathan Johnson
  - Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, USA
- Eric L Davis
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, USA
- Thomas J Baum
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, USA
- Dmitry Korkin
  - Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, USA
- Melissa G Mitchum
  - Division of Plant Sciences and Bond Life Sciences Center, University of Missouri, Columbia, USA
11
Domenger C, Allais M, François V, Léger A, Lecomte E, Montus M, Servais L, Voit T, Moullier P, Audic Y, Le Guiner C. RNA-Seq Analysis of an Antisense Sequence Optimized for Exon Skipping in Duchenne Patients Reveals No Off-Target Effect. Molecular Therapy-Nucleic Acids 2017; 10:277-291. [PMID: 29499940 PMCID: PMC5785776 DOI: 10.1016/j.omtn.2017.12.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/11/2017] [Revised: 12/16/2017] [Accepted: 12/16/2017] [Indexed: 01/16/2023]
Abstract
Non-coding uridine-rich small nuclear RNAs (UsnRNAs) have emerged in recent years as effective tools for exon skipping for the treatment of Duchenne muscular dystrophy (DMD), a degenerative muscular genetic disorder. We recently showed the high capacity of a recombinant adeno-associated virus (rAAV)-U7snRNA vector to restore the reading frame of the DMD mRNA in the muscles of DMD dogs. We are now moving toward a phase I/II clinical trial with an rAAV-U7snRNA-E53, carrying an antisense sequence designed to hybridize exon 53 of the human DMD messenger. As observed for genome-editing tools, antisense sequences present a risk of off-target effects, reflecting partial hybridization onto unintended transcripts. To characterize the clinical antisense sequence, we studied its expression and explored the occurrence of its off-target effects in human in vitro models of skeletal muscle and liver. We presented a comprehensive methodology combining RNA sequencing and in silico filtering to analyze off-targets. We showed that U7snRNA-E53 induced the effective exon skipping of the DMD transcript without inducing the notable deregulation of transcripts in human cells, neither at gene expression nor at the mRNA splicing level. Altogether, these results suggest that the use of the rAAV-U7snRNA-E53 vector for exon skipping could be safe in eligible DMD patients.
Affiliation(s)
- Claire Domenger
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
- Marine Allais
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
- Virginie François
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
- Adrien Léger
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
- Emilie Lecomte
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
- Laurent Servais
  - Institute I-Motion, Hôpital Armand Trousseau, 75012 Paris, France
- Thomas Voit
  - NIHR Biomedical Research Centre, UCL Institute of Child Health/Great Ormond Street Hospital NHS Trust, WC1N 1EH London, UK
- Philippe Moullier
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
- Yann Audic
  - CNRS, UMR 6290 Institut Génétique et Développement de Rennes, Université de Rennes 1, 35000 Rennes, France
- Caroline Le Guiner
  - INSERM UMR 1089, Université de Nantes, CHU de Nantes, 44200 Nantes, France
12
Abstract
The pervasive expression of circular RNAs (circRNAs) is a recently discovered feature of gene expression in highly diverged eukaryotes. Numerous algorithms that are used to detect genome-wide circRNA expression from RNA sequencing (RNA-seq) data have been developed in the past few years, but there is little overlap in their predictions and no clear gold-standard method to assess the accuracy of these algorithms. We review sources of experimental and bioinformatic biases that complicate the accurate discovery of circRNAs and discuss statistical approaches to address these biases. We conclude with a discussion of the current experimental progress on the topic.
13
Benchmarking of RNA-sequencing analysis workflows using whole-transcriptome RT-qPCR expression data. Sci Rep 2017; 7:1559. [PMID: 28484260 PMCID: PMC5431503 DOI: 10.1038/s41598-017-01617-3] [Citation(s) in RCA: 211] [Impact Index Per Article: 30.1] [Received: 07/18/2016] [Accepted: 04/03/2017] [Indexed: 11/08/2022]
Abstract
RNA-sequencing has become the gold standard for whole-transcriptome gene expression quantification. Multiple algorithms have been developed to derive gene counts from sequencing reads. While a number of benchmarking studies have been conducted, the question remains how individual methods perform at accurately quantifying gene expression levels from RNA-sequencing reads. We performed an independent benchmarking study using RNA-sequencing data from the well-established MAQCA and MAQCB reference samples. RNA-sequencing reads were processed using five workflows (Tophat-HTSeq, Tophat-Cufflinks, STAR-HTSeq, Kallisto and Salmon) and resulting gene expression measurements were compared to expression data generated by wet-lab validated qPCR assays for all protein-coding genes. All methods showed high gene expression correlations with qPCR data. When comparing gene expression fold changes between MAQCA and MAQCB samples, about 85% of the genes showed consistent results between RNA-sequencing and qPCR data. Of note, each method revealed a small but specific gene set with inconsistent expression measurements. A significant proportion of these method-specific inconsistent genes were reproducibly identified in independent datasets. These genes were typically smaller, had fewer exons, and were expressed at lower levels compared to genes with consistent expression measurements. We propose that careful validation is warranted when evaluating RNA-seq-based expression profiles for this specific gene set.
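The fold-change comparison described in this abstract can be illustrated with a small sketch. It assumes one simple definition of consistency: a gene counts as concordant when RNA-seq and qPCR place its log2 fold change in the same class (up, down, or unchanged) at a chosen threshold. The function names, the threshold of 1.0, the pseudocount and the toy values are all hypothetical and are not taken from the study, whose exact criterion may differ.

```python
import math


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """log2 ratio of sample A over sample B; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


def classify(lfc, threshold=1.0):
    """Assign a log2 fold change to 'up', 'down' or 'unchanged' at the given threshold."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


def concordance(rnaseq_lfc, qpcr_lfc, threshold=1.0):
    """Fraction of shared genes whose fold-change class agrees between the two methods."""
    genes = rnaseq_lfc.keys() & qpcr_lfc.keys()
    if not genes:
        raise ValueError("no genes in common")
    agree = sum(
        classify(rnaseq_lfc[g], threshold) == classify(qpcr_lfc[g], threshold)
        for g in genes
    )
    return agree / len(genes)


# Toy log2 fold changes (hypothetical values, not data from the paper).
rnaseq = {"g1": 2.0, "g2": -1.5, "g3": 0.2, "g4": 1.2}
qpcr = {"g1": 1.8, "g2": -2.0, "g3": 0.1, "g4": 0.3}
print(concordance(rnaseq, qpcr))
```

In this toy input only g4 disagrees (up by RNA-seq, unchanged by qPCR), which is the kind of method-specific inconsistency the abstract suggests validating separately.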
14
Abstract
BACKGROUND: Deconvolution is a mathematical process of resolving an observed function into its constituent elements. In the field of biomedical research, deconvolution analysis is applied to obtain single cell-type or tissue specific signatures from a mixed signal and most of them follow the linearity assumption. Although recent development of next generation sequencing technology suggests RNA-seq as a fast and accurate method for obtaining transcriptomic profiles, few studies have been conducted to investigate best RNA-seq quantification methods that yield the optimum linear space for deconvolution analysis.
RESULTS: Using a benchmark RNA-seq dataset, we investigated the linearity of abundance estimated from seven most popular RNA-seq quantification methods both at the gene and isoform levels. Linearity is evaluated through parameter estimation, concordance analysis and residual analysis based on a multiple linear regression model. Results show that count data gives poor parameter estimations, large intercepts and high inter-sample variability; while TPM value from Kallisto and Salmon shows high linearity in all analyses.
CONCLUSIONS: Salmon and Kallisto TPM data gives the best fit to the linear model studied. This suggests that TPM values estimated from Salmon and Kallisto are the ideal RNA-seq measurements for deconvolution studies.
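To make the linearity assumption concrete, the sketch below computes TPM with the standard length-normalised formula and forms the profile that a linear mixing model would predict for a blend of two pure samples. This is only an illustration: the function names, toy counts, transcript lengths and mixing proportion are hypothetical, and Kallisto and Salmon estimate abundances from effective lengths with their own models, which this does not reproduce.

```python
def tpm(counts, lengths):
    """Transcripts per million: length-normalised rates rescaled to sum to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def linear_mix(profile_a, profile_b, proportion_a):
    """Profile predicted by the linear model for a mix containing proportion_a of sample A."""
    return [
        proportion_a * a + (1.0 - proportion_a) * b
        for a, b in zip(profile_a, profile_b)
    ]


def max_abs_residual(observed, predicted):
    """Largest absolute deviation of an observed profile from the linear prediction."""
    return max(abs(o - p) for o, p in zip(observed, predicted))


# Toy data (hypothetical): three transcripts measured in two pure samples.
lengths = [1000, 2000, 500]
tpm_a = tpm([100, 400, 50], lengths)
tpm_b = tpm([300, 100, 200], lengths)

# Expected TPM profile of a 25% A / 75% B mixture under the linearity assumption.
predicted = linear_mix(tpm_a, tpm_b, 0.25)
print(tpm_a, predicted)
```

With real data, `max_abs_residual` would be applied to the measured mixture profile against `predicted`; small residuals indicate that the quantification behaves linearly under mixing.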
Affiliation(s)
- Haijing Jin
  - Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Ying-Wooi Wan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Zhandong Liu
  - Department of Pediatrics-Neurology, Jan and Dan Duncan Neurological Research Institute, Baylor College of Medicine, 1250 Moursund St., Suite 1325, Houston, TX 77030, USA
15
Poirion OB, Zhu X, Ching T, Garmire L. Single-Cell Transcriptomics Bioinformatics and Computational Challenges. Front Genet 2016; 7:163. [PMID: 27708664 PMCID: PMC5030210 DOI: 10.3389/fgene.2016.00163] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 05/01/2016] [Accepted: 09/02/2016] [Indexed: 12/21/2022]
Abstract
The emerging single-cell RNA-Seq (scRNA-Seq) technology holds the promise to revolutionize our understanding of diseases and associated biological processes at an unprecedented resolution. It opens the door to revealing intercellular heterogeneity and has been applied in a variety of settings, ranging from characterizing cancer cell subpopulations to elucidating tumor resistance mechanisms. Parallel to improving experimental protocols to deal with technological issues, deriving new analytical methods to interpret the complexity in scRNA-Seq data is just as challenging. Here, we review current state-of-the-art bioinformatics tools and methods for scRNA-Seq analysis and address some critical analytical challenges that the field faces.
Affiliation(s)
- Olivier B Poirion
  - Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
- Xun Zhu
  - Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA; Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI, USA
- Travers Ching
  - Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA; Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI, USA
- Lana Garmire
  - Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
16
Schuierer S, Roma G. The exon quantification pipeline (EQP): a comprehensive approach to the quantification of gene, exon and junction expression from RNA-seq data. Nucleic Acids Res 2016; 44:e132. [PMID: 27302131 PMCID: PMC5027495 DOI: 10.1093/nar/gkw538] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 09/15/2015] [Accepted: 06/04/2016] [Indexed: 01/24/2023]
Abstract
The quantification of transcriptomic features is the basis of the analysis of RNA-seq data. We present an integrated alignment workflow and a simple counting-based approach to derive estimates for gene, exon and exon–exon junction expression. In contrast to previous counting-based approaches, EQP takes into account only reads whose alignment pattern agrees with the splicing pattern of the features of interest. This leads to improved gene expression estimates as well as to the generation of exon counts that allow disambiguating reads between overlapping exons. Unlike other methods that quantify skipped introns, EQP offers a novel way to compute junction counts based on the agreement of the read alignments with the exons on both sides of the junction, thus providing a uniformly derived set of counts. We evaluated the performance of EQP on both simulated and real Illumina RNA-seq data and compared it with other quantification tools. Our results suggest that EQP provides superior gene expression estimates and we illustrate the advantages of EQP's exon and junction counts. The provision of uniformly derived high-quality counts makes EQP an ideal quantification tool for differential expression and differential splicing studies. EQP is freely available for download at https://github.com/Novartis/EQP-cluster.
Affiliation(s)
- Sven Schuierer
  - Novartis Institutes for Biomedical Research, CH-4056 Basel, Switzerland
- Guglielmo Roma
  - Novartis Institutes for Biomedical Research, CH-4056 Basel, Switzerland
17
Hartley SW, Mullikin JC. Detection and visualization of differential splicing in RNA-Seq data with JunctionSeq. Nucleic Acids Res 2016; 44:e127. [PMID: 27257077 PMCID: PMC5009739 DOI: 10.1093/nar/gkw501] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 02/08/2016] [Accepted: 05/24/2016] [Indexed: 12/14/2022]
Abstract
Although RNA-Seq data provide unprecedented isoform-level expression information, detection of alternative isoform regulation (AIR) remains difficult, particularly when working with an incomplete transcript annotation. We introduce JunctionSeq, a new method that builds on the statistical techniques used by the well-established DEXSeq package to detect differential usage of both exonic regions and splice junctions. In particular, JunctionSeq is capable of detecting differential usage of novel splice junctions without the need for an additional isoform assembly step, greatly improving performance when the available transcript annotation is flawed or incomplete. JunctionSeq also provides a powerful and streamlined visualization toolset that allows bioinformaticians to quickly and intuitively interpret their results. We tested our method on publicly available data from several experiments performed on the rat pineal gland and Toxoplasma gondii, successfully detecting known and previously validated AIR genes in 19 out of 19 gene-level hypothesis tests. Due to its ability to query novel splice sites, JunctionSeq is still able to detect these differences even when all alternative isoforms for these genes were not included in the transcript annotation. JunctionSeq thus provides a powerful method for detecting alternative isoform regulation even with low-quality annotations. An implementation of JunctionSeq is available as an R/Bioconductor package.
Affiliation(s)
- Stephen W Hartley
  - Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- James C Mullikin
  - Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
18
Liu J, Deng S, Wang H, Ye J, Wu HW, Sun HX, Chua NH. CURLY LEAF Regulates Gene Sets Coordinating Seed Size and Lipid Biosynthesis. Plant Physiology 2016; 171:424-36. [PMID: 26945048 PMCID: PMC4854673 DOI: 10.1104/pp.15.01335] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 08/25/2015] [Accepted: 03/03/2016] [Indexed: 05/05/2023]
Abstract
CURLY LEAF (CLF), a histone methyltransferase of Polycomb Repressive Complex 2 (PRC2) for trimethylation of histone H3 Lys 27 (H3K27me3), has been regarded as a negative regulator controlling mainly postgermination growth in Arabidopsis (Arabidopsis thaliana). Approximately 14% to 29% of genic regions are decorated by H3K27me3 in the Arabidopsis genome; however, transcriptional repression activities of PRC2 on a majority of these regions remain unclear. Here, by analysis of transcriptome profiles, we found that approximately 11.6% of genes in the Arabidopsis genome were repressed by CLF in various organs. Unexpectedly, approximately 54% of these genes were preferentially repressed in siliques. Further analyses of 118 transcriptome datasets uncovered a group of genes that was preferentially expressed and repressed by CLF in embryos at the mature-green stage. This observation suggests that CLF mediates a large-scale H3K27me3 programming/reprogramming event during embryonic development. Plants of clf-28 produced bigger and heavier seeds with higher oil content, larger oil bodies, and altered long-chain fatty acid composition compared with wild type. Around 46% of CLF-repressed genes were associated with H3K27me3 marks; moreover, we verified histone modification and transcriptional repression by CLF on regulatory genes. Our results suggest that CLF silences specific gene expression modules. Genes operating within a module have various molecular functions, but they cooperate to regulate a similar physiological function during embryo development.
Affiliation(s)
- Jun Liu
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
- Shulin Deng
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
- Huan Wang
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
- Jian Ye
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
- Hui-Wen Wu
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
- Hai-Xi Sun
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
- Nam-Hai Chua
  - Laboratory of Plant Molecular Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10065; National Key Facility for Crop Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; and Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, 117604 Singapore
19
Leshkowitz D, Feldmesser E, Friedlander G, Jona G, Ainbinder E, Parmet Y, Horn-Saban S. Using Synthetic Mouse Spike-In Transcripts to Evaluate RNA-Seq Analysis Tools. PLoS One 2016; 11:e0153782. [PMID: 27100792 PMCID: PMC4839710 DOI: 10.1371/journal.pone.0153782] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 08/31/2015] [Accepted: 04/04/2016] [Indexed: 11/25/2022]
Abstract
One of the key applications of next-generation sequencing (NGS) technologies is RNA-Seq for genome-wide transcriptome analysis. Although multiple studies have evaluated and benchmarked RNA-Seq tools dedicated to gene level analysis, few studies have assessed their effectiveness on the transcript-isoform level. Alternative splicing is a naturally occurring phenomenon in eukaryotes, significantly increasing the diversity of proteins that can be encoded by the genome. The aim of this study was to assess and compare the ability of the bioinformatics approaches and tools to assemble, quantify and detect differentially expressed transcripts using RNA-Seq data, in a controlled experiment. To this end, in vitro synthesized mouse spike-in control transcripts were added to the total RNA of differentiating mouse embryonic bodies, and their expression patterns were measured. This novel approach was used to assess the accuracy of the tools, as established by comparing the observed results with the results expected for the spiked-in mouse control transcripts. We found that detection of differential expression at the gene level is adequate, yet on the transcript-isoform level, all tools tested lacked accuracy and precision.
Affiliation(s)
- Dena Leshkowitz
  - Biological Services Department, Weizmann Institute of Science, Rehovot, 76100, Israel
- Ester Feldmesser
  - Biological Services Department, Weizmann Institute of Science, Rehovot, 76100, Israel
- Gilgi Friedlander
  - Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, 76100, Israel
- Ghil Jona
  - Biological Services Department, Weizmann Institute of Science, Rehovot, 76100, Israel
- Elena Ainbinder
  - Biological Services Department, Weizmann Institute of Science, Rehovot, 76100, Israel
- Yisrael Parmet
  - Industrial Engineering and Management Department, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Shirley Horn-Saban
  - Biological Services Department, Weizmann Institute of Science, Rehovot, 76100, Israel
20
van der Beek SL, Le Breton Y, Ferenbach AT, Chapman RN, van Aalten DMF, Navratilova I, Boons GJ, McIver KS, van Sorge NM, Dorfmueller HC. GacA is essential for Group A Streptococcus and defines a new class of monomeric dTDP-4-dehydrorhamnose reductases (RmlD). Mol Microbiol 2015; 98:946-62. [PMID: 26278404 PMCID: PMC4832382 DOI: 10.1111/mmi.13169] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Accepted: 08/13/2015] [Indexed: 12/29/2022]
Abstract
The sugar nucleotide dTDP‐L‐rhamnose is critical for the biosynthesis of the Group A Carbohydrate, the molecular signature and virulence determinant of the human pathogen Group A Streptococcus (GAS). The final step of the four‐step dTDP‐L‐rhamnose biosynthesis pathway is catalyzed by dTDP‐4‐dehydrorhamnose reductases (RmlD). RmlD from the Gram‐negative bacterium Salmonella is the only structurally characterized family member and requires metal‐dependent homo‐dimerization for enzymatic activity. Using a biochemical and structural biology approach, we demonstrate that the only RmlD homologue from GAS, previously renamed GacA, functions in a novel monomeric manner. Sequence analysis of 213 Gram‐negative and Gram‐positive RmlD homologues predicts that enzymes from all Gram‐positive species lack a dimerization motif and function as monomers. The enzymatic function of GacA was confirmed through heterologous expression of gacA in a S. mutans rmlD knockout, which restored attenuated growth and aberrant cell division. Finally, analysis of a saturated mutant GAS library using Tn‐sequencing and generation of a conditional‐expression mutant identified gacA as an essential gene for GAS. In conclusion, GacA is an essential monomeric enzyme in GAS and representative of monomeric RmlD enzymes in Gram‐positive bacteria and a subset of Gram‐negative bacteria. These results will help future screens for novel inhibitors of dTDP‐L‐rhamnose biosynthesis.
Affiliation(s)
- Samantha L van der Beek
  - University Medical Center Utrecht, Medical Microbiology, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Yoann Le Breton
  - Department of Cell Biology and Molecular Genetics, Maryland Pathogen Research Institute, University of Maryland, 3124 Biosciences Research Building, College Park, MD 20742, USA
- Andrew T Ferenbach
  - Division of Molecular Microbiology, University of Dundee, School of Life Sciences, Dow Street, DD1 5EH, Dundee, UK
- Robert N Chapman
  - Complex Carbohydrate Research Center, Department of Chemistry, The University of Georgia, 315 Riverbend Road, Athens, USA
- Daan M F van Aalten
  - Division of Molecular Microbiology, University of Dundee, School of Life Sciences, Dow Street, DD1 5EH, Dundee, UK
- Iva Navratilova
  - Division of Biological Chemistry and Drug Discovery, University of Dundee, School of Life Sciences, Dow Street, DD1 5EH, Dundee, UK
- Geert-Jan Boons
  - Complex Carbohydrate Research Center, Department of Chemistry, The University of Georgia, 315 Riverbend Road, Athens, USA
- Kevin S McIver
  - Department of Cell Biology and Molecular Genetics, Maryland Pathogen Research Institute, University of Maryland, 3124 Biosciences Research Building, College Park, MD 20742, USA
- Nina M van Sorge
  - University Medical Center Utrecht, Medical Microbiology, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Helge C Dorfmueller
  - Division of Molecular Microbiology, University of Dundee, School of Life Sciences, Dow Street, DD1 5EH, Dundee, UK; Rutherford Appleton Laboratory, Research Complex at Harwell, OX11 0FA, Didcot, UK
21
Hayer KE, Pizarro A, Lahens NF, Hogenesch JB, Grant GR. Benchmark analysis of algorithms for determining and quantifying full-length mRNA splice forms from RNA-seq data. Bioinformatics 2015; 31:3938-45. [PMID: 26338770 PMCID: PMC4673975 DOI: 10.1093/bioinformatics/btv488] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 08/06/2014] [Accepted: 08/17/2015] [Indexed: 01/26/2023]
Abstract
MOTIVATION: Because of the advantages of RNA sequencing (RNA-Seq) over microarrays, it is gaining widespread popularity for highly parallel gene expression analysis. For example, RNA-Seq is expected to be able to provide accurate identification and quantification of full-length splice forms. A number of informatics packages have been developed for this purpose, but short reads make it a difficult problem in principle. Sequencing error and polymorphisms add further complications. It has become necessary to perform studies to determine which algorithms perform best and which if any algorithms perform adequately. However, there is a dearth of independent and unbiased benchmarking studies. Here we take an approach using both simulated and experimental benchmark data to evaluate their accuracy.
RESULTS: We conclude that most methods are inaccurate even using idealized data, and that no method is highly accurate once multiple splice forms, polymorphisms, intron signal, sequencing errors, alignment errors, annotation errors and other complicating factors are present. These results point to the pressing need for further algorithm development.
AVAILABILITY AND IMPLEMENTATION: Simulated datasets and other supporting information can be found at http://bioinf.itmat.upenn.edu/BEERS/bp2.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Katharina E Hayer
  - University of Pennsylvania, Institute for Translational Medicine and Therapeutics, Philadelphia, PA 19104
- Angel Pizarro
  - Scientific Computing at Amazon Web Services, Seattle, WA 98108
- Gregory R Grant
  - University of Pennsylvania, Institute for Translational Medicine and Therapeutics, Philadelphia, PA 19104; Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
22
Kanitz A, Gypas F, Gruber AJ, Gruber AR, Martin G, Zavolan M. Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data. Genome Biol 2015. [PMID: 26201343 PMCID: PMC4511015 DOI: 10.1186/s13059-015-0702-5] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Indexed: 11/14/2022]
Abstract
Background: Understanding the regulation of gene expression, including transcription start site usage, alternative splicing, and polyadenylation, requires accurate quantification of expression levels down to the level of individual transcript isoforms. To comparatively evaluate the accuracy of the many methods that have been proposed for estimating transcript isoform abundance from RNA sequencing data, we have used both synthetic data as well as an independent experimental method for quantifying the abundance of transcript ends at the genome-wide level.
Results: We found that many tools have good accuracy and yield better estimates of gene-level expression compared to commonly used count-based approaches, but they vary widely in memory and runtime requirements. Nucleotide composition and intron/exon structure have comparatively little influence on the accuracy of expression estimates, which correlates most strongly with transcript/gene expression levels. To facilitate the reproduction and further extension of our study, we provide datasets, source code, and an online analysis tool on a companion website, where developers can upload expression estimates obtained with their own tool to compare them to those inferred by the methods assessed here.
Conclusions: As many methods for quantifying isoform abundance with comparable accuracy are available, a user’s choice will likely be determined by factors such as the memory and runtime requirements, as well as the availability of methods for downstream analyses. Sequencing-based methods to quantify the abundance of specific transcript regions could complement validation schemes based on synthetic data and quantitative PCR in future or ongoing assessments of RNA-seq analysis methods.
Affiliation(s)
- Alexander Kanitz
  - Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland.
- Foivos Gypas
  - Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland.
- Andreas J Gruber
  - Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland.
- Andreas R Gruber
  - Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland.
- Georges Martin
  - Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland.
- Mihaela Zavolan
  - Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland.
|
23
|
Davies MN, Krause L, Bell JT, Gao F, Ward KJ, Wu H, Lu H, Liu Y, Tsai PC, Collier DA, Murphy T, Dempster E, Mill J, Battle A, Mostafavi S, Zhu X, Henders A, Byrne E, Wray NR, Martin NG, Spector TD, Wang J. Hypermethylation in the ZBTB20 gene is associated with major depressive disorder. Genome Biol 2014; 15:R56. [PMID: 24694013 PMCID: PMC4072999 DOI: 10.1186/gb-2014-15-4-r56] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 11/19/2013] [Accepted: 04/02/2014] [Indexed: 12/26/2022]
Abstract
BACKGROUND: Although genetic variation is believed to contribute to an individual's susceptibility to major depressive disorder, genome-wide association studies have not yet identified associations that could explain the full etiology of the disease. Epigenetics is increasingly believed to play a major role in the development of common clinical phenotypes, including major depressive disorder. RESULTS: Genome-wide MeDIP-Sequencing was carried out on a total of 50 monozygotic twin pairs from the UK and Australia that are discordant for depression. We show that major depressive disorder is associated with significant hypermethylation within the coding region of ZBTB20, and that this association is replicated in an independent cohort of 356 unrelated case-control individuals. The twins with major depressive disorder also show increased global variation in methylation in comparison with their unaffected co-twins. ZBTB20 plays an essential role in the specification of the Cornu Ammonis-1 field identity in the developing hippocampus, a region previously implicated in the development of major depressive disorder. CONCLUSIONS: Our results suggest that aberrant methylation profiles affecting the hippocampus are associated with major depressive disorder and show the potential of the epigenetic twin model in neuro-psychiatric disease.
|