1
Zhou B, Guo Y, Xue Y, Ji X, Huang Y. Comprehensive insights into the mechanism of keratin degradation and exploitation of keratinase to enhance the bioaccessibility of soybean protein. BIOTECHNOLOGY FOR BIOFUELS AND BIOPRODUCTS 2023; 16:177. [PMID: 37978558 PMCID: PMC10655438 DOI: 10.1186/s13068-023-02426-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023]
Abstract
Keratin is a recalcitrant protein, yet it can be decomposed in nature. However, the mechanism of keratin degradation is still not well understood. In this study, Bacillus sp. 8A6 completely degraded feathers in 20 h, making it one of the most efficient keratin degraders reported so far. Comprehensive transcriptome analysis continuously tracked the metabolism of Bacillus sp. 8A6 throughout its growth in feather medium. It reveals for the first time how the strain acquires nutrients and energy for proliferation in an oligotrophic feather medium in the early stage. Then, degradation of the outer lipid layer of the feather exposes the internal keratin structure to disulfide bond reduction by sulfite from the newly identified sulfite metabolic pathway, disulfide reductases and iron uptake. The resulting weakened keratin is proposed to be further de-assembled by the S9 protease and hydrolyzed by the synergistic effects of endo-, exo- and oligo-proteases from the S1, S8, M3, M14, M20, M24, M42, M84 and T3 families. Finally, bioaccessible peptides and amino acids are generated and transported for strain growth. The keratinase was applied to soybean hydrolysis, which generated 2234 peptides and 559.93 mg/L of 17 amino acids. Therefore, keratinases induced by poultry waste have great potential for producing bioaccessible peptides and amino acids for the feed industry.
Affiliation(s)
- Beiya Zhou
  - College of Mathematical Sciences, Bohai University, Jinzhou, 121013, Liaoning, China
  - Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
  - Huizhou Institute of Green Energy and Advanced Materials, Huizhou, 516000, Guangdong, China
- Yandong Guo
  - College of Mathematical Sciences, Bohai University, Jinzhou, 121013, Liaoning, China
- Yaju Xue
  - Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
- Xiuling Ji
  - Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
- Yuhong Huang
  - Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
2
Current advances in primate genomics: novel approaches for understanding evolution and disease. Nat Rev Genet 2023; 24:314-331. [PMID: 36599936 DOI: 10.1038/s41576-022-00554-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Accepted: 11/07/2022] [Indexed: 01/05/2023]
Abstract
Primate genomics holds the key to understanding fundamental aspects of human evolution and disease. However, genetic diversity and functional genomics data sets are currently available for only a few of the more than 500 extant primate species. Concerted efforts are under way to characterize primate genomes, genetic polymorphism and divergence, and functional landscapes across the primate phylogeny. The resulting data sets will enable the connection of genotypes to phenotypes and provide new insight into aspects of the genetics of primate traits, including human diseases. In this Review, we describe the existing genome assemblies as well as genetic variation and functional genomic data sets. We highlight some of the challenges with sample acquisition. Finally, we explore how technological advances in single-cell functional genomics and induced pluripotent stem cell-derived organoids will facilitate our understanding of the molecular foundations of primate biology.
3
Singh A, Hermann BP. Bulk and Single-Cell RNA-Seq Analyses for Studies of Spermatogonia. Methods Mol Biol 2023; 2656:37-70. [PMID: 37249866 DOI: 10.1007/978-1-0716-3139-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Robust methods have been developed that leverage next-generation sequencing (NGS) to measure abundance of all mRNAs (RNA-seq) in samples as small as individual cells in order to study the testicular transcriptome in mammals. In this chapter, we present robust options for implementing bioinformatics workflows for the analysis of bulk RNA-seq from aggregate samples of hundreds to millions of cells and single-cell RNA-seq from individual cells. We also provide detailed protocols for using the R packages DESeq2 and Seurat, important parameters for successful implementation, and considerations for drawing conclusions from the results.
Affiliation(s)
- Anukriti Singh
  - Department of Neuroscience, Developmental and Regenerative Biology, The University of Texas at San Antonio, San Antonio, TX, USA
- Brian P Hermann
  - Department of Neuroscience, Developmental and Regenerative Biology, University of Texas at San Antonio, San Antonio, TX, USA
4
Shao G, He T, Mu Y, Mu P, Ao J, Lin X, Ruan L, Wang Y, Gao Y, Liu D, Zhang L, Chen X. The genome of a hadal sea cucumber reveals novel adaptive strategies to deep-sea environments. iScience 2022; 25:105545. [PMID: 36444293 PMCID: PMC9700323 DOI: 10.1016/j.isci.2022.105545] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/12/2021] [Revised: 01/18/2022] [Accepted: 11/07/2022] [Indexed: 11/11/2022] Open
Abstract
How organisms cope with cold and high pressure in the hadal zone remains poorly understood. Here, we sequenced and assembled a high-quality genome of the hadal sea cucumber Paelopatides sp. Yap and explored its potential mechanisms for deep-sea adaptation. First, the expansion of ACOX1 (encoding the rate-limiting enzyme in the DHA synthesis pathway), the increased DHA content in the phospholipid bilayer, and the positive selection of EPT1 may together maintain cell membrane fluidity. Second, three genes for translation initiation factors and two for ribosomal proteins underwent expansion, and three ribosomal protein genes were positively selected, which may ameliorate the protein synthesis inhibition or ribosome dissociation in the hadal zone. Third, expansion and positive selection of genes associated with stalled replication fork recovery and DNA repair suggest improvements in DNA protection. This is the first genome sequence of a hadal invertebrate. Our results provide insights into the genetic adaptations used by invertebrates in deep oceans.
Affiliation(s)
- Guangming Shao
  - Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
- Tianliang He
  - Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
- Yinnan Mu
  - Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
- Pengfei Mu
  - Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
- Jingqun Ao
  - Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
- Xihuang Lin
  - Key Laboratory of Marine Biogenetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, Fujian 361005, China
- Lingwei Ruan
  - Key Laboratory of Marine Biogenetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, Fujian 361005, China
- YuGuang Wang
  - Key Laboratory of Marine Biogenetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, Fujian 361005, China
- Yuan Gao
  - Genomics and Genetic Engineering Laboratory of Ornamental Plants, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
- Dinggao Liu
  - Genomics and Genetic Engineering Laboratory of Ornamental Plants, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
- Liangsheng Zhang
  - Genomics and Genetic Engineering Laboratory of Ornamental Plants, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
- Xinhua Chen
  - Key Laboratory of Marine Biotechnology of Fujian Province, Institute of Oceanology, College of Marine Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, Guangdong 519000, China
5
Lee SG, Na D, Park C. Comparability of reference-based and reference-free transcriptome analysis approaches at the gene expression level. BMC Bioinformatics 2021; 22:310. [PMID: 34674628 PMCID: PMC8529712 DOI: 10.1186/s12859-021-04226-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/23/2021] [Accepted: 06/01/2021] [Indexed: 11/10/2022] Open
Abstract
Background: Lately, high-throughput RNA sequencing has been extensively used to elucidate the transcriptome landscape and dynamics of cell types of different species. In particular, for most non-model organisms lacking complete reference genomes with high-quality annotation of genetic information, reference-free (RF) de novo transcriptome analyses, rather than reference-based (RB) approaches, are widely used, and RF analyses have substantially contributed toward understanding the mechanisms regulating key biological processes and functions. To date, numerous bioinformatics studies have been conducted for assessing the workflow, production rate, and completeness of transcriptome assemblies within and between RF and RB datasets. However, the degree of consistency and variability of results obtained by analyzing gene expression levels through these two different approaches have not been adequately documented. Results: In the present study, we evaluated the differences in expression profiles obtained with RF and RB approaches and revealed that the former tends to be satisfactorily replaced by the latter with respect to transcriptome repertoires, as well as from a gene expression quantification perspective. In addition, we urge cautious interpretation of these findings. Several genes that are lowly expressed, have long coding sequences, or belong to large gene families must be validated carefully, whenever gene expression levels are calculated using the RF method. Conclusions: Our empirical results indicate important contributions toward addressing transcriptome-related biological questions in non-model organisms. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04226-0.
Affiliation(s)
- Sung-Gwon Lee
  - School of Biological Sciences and Technology, Chonnam National University, Gwangju, 61186, Republic of Korea
- Dokyun Na
  - Department of Biomedical Engineering, Chung-Ang University, Seoul, 06974, Republic of Korea
- Chungoo Park
  - School of Biological Sciences and Technology, Chonnam National University, Gwangju, 61186, Republic of Korea
6
Lee K, Yu H, Shouse S, Kong B, Lee J, Lee SH, Ko KS. RNA-Seq Reveals Different Gene Expression in Liver-Specific Prohibitin 1 Knock-Out Mice. Front Physiol 2021; 12:717911. [PMID: 34539442 PMCID: PMC8446661 DOI: 10.3389/fphys.2021.717911] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Accepted: 07/27/2021] [Indexed: 12/24/2022] Open
Abstract
Prohibitin 1 (PHB1) is an evolutionarily conserved and ubiquitously expressed protein that stabilizes mitochondrial chaperone. Our previous studies showed that liver-specific Phb1 deficiency induced liver injuries and aggravated lipopolysaccharide (LPS)-induced innate immune responses. In this study, we performed RNA-sequencing (RNA-seq) analysis of liver tissues to investigate global gene expression among liver-specific Phb1−/−, Phb1+/−, and WT mice, focusing on the differentially expressed (DE) genes between Phb1+/− and WT. When 78 DE genes were analyzed for biological functions using the Ingenuity Pathway Analysis (IPA) tool, lipid metabolism-related genes, including insulin receptor (Insr), sterol regulatory element-binding transcription factor 1 (Srebf1), Srebf2, and SREBP cleavage-activating protein (Scap), appeared to be downregulated in liver-specific Phb1+/− compared with WT. Diseases and biofunctions analyses conducted with IPA verified that hepatic system diseases, including liver fibrosis, liver hyperplasia/hyperproliferation, and liver necrosis/cell death, which may be caused by hepatotoxicity, were highly associated with liver-specific Phb1 deficiency in mice. Interestingly, of the 5 liver disease-related DE genes between Phb1+/− and WT, the mRNA expression of forkhead box M1 (Foxm1) and TIMP inhibitor of metalloproteinase (Timp1) was consistent with the RNA-seq results when validated in liver tissues and in AML12 cells transfected with Phb1 siRNA. The results of this study provide additional insights into the molecular mechanisms responsible for the increased susceptibility to liver injuries associated with hepatic Phb1.
Affiliation(s)
- Kyuwon Lee
  - Department of Nutritional Science and Food Management, College of Science and Industry Convergence, Ewha Womans University, Seoul, South Korea
- Hyeonju Yu
  - Department of Nutritional Science and Food Management, College of Science and Industry Convergence, Ewha Womans University, Seoul, South Korea
- Stephanie Shouse
  - Center of Excellence for Poultry Science, University of Arkansas System Division of Agriculture, Fayetteville, AR, United States
- Byungwhi Kong
  - Center of Excellence for Poultry Science, University of Arkansas System Division of Agriculture, Fayetteville, AR, United States
- Jihye Lee
  - Department of Nutrition and Food Science, College of Agriculture and Natural Resources, University of Maryland, College Park, MD, United States
- Seong-Ho Lee
  - Department of Nutrition and Food Science, College of Agriculture and Natural Resources, University of Maryland, College Park, MD, United States
- Kwang Suk Ko
  - Department of Nutritional Science and Food Management, College of Science and Industry Convergence, Ewha Womans University, Seoul, South Korea
  - Karsh Division of Gastroenterology and Hepatology, Department of Medicine, Cedars-Sinai Medical Center, Beverly Hills, CA, United States
7
Nodehi HM, Tabatabaiefar MA, Sehhati M. Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data. JOURNAL OF MEDICAL SIGNALS & SENSORS 2021; 11:37-44. [PMID: 34026589 PMCID: PMC8043119 DOI: 10.4103/jmss.jmss_7_20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/20/2020] [Revised: 01/28/2020] [Accepted: 02/12/2020] [Indexed: 11/04/2022]
Abstract
Background: Careful design in the primary steps of a next-generation sequencing study is critical for obtaining successful results in downstream analysis. Methods: In this study, a framework is proposed to evaluate and improve sequence mapping in targeted regions of the reference genome. Simulated short reads were produced from the coding regions of the human genome and mapped to a Customized Target-Based Reference (CTBR) using recently introduced alignment tools. Short reads produced by different sequencing technologies were aligned to the standard genome and to the CTBR, with and without well-defined mutation types, and the numbers of unmapped and misaligned reads and the runtime were measured for comparison. Results: The mapping accuracy for reads generated from the Illumina HiSeq 2500, with Stampy as the alignment tool and the CTBR as the reference, was significantly better than that of the other evaluated pipelines. Using the CTBR for alignment significantly decreased the mapping error in comparison with more expanded or more limited references. When intentional mutations were introduced into the reads, Stampy showed the minimum error of 1.67% using the CTBR, whereas the lowest errors obtained by Stampy using the whole genome and one chromosome as references were 3.78% and 20%, respectively. Maximum and minimum misalignment errors were observed on chromosomes Y and 20, respectively. Conclusion: Using the proposed framework in a clinical targeted sequencing study may therefore help predict the error and improve the performance of variant calling in the targeted genomic regions.
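The error percentages quoted in this abstract can be illustrated with a small calculation. The sketch below is only an assumption about how such a figure might be derived: it treats the mapping error as the share of simulated reads that are either unmapped or placed away from their known origin. The function name, the `tolerance` parameter and the toy read list are hypothetical and are not taken from the paper.

```python
def mapping_error_rate(reads, tolerance=0):
    """Percentage of simulated reads that are unmapped or misaligned.

    `reads` is an iterable of (true_pos, mapped_pos) pairs, where
    mapped_pos is None for an unmapped read. A read counts as misaligned
    when its mapped position differs from the simulated origin by more
    than `tolerance` bases (hypothetical criterion).
    """
    total = unmapped = misaligned = 0
    for true_pos, mapped_pos in reads:
        total += 1
        if mapped_pos is None:
            unmapped += 1
        elif abs(mapped_pos - true_pos) > tolerance:
            misaligned += 1
    if total == 0:
        raise ValueError("no reads supplied")
    return 100.0 * (unmapped + misaligned) / total


# Toy data: one correct read, one unmapped, one shifted by 5 bases, one correct.
toy_reads = [(100, 100), (200, None), (300, 305), (400, 400)]
print(mapping_error_rate(toy_reads))
print(mapping_error_rate(toy_reads, tolerance=5))
```

Comparing this rate across references (whole genome, single chromosome, target-based) is the kind of comparison the abstract describes, although the exact error definition used by the authors may differ.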
Affiliation(s)
- Hannane Mohammadi Nodehi
  - Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Mohammad Amin Tabatabaiefar
  - Department of Medical Genetics, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  - Department of Bioinformatics, Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Mohammadreza Sehhati
  - Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
8
Galise TR, Esposito S, D'Agostino N. Guidelines for Setting Up a mRNA Sequencing Experiment and Best Practices for Bioinformatic Data Analysis. Methods Mol Biol 2021; 2264:137-162. [PMID: 33263908 DOI: 10.1007/978-1-0716-1201-9_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
RNA-sequencing, commonly referred to as RNA-seq, is the most recently developed method for the analysis of transcriptomes. It uses high-throughput next-generation sequencing technologies and has revolutionized our understanding of the complexity and dynamics of whole transcriptomes. In this chapter, we recall the key developments in transcriptome analysis and dissect the different steps of the general workflow that users can follow to design and perform an mRNA-seq experiment and to process mRNA-seq data obtained with Illumina technology. The chapter proposes guidelines for completing an mRNA-seq study properly and offers recommendations for best practices based on recent literature and on the latest developments in technology and algorithms. We also note the large number of choices available (especially for bioinformatic data analysis), which can overwhelm the scientist. In the last part of the chapter, we discuss the new frontiers of single-cell RNA-seq and isoform sequencing by long-read technology.
Affiliation(s)
- Teresa Rosa Galise
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Salvatore Esposito
  - CREA Research Centre for Vegetable and Ornamental Crops, Pontecagnano Faiano, Italy
- Nunzio D'Agostino
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
9
Abstract
RNA-Seq is nowadays an indispensable approach for comparative transcriptome profiling in model and nonmodel organisms. Analyzing RNA-Seq data from nonmodel organisms poses unique challenges, due to unavailability of a high-quality genome reference and to relative sparsity of tools for downstream functional analyses. In this chapter, we provide an overview of the analysis steps in RNA-Seq projects of nonmodel organisms, while elaborating on aspects that are unique to this analysis. These will include (1) strategic decisions that have to be made in advance, regarding sequencing technology and reference to use; (2) how to search for available draft genomes, and, if necessary, how to improve their gene prediction and annotation; (3) how to clean raw reads before de novo assembly; (4) how to separate the reads in RNA-Seq projects of symbiont organisms; (5) how to design and carry out a de novo transcriptome assembly that will be comprehensive and reliable; (6) how to assess transcriptome quality; (7) when and how to reduce redundancy in the transcriptome; (8) techniques and considerations in transcriptome functional annotation; (9) quantitating transcript abundance in the face of high transcriptome redundancy; and, most importantly, (10) how to achieve functional enrichment testing using available tools which either support a large range of species or enable a universal, non-species-specific analysis. Throughout the chapter, we will refer to a variety of useful software tools. For the initial analysis steps involving high-volume data, these will include Linux-based programs. For the later steps, we will describe both Linux and R packages for advanced users, as well as many user-friendly tools for nonprogrammers. Finally, we will present a full workflow for RNA-Seq analysis of nonmodel organisms using the NeatSeq-Flow platform, which can be used locally through a user-friendly interface.
Affiliation(s)
- Vered Chalifa-Caspi
  - Bioinformatics Core Facility, Ben-Gurion University of the Negev, Beer-Sheva, Israel
10
Mahmood K, Orabi J, Kristensen PS, Sarup P, Jørgensen LN, Jahoor A. De novo transcriptome assembly, functional annotation, and expression profiling of rye (Secale cereale L.) hybrids inoculated with ergot (Claviceps purpurea). Sci Rep 2020; 10:13475. [PMID: 32778722 PMCID: PMC7417550 DOI: 10.1038/s41598-020-70406-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/16/2020] [Accepted: 07/24/2020] [Indexed: 12/22/2022] Open
Abstract
Rye is used for food, feed, and bioenergy production and remains an essential grain crop for cool temperate zones with marginal soils. Ergot is known to cause severe problems in cross-pollinated rye by contaminating harvested grains. The molecular mechanisms underlying the response to this disease are still poorly understood due to the complex infection pattern. RNA sequencing can provide detailed insight into the transcriptional landscape; hence we employed a transcriptomic approach to identify genes involved in the mechanism of ergot infection in rye. In this study, we generated de novo assemblies from twelve biological samples of two rye hybrids with contrasting phenotypic responses to ergot infection. The final transcriptomes of the ergot-susceptible (DH372) and moderately ergot-resistant (Helltop) hybrids contain 208,690 and 192,116 contigs, respectively. By applying the BUSCO pipeline, we confirmed that these transcriptome assemblies represent more than 90% of the orthologue groups available in Viridiplantae odb10. We used the de novo assemblies and the draft reference genome of rye to identify differentially expressed genes (DEGs) between the two hybrids with and without inoculation. The gene expression comparisons revealed that 228 genes were linked to ergot infection in both hybrids. Gene ontology enrichment analysis of the DEGs associated them with metabolic processes, hydrolase activity, pectinesterase activity, cell wall modification, pollen development and pollen wall assembly. In addition, gene set enrichment analysis of the DEGs linked them to cell wall modification and pectinesterase activity. These results suggest that a combination of different pathways, particularly cell wall modification and pectinesterase activity, contributes to the underlying mechanism that might lead to resistance against ergot in rye.
Our results may pave the way for selecting genetic material with improved resistance against ergot through a better understanding of the mechanism of ergot infection at the molecular level. Furthermore, the sequence data and de novo assemblies are valuable scientific resources for future studies in rye.
Affiliation(s)
- Khalid Mahmood
  - Nordic Seed A/S, Grindsnabevej 25, 8300, Odder, Denmark
  - Department of Agroecology, Faculty of Science and Technology, Aarhus University, Forsøgsvej 1, Flakkebjerg, 4200, Slagelse, Denmark
- Jihad Orabi
  - Nordic Seed A/S, Grindsnabevej 25, 8300, Odder, Denmark
- Lise Nistrup Jørgensen
  - Department of Agroecology, Faculty of Science and Technology, Aarhus University, Forsøgsvej 1, Flakkebjerg, 4200, Slagelse, Denmark
- Ahmed Jahoor
  - Nordic Seed A/S, Grindsnabevej 25, 8300, Odder, Denmark
  - Department of Plant Breeding, The Swedish University of Agricultural Sciences, 23053, Alnarp, Sweden
11
Evaluation of Seven Different RNA-Seq Alignment Tools Based on Experimental Data from the Model Plant Arabidopsis thaliana. Int J Mol Sci 2020; 21:ijms21051720. [PMID: 32138290 PMCID: PMC7084517 DOI: 10.3390/ijms21051720] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 01/30/2020] [Revised: 02/28/2020] [Accepted: 02/29/2020] [Indexed: 01/15/2023] Open
Abstract
Quantification of gene expression is crucial to connect genome sequences with phenotypic and physiological data. RNA-Sequencing (RNA-Seq) has taken a prominent role in the study of transcriptomic reactions of plants to various environmental and genetic perturbations. However, comparative tests of different tools for RNA-Seq read mapping and quantification have been mainly performed on data from animals or humans, which necessarily neglect, for example, the large genetic variability among natural accessions within plant species. Here, we compared seven computational tools for their ability to map and quantify Illumina single-end reads from the Arabidopsis thaliana accessions Columbia-0 (Col-0) and N14. Between 92.4% and 99.5% of all reads were mapped to the reference genome or transcriptome and the raw count distributions obtained from the different mappers were highly correlated. Using the software DESeq2 to determine differential gene expression (DGE) between plants exposed to 20 °C or 4 °C from these read counts showed a large pairwise overlap between the mappers. Interestingly, when the commercial CLC software was used with its own DGE module instead of DESeq2, strongly diverging results were obtained. All tested mappers provided highly similar results for mapping Illumina reads of two polymorphic Arabidopsis accessions to the reference genome or transcriptome and for the determination of DGE when the same software was used for processing.
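The "pairwise overlap between the mappers" mentioned in this abstract can be made concrete with a small sketch. The abstract does not state which overlap measure was used, so the Jaccard index below is an assumption, and the mapper labels and gene identifiers are invented purely for illustration.

```python
from itertools import combinations


def pairwise_overlap(deg_sets):
    """Jaccard index for every pair of differentially expressed gene sets.

    `deg_sets` maps a mapper label to the set of gene IDs called
    differentially expressed from that mapper's counts.
    """
    result = {}
    for a, b in combinations(sorted(deg_sets), 2):
        union = deg_sets[a] | deg_sets[b]
        shared = deg_sets[a] & deg_sets[b]
        # Two empty sets are treated as identical (overlap 1.0).
        result[(a, b)] = len(shared) / len(union) if union else 1.0
    return result


# Hypothetical gene sets from three hypothetical mappers.
example = {
    "mapper_A": {"g1", "g2", "g3"},
    "mapper_B": {"g2", "g3", "g4"},
    "mapper_C": {"g1", "g2", "g3"},
}
for pair, score in pairwise_overlap(example).items():
    print(pair, round(score, 2))
```

A value near 1 for every pair would correspond to the "highly similar results" the abstract reports when the same downstream software is used.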
12
Gaska JM, Parsons L, Balev M, Cirincione A, Wang W, Schwartz RE, Ploss A. Conservation of cell-intrinsic immune responses in diverse nonhuman primate species. Life Sci Alliance 2019; 2:2/5/e201900495. [PMID: 31649152 PMCID: PMC6814850 DOI: 10.26508/lsa.201900495] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/20/2019] [Revised: 10/14/2019] [Accepted: 10/15/2019] [Indexed: 02/03/2023] Open
Abstract
The transcriptomic response of diverse nonhuman primate (NHP) species to poly(I:C) is highly conserved, and this novel RNA sequencing dataset will help improve NHP genome annotations. Differences in immune responses across species can contribute to the varying permissivity of species to the same viral pathogen. Understanding how our closest evolutionary relatives, nonhuman primates (NHPs), confront pathogens and how these responses have evolved over time could shed light on host range barriers, especially for zoonotic infections. Here, we analyzed cell-intrinsic immunity of primary cells from the broadest panel of NHP species interrogated to date, including humans, great apes, and Old and New World monkeys. Our analysis of their transcriptomes after poly(I:C) transfection revealed conservation in the functional consequences of their response. In mapping reads to either the human or the species-specific genomes, we observed that with the current state of NHP annotations, the percent of reads assigned to a genetic feature was largely similar regardless of the method. Together, these data provide a baseline for the cell-intrinsic responses elicited by a potent immune stimulus across multiple NHP donors, including endangered species, and serve as a resource for refining and furthering the existing annotations of NHP genomes.
Affiliation(s)
- Jenna M Gaska
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Lance Parsons
  - Carl Icahn Laboratory, Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Metodi Balev
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Ann Cirincione
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Wei Wang
  - Carl Icahn Laboratory, Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Robert E Schwartz
  - Weill Cornell Medical College, Belfer Research Building, New York, NY, USA
- Alexander Ploss
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ, USA
13
Quinn TP, Erb I, Richardson MF, Crowley TM. Understanding sequencing data as compositions: an outlook and review. Bioinformatics 2019; 34:2870-2878. [PMID: 29608657 PMCID: PMC6084572 DOI: 10.1093/bioinformatics/bty175] [Citation(s) in RCA: 158] [Impact Index Per Article: 31.6] [Received: 10/24/2017] [Accepted: 03/26/2018] [Indexed: 12/30/2022] Open
Abstract
Motivation: Although seldom acknowledged explicitly, count data generated by sequencing platforms exist as compositions for which the abundance of each component (e.g. gene or transcript) is only coherently interpretable relative to other components within that sample. This property arises from the assay technology itself, whereby the number of counts recorded for each sample is constrained by an arbitrary total sum (i.e. library size). Consequently, sequencing data, as compositional data, exist in a non-Euclidean space that, without normalization or transformation, renders invalid many conventional analyses, including distance measures, correlation coefficients and multivariate statistical models. Results: The purpose of this review is to summarize the principles of compositional data analysis (CoDA), provide evidence for why sequencing data are compositional, discuss compositionally valid methods available for analyzing sequencing data, and highlight future directions with regard to this field of study. Supplementary information: Supplementary data are available at Bioinformatics online.
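The compositional argument in this abstract can be illustrated with the centered log-ratio (CLR) transform, a standard tool of compositional data analysis. The abstract itself does not name a specific transform, so choosing CLR here is an assumption, and the toy counts are invented. The sketch shows that two samples with the same relative abundances but different library sizes map to the same CLR vector.

```python
import math


def clr(counts, pseudocount=0.0):
    """Centered log-ratio transform of one sample's counts.

    Each value becomes log(count) minus the mean of the logs, i.e. the log
    of the count relative to the sample's geometric mean. Zero counts need
    a positive `pseudocount` (a common but debated workaround).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Same proportions, tenfold difference in library size (toy numbers).
shallow = [10, 20, 70]
deep = [100, 200, 700]
print([round(v, 3) for v in clr(shallow)])
print([round(v, 3) for v in clr(deep)])
```

Because only ratios between components survive the transform, the arbitrary total (library size) drops out, which is the property the review emphasizes.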
Affiliation(s)
- Thomas P Quinn
- Bioinformatics Core Research Group, Deakin University, Geelong, Australia
- Ionas Erb
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Mark F Richardson
- Bioinformatics Core Research Group, Deakin University, Geelong, Australia; Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Geelong, Australia
- Tamsyn M Crowley
- Bioinformatics Core Research Group, Deakin University, Geelong, Australia; Poultry Hub Australia, University of New England, Armidale, Australia
|
14
|
Lu X, Wen H, Li Q, Wang G, Li P, Chen J, Sun Y, Yang C, Wu F. Comparative analysis of growth performance and liver transcriptome response of juvenile Ancherythroculter nigrocauda fed diets with different protein levels. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY D-GENOMICS & PROTEOMICS 2019; 31:100592. [PMID: 31200228 DOI: 10.1016/j.cbd.2019.05.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/17/2018] [Revised: 04/27/2019] [Accepted: 05/01/2019] [Indexed: 01/16/2023]
Abstract
This study aimed at investigating the effects of dietary protein levels on the growth and liver transcriptome in juvenile Ancherythroculter nigrocauda. Six semi-purified diets were formulated containing 25 (control), 30, 35, 40, 45, and 50% protein. Each diet was fed to three groups of 35 fish (mean initial weight: 5.86 ± 0.10 g) for 56 days. The rate of weight gain and specific growth rate increased with dietary protein levels from 25% to 40%, but remained unchanged when fed with 45 or 50% dietary protein. The feed conversion ratio was significantly influenced by the dietary protein levels, being the lowest in fish fed 40% protein. Illumina RNA-seq analysis was performed to investigate liver gene expression changes under different dietary protein treatments. A total of 367.78 million clean reads were obtained from the six libraries. Compared with 25% protein treatment library, there were 734, 1946, 1755, 2726, and 1523 upregulated genes, and 407, 1882, 1865, 2216 and 1624 downregulated genes in the 30, 35, 40, 45, and 50% protein treatment libraries, respectively. Trend analysis of these differentially expressed genes (DEGs) identified six statistically significant trends. A series of DEGs that related to protein metabolism, growth and development, lipid metabolism and immune and stress response were identified. Moreover, gene ontology enrichment analysis of the DEGs demonstrated that cellular process, single-organism process, metabolic process and biological regulation were the most highly overrepresented biological processes. Kyoto Encyclopedia of Genes and Genomes enrichment analysis revealed that protein processing in endoplasmic reticulum, PPAR signaling pathway, complement and coagulation cascades, and cytochrome P450 (CYP450s) were significantly enriched in the dietary protein treatment groups. 
Furthermore, qPCR results showed excellent agreement with those of RNA-seq for both up- and down-regulated genes (including fasn, accα, SCD, CPT-I, igf1, ST, AST, trdmt1, hsp70, cyp450, MHC-II, C4, tgfβ, ube4b, apoE and abcb7). Thus, our results provide baseline information for feed formulation and nutritional research for A. nigrocauda.
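The growth indices named in this abstract (rate of weight gain, specific growth rate, feed conversion ratio) are not defined there. The sketch below uses definitions that are common in fish nutrition studies, which is an assumption rather than the authors' stated formulas. The initial weight (5.86 g) and trial length (56 days) come from the abstract; the final weight and feed intake are hypothetical placeholders.

```python
import math

def weight_gain_rate(w_initial, w_final):
    """Weight gain as a percentage of initial body weight."""
    return 100.0 * (w_final - w_initial) / w_initial

def specific_growth_rate(w_initial, w_final, days):
    """Specific growth rate in percent per day, from log-transformed weights."""
    return 100.0 * (math.log(w_final) - math.log(w_initial)) / days

def feed_conversion_ratio(feed_intake, w_initial, w_final):
    """Feed consumed per unit of weight gained (lower is more efficient)."""
    gain = w_final - w_initial
    if gain <= 0:
        raise ValueError("feed conversion ratio is undefined without weight gain")
    return feed_intake / gain

initial_weight_g = 5.86   # mean initial weight, from the abstract
trial_days = 56           # trial duration, from the abstract
final_weight_g = 20.0     # hypothetical value for illustration only
feed_intake_g = 18.0      # hypothetical value for illustration only

print(weight_gain_rate(initial_weight_g, final_weight_g))
print(specific_growth_rate(initial_weight_g, final_weight_g, trial_days))
print(feed_conversion_ratio(feed_intake_g, initial_weight_g, final_weight_g))
```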
Affiliation(s)
- Xing Lu
- Fisheries Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan 430207, Hubei, China; Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan 430223, Hubei, China
- Hua Wen
- Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan 430223, Hubei, China
- Qing Li
- Fisheries Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan 430207, Hubei, China; Wuhan Xianfeng Aquaculture Technology Co. Ltd, Wuhan 430207, China
- Guiying Wang
- Fisheries Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan 430207, Hubei, China; Wuhan Xianfeng Aquaculture Technology Co. Ltd, Wuhan 430207, China
- Pei Li
- Fisheries Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan 430207, Hubei, China; Wuhan Xianfeng Aquaculture Technology Co. Ltd, Wuhan 430207, China
- Jian Chen
- Fisheries Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan 430207, Hubei, China; Wuhan Xianfeng Aquaculture Technology Co. Ltd, Wuhan 430207, China
- Yanhong Sun
- Fisheries Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan 430207, Hubei, China; Wuhan Xianfeng Aquaculture Technology Co. Ltd, Wuhan 430207, China
- Changgeng Yang
- Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan 430223, Hubei, China
- Fan Wu
- Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan 430223, Hubei, China
|
15
|
O'Keeffe KR, Jones CD. Challenges and solutions for analysing dual RNA-seq data for non-model host–pathogen systems. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13135] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Affiliation(s)
- Kayleigh R. O'Keeffe
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Corbin D. Jones
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Integrative Program for Biological & Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
16
|
Payá-Milans M, Olmstead JW, Nunez G, Rinehart TA, Staton M. Comprehensive evaluation of RNA-seq analysis pipelines in diploid and polyploid species. Gigascience 2018; 7:5168871. [PMID: 30418578 PMCID: PMC6275443 DOI: 10.1093/gigascience/giy132] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 04/19/2018] [Accepted: 10/21/2018] [Indexed: 11/12/2022] Open
Abstract
Background The usual analysis of RNA sequencing (RNA-seq) reads is based on an existing reference genome and annotated gene models. However, when a reference for the sequenced species is not available, alternatives include using a reference genome from a related species or reconstructing transcript sequences with de novo assembly. In addition, researchers are faced with many options for RNA-seq data processing and limited information on how their decisions will impact the final outcome. Using both a diploid and polyploid species with a distant reference genome, we have tested the influence of different tools at various steps of a typical RNA-seq analysis workflow on the recovery of useful processed data available for downstream analysis. Findings At the preprocessing step, we found error correction has a strong influence on de novo assembly but not on mapping results. After trimming, a greater percentage of reads could be used in downstream analysis by selecting gentle quality trimming performed with Skewer instead of strict quality trimming with Trimmomatic. This availability of reads correlated with size, quality, and completeness of de novo assemblies and with number of mapped reads. When selecting a reference genome from a related species to map reads, outcome was significantly improved when using mapping software tolerant of greater sequence divergence, such as Stampy or GSNAP. Conclusions The selection of bioinformatic software tools for RNA-seq data analysis can maximize quality parameters on de novo assemblies and availability of reads in downstream analysis.
Affiliation(s)
- Miriam Payá-Milans
- Department of Entomology and Plant Pathology, University of Tennessee, 370 PBB, 2505 EJ Chapman Blvd, Knoxville, TN, 37996, United States
- James W Olmstead
- Horticultural Sciences Department, University of Florida, 2550 Hull Rd, PO Box 110690, Gainesville, FL, 32611, United States
- Gerardo Nunez
- Horticultural Sciences Department, University of Florida, 2550 Hull Rd, PO Box 110690, Gainesville, FL, 32611, United States
- Timothy A Rinehart
- Thad Cochran Southern Horticultural Laboratory, USDA-Agricultural Research Service, PO Box 287, Poplarville, MS, 39470, United States; Crop Production and Protection, USDA-Agricultural Research Service, 5601 Sunnyside Ave, Beltsville, MD, 20705, United States
- Margaret Staton
- Department of Entomology and Plant Pathology, University of Tennessee, 370 PBB, 2505 EJ Chapman Blvd, Knoxville, TN, 37996, United States
|
17
|
Quinn TP, Crowley TM, Richardson MF. Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods. BMC Bioinformatics 2018; 19:274. [PMID: 30021534 PMCID: PMC6052553 DOI: 10.1186/s12859-018-2261-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 12/11/2017] [Accepted: 06/25/2018] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Count data generated by next-generation sequencing assays do not measure absolute transcript abundances. Instead, the data are constrained to an arbitrary "library size" by the sequencing depth of the assay, and typically must be normalized prior to statistical analysis. The constrained nature of these data means one could alternatively use a log-ratio transformation in lieu of normalization, as often done when testing for differential abundance (DA) of operational taxonomic units (OTUs) in 16S rRNA data. Therefore, we benchmark how well the ALDEx2 package, a transformation-based DA tool, detects differential expression in high-throughput RNA-sequencing data (RNA-Seq), compared to conventional RNA-Seq methods such as edgeR and DESeq2. RESULTS To evaluate the performance of log-ratio transformation-based tools, we apply the ALDEx2 package to two simulated, and two real, RNA-Seq data sets. One of the latter was previously used to benchmark dozens of conventional RNA-Seq differential expression methods, enabling us to directly compare transformation-based approaches. We show that ALDEx2, widely used in meta-genomics research, identifies differentially expressed genes (and transcripts) from RNA-Seq data with high precision and, given sufficient sample sizes, high recall too (regardless of the alignment and quantification procedure used). Although we show that the choice in log-ratio transformation can affect performance, ALDEx2 has high precision (i.e., few false positives) across all transformations. Finally, we present a novel, iterative log-ratio transformation (now implemented in ALDEx2) that further improves performance in simulations. CONCLUSIONS Our results suggest that log-ratio transformation-based methods can work to measure differential expression from RNA-Seq data, provided that certain assumptions are met. Moreover, these methods have very high precision (i.e., few false positives) in simulations and perform well on real data too. 
With previously demonstrated applicability to 16S rRNA data, ALDEx2 can thus serve as a single tool for data from multiple sequencing modalities.
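As a rough illustration of why counts constrained to a library size need care, the sketch below applies the simplest library-size normalization, counts per million (CPM), to two hypothetical samples. This is not the benchmark from the paper, and edgeR and DESeq2 use more robust scaling factors than plain CPM. The point is only that a change in one gene shifts the normalized values of every other gene.

```python
def counts_per_million(counts):
    """Scale raw counts so that each sample sums to one million."""
    total = sum(counts)
    return [1e6 * c / total for c in counts]

# Hypothetical counts for four genes; only the last gene differs between samples.
control = [100, 100, 100, 100]
treated = [100, 100, 100, 700]

cpm_control = counts_per_million(control)
cpm_treated = counts_per_million(treated)

# The first three genes have identical raw counts in both samples,
# yet their CPM values are lower in the treated sample.
for gene_index, (c, t) in enumerate(zip(cpm_control, cpm_treated)):
    print(gene_index, c, t)
```

Under these toy numbers, the three unchanged genes appear to decrease after normalization, which is the kind of artifact that motivates either robust normalization or a log-ratio transformation.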
Affiliation(s)
- Thomas P. Quinn
- Centre for Molecular and Medical Research, School of Medicine, Deakin University, Geelong, 3220 Australia
- Bioinformatics Core Research Group, Deakin University, Geelong, 3220 Australia
- Tamsyn M. Crowley
- Centre for Molecular and Medical Research, School of Medicine, Deakin University, Geelong, 3220 Australia
- Bioinformatics Core Research Group, Deakin University, Geelong, 3220 Australia
- Poultry Hub Australia, University of New England, Armidale, 2351 Australia
- Mark F. Richardson
- Bioinformatics Core Research Group, Deakin University, Geelong, 3220 Australia
- Centre for Integrative Ecology, School of Life and Environmental Science, Deakin University, Geelong, 3220 Australia
|
18
|
Moskalev AA, Kudryavtseva AV, Graphodatsky AS, Beklemisheva VR, Serdyukova NA, Krutovsky KV, Sharov VV, Kulakovskiy IV, Lando AS, Kasianov AS, Kuzmin DA, Putintseva YA, Feranchuk SI, Shaposhnikov MV, Fraifeld VE, Toren D, Snezhkina AV, Sitnik VV. De novo assembling and primary analysis of genome and transcriptome of gray whale Eschrichtius robustus. BMC Evol Biol 2017; 17:258. [PMID: 29297306 PMCID: PMC5751776 DOI: 10.1186/s12862-017-1103-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022] Open
Abstract
Background: Gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive in the class Cetacea. Gray whale is often described as a “living fossil”. It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will allow further studies of whale evolution, longevity, and resistance to extreme environments. Results: In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. Conclusions: This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of adaptation to hypoxic conditions. Electronic supplementary material: The online version of this article (doi: 10.1186/s12862-017-1103-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alexey A Moskalev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russian Federation; Institute of Biology of Komi Science Center of Ural Branch of RAS, Syktyvkar, 167982, Russian Federation
- Anna V Kudryavtseva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russian Federation
- Alexander S Graphodatsky
- Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, 630090, Russian Federation; Novosibirsk State University, Novosibirsk, 630090, Russian Federation
- Natalya A Serdyukova
- Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, 630090, Russian Federation
- Konstantin V Krutovsky
- Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, Göttingen, 37077, Germany; Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation; Genome Research and Education Center, Siberian Federal University, Krasnoyarsk, 660036, Russian Federation; Department of Ecosystem Science and Management, Texas A&M University, College Station, 77843-2138, TX, USA
- Vadim V Sharov
- Genome Research and Education Center, Siberian Federal University, Krasnoyarsk, 660036, Russian Federation; Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, Krasnoyarsk, 660074, Russian Federation
- Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russian Federation; Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation; Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, 143026, Russia
- Andrey S Lando
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation
- Artem S Kasianov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation; Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, 143026, Russia
- Dmitry A Kuzmin
- Genome Research and Education Center, Siberian Federal University, Krasnoyarsk, 660036, Russian Federation; Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, Krasnoyarsk, 660074, Russian Federation
- Yuliya A Putintseva
- Genome Research and Education Center, Siberian Federal University, Krasnoyarsk, 660036, Russian Federation
- Sergey I Feranchuk
- Genome Research and Education Center, Siberian Federal University, Krasnoyarsk, 660036, Russian Federation; Irkutsk National Research Technical University, Irkutsk, 664074, Russian Federation; Limnological Institute, Siberian Branch of Russian Academy of Sciences, Irkutsk, 664033, Russian Federation
- Mikhail V Shaposhnikov
- Institute of Biology of Komi Science Center of Ural Branch of RAS, Syktyvkar, 167982, Russian Federation
- Vadim E Fraifeld
- The Shraga Segal Department of Microbiology, Immunology and Genetics, Faculty of Health Sciences, Center for Multidisciplinary Research on Aging, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
- Dmitri Toren
- The Shraga Segal Department of Microbiology, Immunology and Genetics, Faculty of Health Sciences, Center for Multidisciplinary Research on Aging, Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel
- Anastasia V Snezhkina
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russian Federation
- Vasily V Sitnik
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, 143026, Russia
|
19
|
Payá-Milans M, Nunez GH, Olmstead JW, Rinehart TA, Staton M. Regulation of gene expression in roots of the pH-sensitive Vaccinium corymbosum and the pH-tolerant Vaccinium arboreum in response to near neutral pH stress using RNA-Seq. BMC Genomics 2017; 18:580. [PMID: 28784085 PMCID: PMC5547544 DOI: 10.1186/s12864-017-3967-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/07/2017] [Accepted: 07/31/2017] [Indexed: 01/19/2023] Open
Abstract
Background: Blueberries are one of the few horticultural crops adapted to grow in acidic soils. Neutral to basic soil pH is detrimental to all commonly cultivated blueberry species, including Vaccinium corymbosum (VC). In contrast, the wild species V. arboreum (VA) is able to tolerate a wider range of soil pH. To assess the molecular mechanisms involved in near neutral pH stress response, plants from pH-sensitive VC (tetraploid) and pH-tolerant VA (diploid) were grown at near neutral pH 6.5 and at the preferred pH of 4.5. Results: Transcriptome sequencing of root RNA was performed for 4 biological replicates per species × pH level combination, for a total of 16 samples. Reads were mapped to the reference genome from diploid V. corymbosum, transforming ~55% of the reads to gene counts. A quasi-likelihood F test identified differential expression due to pH stress in 337 and 4867 genes in VA and VC, respectively. Both species shared regulation of genes involved in nutrient homeostasis and cell wall metabolism. VA and VC exhibited differential regulation of signaling pathways related to abiotic/biotic stress, cellulose and lignin biosynthesis, and nutrient uptake. Conclusions: The specific responses in VA likely facilitate tolerance to higher soil pH. In contrast, the response in VC, despite affecting a greater number of genes, is not effective in overcoming the stress induced by pH. Further inspection of the differentially expressed genes that are specific to VA may provide insight into the mechanisms of tolerance. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3967-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Miriam Payá-Milans
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
- Gerardo H Nunez
- Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- James W Olmstead
- Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Timothy A Rinehart
- Thad Cochran Southern Horticultural Laboratory, USDA-Agricultural Research Service, Poplarville, MS, USA; Crop Production and Protection, USDA-Agricultural Research Service, Beltsville, MD, USA
- Margaret Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
|
20
|
Kasianov AS, Klepikova AV, Kulakovskiy IV, Gerasimov ES, Fedotova AV, Besedina EG, Kondrashov AS, Logacheva MD, Penin AA. High-quality genome assembly of Capsella bursa-pastoris reveals asymmetry of regulatory elements at early stages of polyploid genome evolution. THE PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY 2017; 91:278-291. [PMID: 28387959 DOI: 10.1111/tpj.13563] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 11/17/2016] [Revised: 03/01/2017] [Accepted: 03/31/2017] [Indexed: 05/22/2023]
Abstract
Polyploidization and subsequent sub- and neofunctionalization of duplicated genes represent a major mechanism of plant genome evolution. Capsella bursa-pastoris, a widespread ruderal plant, is a recent allotetraploid and, thus, is an ideal model organism for studying early changes following polyploidization. We constructed a high-quality assembly of C. bursa-pastoris genome and a transcriptome atlas covering a broad sample of organs and developmental stages (available online at http://travadb.org/browse/Species=Cbp). We demonstrate that expression of homeologs is mostly symmetric between subgenomes, and identify a set of homeolog pairs with discordant expression. Comparison of promoters within such pairs revealed emerging asymmetry of regulatory elements. Among them there are multiple binding sites for transcription factors controlling the regulation of photosynthesis and plant development by light (PIF3, HY5) and cold stress response (CBF). These results suggest that polyploidization in C. bursa-pastoris enhanced its plasticity of response to light and temperature, and allowed substantial expansion of its distribution range.
Affiliation(s)
- Artem S Kasianov
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkina str, Moscow, 119333, Russia
- Anna V Klepikova
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, 127051, Russia
- Ivan V Kulakovskiy
- N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkina str, Moscow, 119333, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova 32, Moscow, 119991, Russia
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Building 3, Moscow, 143026, Russia
- Evgeny S Gerasimov
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, 127051, Russia
- Faculty of Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Anna V Fedotova
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Elizaveta G Besedina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Alexey S Kondrashov
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Department of Ecology and Evolution, University of Michigan, 830 North University, Ann Arbor, MI 48109-1048, USA
- Maria D Logacheva
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, 127051, Russia
- Extreme Biology Laboratory, Institute of Fundamental Medicine and Biology, Kazan Federal University, 18 Kremlevskaya str, Kazan, 420008, Russia
- Aleksey A Penin
- A. N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, 127051, Russia
- Faculty of Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
|
21
|
Chen C, Le H, Goudar CT. Evaluation of two public genome references for Chinese hamster ovary cells in the context of RNA-seq based gene expression analysis. Biotechnol Bioeng 2017; 114:1603-1613. [PMID: 28295162 DOI: 10.1002/bit.26290] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/28/2016] [Revised: 02/21/2017] [Accepted: 03/10/2017] [Indexed: 11/08/2022]
Abstract
RNA-Seq is a powerful transcriptomics tool for mammalian cell culture process development. Successful RNA-Seq data analysis requires a high-quality reference for read mapping and gene expression quantification. Currently, there are two public genome references for Chinese hamster ovary (CHO) cells, the predominant mammalian cell line in the biopharmaceutical industry. In this study, we compared these two references by analyzing 60 RNA-Seq samples from a variety of CHO cell culture conditions. Among the 20,891 genes common to both references, we observed that 31.5% show quantification differences of more than 7.1%, implying gene definition differences between the two references. We propose a framework to quantify this difference using two metrics, Consistency and Stringency, which account for the average quantification difference between the two references over all samples, and the sample-specific effect on the quantification result, respectively. These two metrics can be used to identify potential genes for future gene model improvement and to understand the reliability of differentially expressed genes identified by RNA-Seq data analysis. Until a more comprehensive genome reference for CHO cells emerges, the strategy proposed in this study can enable more robust transcriptome analysis from CHO cell RNA-Seq data.
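The comparison described in this abstract, the share of common genes whose quantification differs by more than a threshold between two references, can be sketched as follows. The abstract does not state how the difference is computed, so the symmetric relative difference used here is an assumption, and the gene names and values are hypothetical. The paper's Consistency and Stringency metrics are not reproduced.

```python
def relative_difference(a, b):
    """Symmetric relative difference between two non-negative values, in percent.

    Assumed definition: absolute difference divided by the mean of the two values.
    """
    if a == 0 and b == 0:
        return 0.0
    return 100.0 * abs(a - b) / ((a + b) / 2.0)

def fraction_exceeding(quant_ref1, quant_ref2, threshold_pct):
    """Fraction of genes present in both references whose difference exceeds the threshold."""
    common = quant_ref1.keys() & quant_ref2.keys()
    if not common:
        raise ValueError("the two references share no genes")
    n_over = sum(
        1 for gene in common
        if relative_difference(quant_ref1[gene], quant_ref2[gene]) > threshold_pct
    )
    return n_over / len(common)

# Hypothetical expression values for the same sample quantified against two references.
ref1 = {"geneA": 100.0, "geneB": 50.0, "geneC": 10.0, "geneD": 7.0}
ref2 = {"geneA": 102.0, "geneB": 60.0, "geneC": 10.0, "geneE": 3.0}

share = fraction_exceeding(ref1, ref2, threshold_pct=7.1)
print(share)
```

Genes found in only one reference (geneD and geneE in this toy input) are ignored, mirroring the restriction to common genes in the abstract.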
Affiliation(s)
- Chun Chen
- Drug Substance Technologies, Process Development, Amgen Inc., 1 Amgen Center Drive, Thousand Oaks, California, 91320
- Huong Le
- Drug Substance Technologies, Process Development, Amgen Inc., 1 Amgen Center Drive, Thousand Oaks, California, 91320
- Chetan T Goudar
- Drug Substance Technologies, Process Development, Amgen Inc., 1 Amgen Center Drive, Thousand Oaks, California, 91320
|
22
|
Williams CR, Baccarella A, Parrish JZ, Kim CC. Empirical assessment of analysis workflows for differential expression analysis of human samples using RNA-Seq. BMC Bioinformatics 2017; 18:38. [PMID: 28095772 PMCID: PMC5240434 DOI: 10.1186/s12859-016-1457-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 09/26/2016] [Accepted: 12/31/2016] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND RNA-Seq has supplanted microarrays as the preferred method of transcriptome-wide identification of differentially expressed genes. However, RNA-Seq analysis is still rapidly evolving, with a large number of tools available for each of the three major processing steps: read alignment, expression modeling, and identification of differentially expressed genes. Although some studies have benchmarked these tools against gold standard gene expression sets, few have evaluated their performance in concert with one another. Additionally, there is a general lack of testing of such tools on real-world, physiologically relevant datasets, which often possess qualities not reflected in tightly controlled reference RNA samples or synthetic datasets. RESULTS Here, we evaluate 219 combinatorial implementations of the most commonly used analysis tools for their impact on differential gene expression analysis by RNA-Seq. A test dataset was generated using highly purified human classical and nonclassical monocyte subsets from a clinical cohort, allowing us to evaluate the performance of 495 unique workflows, when accounting for differences in expression units and gene- versus transcript-level estimation. We find that the choice of methodologies leads to wide variation in the number of genes called significant, as well as in performance as gauged by precision and recall, calculated by comparing our RNA-Seq results to those from four previously published microarray and BeadChip analyses of the same cell populations. The method of differential gene expression identification exhibited the strongest impact on performance, with smaller impacts from the choice of read aligner and expression modeler. Many workflows were found to exhibit similar overall performance, but with differences in their calibration, with some biased toward higher precision and others toward higher recall. 
CONCLUSIONS There is significant heterogeneity in the performance of RNA-Seq workflows to identify differentially expressed genes. Among the higher performing workflows, different workflows exhibit a precision/recall tradeoff, and the ultimate choice of workflow should take into consideration how the results will be used in subsequent applications. Our analyses highlight the performance characteristics of these workflows, and the data generated in this study could also serve as a useful resource for future development of software for RNA-Seq analysis.
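The precision/recall tradeoff discussed in this abstract can be made concrete with a minimal sketch. It assumes the usual set-based definitions, with a reference gene list standing in for the previously published microarray and BeadChip results; the gene identifiers are hypothetical and this is not the authors' evaluation code.

```python
def precision_recall(called, reference):
    """Precision and recall of a set of called genes against a reference set."""
    called, reference = set(called), set(reference)
    true_positives = len(called & reference)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical gene calls from one workflow and a hypothetical reference list.
workflow_calls = {"g1", "g2", "g3", "g4"}
reference_genes = {"g2", "g3", "g5"}

precision, recall = precision_recall(workflow_calls, reference_genes)
print(precision, recall)
```

A workflow that calls fewer genes will often raise precision at the cost of recall, and the reverse holds for a more permissive workflow, which is the calibration difference the abstract describes.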
Affiliation(s)
- Claire R Williams
- Department of Biology, University of Washington, Seattle, WA, 98195, USA
- Alyssa Baccarella
- Division of Experimental Medicine, Department of Medicine, University of California, San Francisco, CA, 94143, USA
- Jay Z Parrish
- Department of Biology, University of Washington, Seattle, WA, 98195, USA
- Charles C Kim
- Division of Experimental Medicine, Department of Medicine, University of California, San Francisco, CA, 94143, USA; Present address: Verily, South San Francisco, CA, 94080, USA
|
23
|
Simulation-based comprehensive benchmarking of RNA-seq aligners. Nat Methods 2016; 14:135-139. [PMID: 27941783 DOI: 10.1038/nmeth.4106] [Citation(s) in RCA: 163] [Impact Index Per Article: 20.4] [Received: 04/18/2016] [Accepted: 11/15/2016] [Indexed: 01/27/2023]
Abstract
Alignment is the first step in most RNA-seq analysis pipelines, and the accuracy of downstream analyses depends heavily on it. Unlike most steps in the pipeline, alignment is particularly amenable to benchmarking with simulated data. We performed a comprehensive benchmarking of 14 common splice-aware aligners for base, read, and exon junction-level accuracy and compared default with optimized parameters. We found that performance varied by genome complexity, and accuracy and popularity were poorly correlated. The most widely cited tool underperforms for most metrics, particularly when using default settings.
|
24
|
Pseudo-Reference-Based Assembly of Vertebrate Transcriptomes. Genes (Basel) 2016; 7:genes7030010. [PMID: 26927182 PMCID: PMC4808791 DOI: 10.3390/genes7030010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/24/2015] [Revised: 02/05/2016] [Accepted: 02/17/2016] [Indexed: 11/17/2022] Open
Abstract
High-throughput RNA sequencing (RNA-seq) provides a comprehensive picture of the transcriptome, including the identity, structure, quantity, and variability of expressed transcripts in cells, through the assembly of sequenced short RNA-seq reads. Although the reference-based approach guarantees the high quality of the resulting transcriptome, this approach is only applicable when the relevant reference genome is available. Here, we developed a pseudo-reference-based assembly (PRA) that reconstructs a transcriptome based on a linear regression function of the optimized mapping parameters and genetic distances of the closest species. Using the linear model, we reconstructed transcriptomes of four different avian species (the white leghorn, turkey, duck, and zebra finch) with the Gallus gallus genome as a pseudo-reference, and of three primates (the chimpanzee, gorilla, and macaque) with the human genome as a pseudo-reference. The resulting transcriptomes show that the PRAs outperformed the de novo approach for species within about a 10% mutation rate among orthologous transcriptomes, enough to cover species as distantly related as chicken and duck. Taken together, we suggest that the PRA method can be used as a tool for reconstructing transcriptome maps of vertebrates whose genomes have not yet been sequenced.
|
25
|
Costa ADF, Franco OL. Insights into RNA transcriptome profiling of cardiac tissue in obesity and hypertension conditions. J Cell Physiol 2015; 230:959-68. [PMID: 25393239 DOI: 10.1002/jcp.24807] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 08/14/2014] [Accepted: 09/05/2014] [Indexed: 12/20/2022]
Abstract
Several epidemiologic studies suggest that obesity and hypertension are associated with cardiac transcriptome modifications that could be further associated with inflammatory processes and cardiac hypertrophy. In this field, transcriptome studies have proven important for elucidating the physiologic mechanisms, pathways, and genes involved in many biologic processes. Over the past decade, RNA microarray and RNA-seq analyses have become essential for examining metabolic pathways in terms of mRNA expression in cardiology. This review focuses on cardiac muscle gene expression in response to obesity and hypertension, providing a broad view of the cardiac transcriptome and of the physiologic and biochemical mechanisms involved in the gene expression changes these conditions produce, with emphasis on new technologies for gene expression analysis.
Affiliation(s)
- Alzenira de Fátima Costa
- Universidade Católica de Brasília, Pós-Graduação em Ciências Genômicas e Biotecnologia Centro de Análises Proteômicas e Bioquímicas, Brasília, Brazil
|