1
Hu Qian S, Shi MW, Wang DY, Fear JM, Chen L, Tu YX, Liu HS, Zhang Y, Zhang SJ, Yu SS, Oliver B, Chen ZX. Integrating massive RNA-seq data to elucidate transcriptome dynamics in Drosophila melanogaster. Brief Bioinform 2023; 24:bbad177. [PMID: 37232385 PMCID: PMC10505420 DOI: 10.1093/bib/bbad177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 04/19/2023] [Accepted: 04/20/2023] [Indexed: 05/27/2023] Open
Abstract
The volume of ribonucleic acid (RNA)-seq data has increased exponentially, providing numerous new insights into various biological processes. However, due to significant practical challenges, such as data heterogeneity, it is still difficult to ensure the quality of these data when integrated. Although some quality control methods have been developed, sample consistency is rarely considered and these methods are susceptible to artificial factors. Here, we developed MassiveQC, an unsupervised machine learning-based approach, to automatically download and filter large-scale high-throughput data. In addition to the read quality used in other tools, MassiveQC also uses the alignment and expression quality as model features. Meanwhile, it is user-friendly since the cutoff is generated from self-reporting and is applicable to multimodal data. To explore its value, we applied MassiveQC to Drosophila RNA-seq data and generated a comprehensive transcriptome atlas across 28 tissues from embryogenesis to adulthood. We systematically characterized fly gene expression dynamics and found that genes with high expression dynamics were likely to be evolutionarily young and expressed at late developmental stages, exhibiting high nonsynonymous substitution rates and low phenotypic severity, and they were involved in simple regulatory programs. We also discovered that human and Drosophila had strong positive correlations in gene expression in orthologous organs, revealing the great potential of the Drosophila system for studying human development and disease.
Affiliation(s)
- Sheng Hu Qian
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Meng-Wei Shi
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Dan-Yang Wang
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Justin M Fear
  - Section of Developmental Genomics, National Institute of Diabetes and Kidney and Digestive Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Lu Chen
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Yi-Xuan Tu
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Hong-Shan Liu
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Yuan Zhang
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Shuai-Jie Zhang
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Shan-Shan Yu
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
- Brian Oliver
  - Section of Developmental Genomics, National Institute of Diabetes and Kidney and Digestive Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Zhen-Xia Chen
  - Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China
  - Section of Developmental Genomics, National Institute of Diabetes and Kidney and Digestive Diseases, National Institutes of Health, Bethesda, MD 20892, USA
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
  - Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan 430070, China
  - Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Shenzhen 518000, China
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518000, China
2
Cho CY, Kelliher CM, Haase SB. The cell-cycle transcriptional network generates and transmits a pulse of transcription once each cell cycle. Cell Cycle 2019; 18:363-378. [PMID: 30668223 DOI: 10.1080/15384101.2019.1570655] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/30/2022] Open
Abstract
Multiple studies have suggested the critical roles of cyclin-dependent kinases (CDKs) as well as a transcription factor (TF) network in generating the robust cell-cycle transcriptional program. However, the precise mechanisms by which these components function together in the gene regulatory network remain unclear. Here we show that the TF network can generate and transmit a "pulse" of transcription independently of CDK oscillations. The premature firing of the transcriptional pulse is prevented by early G1 inhibitors, including transcriptional corepressors and the E3 ubiquitin ligase complex APC^Cdh1. We demonstrate that G1 cyclin-CDKs facilitate the activation and accumulation of TF proteins in S/G2/M phases through inhibiting G1 transcriptional corepressors (Whi5 and Stb1) and APC^Cdh1, thereby promoting the initiation and propagation of the pulse by the TF network. These findings suggest a unique oscillatory mechanism in which global phase-specific transcription emerges from a pulse-generating network that fires once-and-only-once at the start of the cycle.
Affiliation(s)
- Chun-Yi Cho
  - Department of Biology, Duke University, Durham, NC, USA
- Steven B Haase
  - Department of Biology, Duke University, Durham, NC, USA
3
Jurman G, Filosi M, Visintainer R, Riccadonna S, Furlanello C. Stability in GRN Inference. Methods Mol Biol 2019; 1883:323-346. [PMID: 30547407 DOI: 10.1007/978-1-4939-8882-2_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]
Abstract
Reconstructing a gene regulatory network from one or more sets of omics measurements has been a major task of computational biology in the last 20 years. Despite an overwhelming number of algorithms proposed to solve the network inference problem either in the general scenario or in an ad-hoc tailored situation, assessing the stability of reconstruction is still an uncharted territory and exploratory studies mainly tackled theoretical aspects. We introduce here empirical stability, which is induced by variability of reconstruction as a function of data subsampling. By evaluating differences between networks that are inferred using different subsets of the same data we obtain quantitative indicators of the robustness of the algorithm, of the noise level affecting the data, and, overall, of the reliability of the reconstructed graph. We show that empirical stability can be used whenever no ground truth is available to compute a direct measure of the similarity between the inferred structure and the true network. The main ingredient here is a suite of indicators, called NetSI, providing statistics of distances between graphs generated by a given algorithm fed with different data subsets, where the chosen metric is the Hamming-Ipsen-Mikhailov (HIM) distance evaluating dissimilarity of graph topologies with shared nodes. Operatively, the NetSI family is demonstrated here on synthetic and high-throughput datasets, inferring graphs at different resolution levels (topology, direction, weight), showing how the stability indicators can be effectively used for the quantitative comparison of the stability of different reconstruction algorithms.
Affiliation(s)
- Roberto Visintainer
  - The Microsoft Research - University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
4
Kelliher CM, Foster MW, Motta FC, Deckard A, Soderblom EJ, Moseley MA, Haase SB. Layers of regulation of cell-cycle gene expression in the budding yeast Saccharomyces cerevisiae. Mol Biol Cell 2018; 29:2644-2655. [PMID: 30207828 PMCID: PMC6249835 DOI: 10.1091/mbc.e18-04-0255] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/25/2018] [Revised: 08/30/2018] [Accepted: 09/04/2018] [Indexed: 11/11/2022] Open
Abstract
In the budding yeast Saccharomyces cerevisiae, transcription factors (TFs) regulate the periodic expression of many genes during the cell cycle, including gene products required for progression through cell-cycle events. Experimental evidence coupled with quantitative models suggests that a network of interconnected TFs is capable of regulating periodic genes over the cell cycle. Importantly, these dynamical models were built on transcriptomics data and assumed that TF protein levels and activity are directly correlated with mRNA abundance. To ask whether TF transcripts match protein expression levels as cells progress through the cell cycle, we applied a multiplexed targeted mass spectrometry approach (parallel reaction monitoring) to synchronized populations of cells. We found that protein expression of many TFs and cell-cycle regulators closely followed their respective mRNA transcript dynamics in cycling wild-type cells. Discordant mRNA/protein expression dynamics were also observed for a subset of cell-cycle TFs and for proteins targeted for degradation by E3 ubiquitin ligase complexes such as SCF (Skp1/Cul1/F-box) and APC/C (anaphase-promoting complex/cyclosome). We further profiled mutant cells lacking B-type cyclin/CDK activity (clb1-6), in which oscillations in ubiquitin ligase activity, cyclin/CDKs, and cell-cycle progression are halted. We found that a number of proteins were no longer periodically degraded in clb1-6 mutants compared with wild type, highlighting the importance of posttranscriptional regulation. Finally, the TF complexes responsible for activating G1/S transcription (SBF and MBF) were more constitutively expressed at the protein level than at periodic mRNA expression levels in both wild-type and mutant cells. This comprehensive investigation of cell-cycle regulators reveals that multiple layers of regulation (transcription, protein stability, and proteasome targeting) affect protein expression dynamics during the cell cycle.
Affiliation(s)
- Matthew W. Foster
  - Duke Center for Genomic and Computational Biology, Proteomics and Metabolomics Shared Resource, Durham, NC 27701
- Erik J. Soderblom
  - Duke Center for Genomic and Computational Biology, Proteomics and Metabolomics Shared Resource, Durham, NC 27701
- M. Arthur Moseley
  - Duke Center for Genomic and Computational Biology, Proteomics and Metabolomics Shared Resource, Durham, NC 27701
5
Yamada T, Akimitsu N. Contributions of regulated transcription and mRNA decay to the dynamics of gene expression. Wiley Interdisciplinary Reviews: RNA 2018; 10:e1508. [PMID: 30276972 DOI: 10.1002/wrna.1508] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 04/29/2018] [Revised: 08/06/2018] [Accepted: 08/27/2018] [Indexed: 12/21/2022]
Abstract
Organisms have acquired sophisticated regulatory networks that control gene expression in response to cellular perturbations. Understanding of the mechanisms underlying the coordinated changes in gene expression in response to external and internal stimuli is a fundamental issue in biology. Recent advances in high-throughput technologies have enabled the measurement of diverse biological information, including gene expression levels, kinetics of gene expression, and interactions among gene expression regulatory molecules. By coupling these technologies with quantitative modeling, we can now uncover the biological roles and mechanisms of gene regulation at the system level. This review consists of two parts. First, we focus on the methods using uridine analogs that measure synthesis and decay rates of RNAs, which demonstrate how cells dynamically change the regulation of gene expression in response to both internal and external cues. Second, we discuss the underlying mechanisms of these changes in kinetics, including the functions of transcription factors and RNA-binding proteins. Overall, this review will help to clarify a system-level view of gene expression programs in cells. This article is categorized under: Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs; RNA Turnover and Surveillance > Regulation of RNA Stability; RNA Methods > RNA Analyses in vitro and In Silico.
Affiliation(s)
- Toshimichi Yamada
  - Department of Molecular and Cellular Biochemistry, Meiji Pharmaceutical University, Tokyo, Japan
6
Cho CY, Motta FC, Kelliher CM, Deckard A, Haase SB. Reconciling conflicting models for global control of cell-cycle transcription. Cell Cycle 2017; 16:1965-1978. [PMID: 28934013 PMCID: PMC5638368 DOI: 10.1080/15384101.2017.1367073] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/04/2017] [Accepted: 08/07/2017] [Indexed: 10/18/2022] Open
Abstract
Models for the control of global cell-cycle transcription have advanced from a CDK-APC/C oscillator, to a transcription factor (TF) network, to coupled CDK-APC/C and TF networks. Nonetheless, current models were challenged by a recent study that concluded that the cell-cycle transcriptional program is primarily controlled by a CDK-APC/C oscillator in budding yeast. Here we report an analysis of the transcriptome dynamics in cyclin mutant cells that were not queried in the previous study. We find that B-cyclin oscillation is not essential for control of phase-specific transcription. Using a mathematical model, we demonstrate that the function of network TFs can be retained in the face of significant reductions in transcript levels. Finally, we show that cells arrested at mitotic exit with non-oscillating levels of B-cyclins continue to cycle transcriptionally. Taken together, these findings support a critical role of a TF network and a requirement for CDK activities that need not be periodic.
Affiliation(s)
- Chun-Yi Cho
  - Department of Biology, Duke University, Durham, NC, USA
7
Reverse engineering highlights potential principles of large gene regulatory network design and learning. NPJ Syst Biol Appl 2017. [PMID: 28649444 PMCID: PMC5481436 DOI: 10.1038/s41540-017-0019-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/15/2022] Open
Abstract
Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. Several techniques are presently used to experimentally assay transcription factor-to-target relationships, providing important information about the connections of real gene regulatory networks. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm, we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks, and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future. By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
This work by Carré et al addresses central questions in biology: how are very large gene regulatory networks (GRNs) organized, how do they generate stable gene expression, and can they be learnt using machine learning algorithms? In this work the authors developed an algorithm able to simulate large GRNs. From these networks they simulate stable or oscillating gene expression and highlight some mathematical rules controlling such collective behavior (involving several thousand genes). They discuss the consequent hypotheses concerning the organization of GRNs in real cells. Using this simulation tool, the authors also demonstrate that it is likely possible to computationally learn GRNs from transcriptomic data and prior knowledge on the network (actual known connections obtained from Yeast One Hybrid or ChIP-Seq, for instance). They particularly highlight the crucial importance of the prior-knowledge structure in their capacity to learn large GRNs.
8
Kelliher CM, Haase SB. Connecting virulence pathways to cell-cycle progression in the fungal pathogen Cryptococcus neoformans. Curr Genet 2017; 63:803-811. [PMID: 28265742 PMCID: PMC5605583 DOI: 10.1007/s00294-017-0688-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 02/16/2017] [Revised: 02/22/2017] [Accepted: 02/22/2017] [Indexed: 11/01/2022]
Abstract
Proliferation and host evasion are critical processes to understand at a basic biological level for improving infectious disease treatment options. The human fungal pathogen Cryptococcus neoformans causes fungal meningitis in immunocompromised individuals by proliferating in cerebrospinal fluid. Current antifungal drugs target "virulence factors" for disease, such as components of the cell wall and polysaccharide capsule in C. neoformans. However, mechanistic links between virulence pathways and the cell cycle are not as well studied. Recently, cell-cycle synchronized C. neoformans cells were profiled over time to identify gene expression dynamics (Kelliher et al., PLoS Genet 12(12):e1006453, 2016). Almost 20% of all genes in the C. neoformans genome were periodically activated during the cell cycle in rich media, including 40 genes that have previously been implicated in virulence pathways. Here, we review important findings about cell-cycle-regulated genes in C. neoformans and provide two examples of virulence pathways (chitin synthesis and G-protein coupled receptor signaling) with their putative connections to cell division. We propose that a "comparative functional genomics" approach, leveraging gene expression timing during the cell cycle, orthology to genes in other fungal species, and previous experimental findings, can lead to mechanistic hypotheses connecting the cell cycle to fungal virulence.
Affiliation(s)
- Christina M Kelliher
  - Department of Biology, Duke University, Box 90338, 130 Science Drive, Durham, NC, 27708-0338, USA
- Steven B Haase
  - Department of Biology, Duke University, Box 90338, 130 Science Drive, Durham, NC, 27708-0338, USA
9
Wong PS, Tashiro K, Kuhara S, Aburatani S. Elucidation of the sequential transcriptional activity in Escherichia coli using time-series RNA-seq data. Bioinformation 2017; 13:25-30. [PMID: 28479747 PMCID: PMC5405090 DOI: 10.6026/97320630013025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/26/2016] [Accepted: 01/25/2017] [Indexed: 11/23/2022] Open
Abstract
Functional genomics and gene regulation inference have readily expanded our knowledge and understanding of gene interactions with regard to expression regulation. With the advancement of time-series transcriptome sequencing comes the ability to study sequential changes in the transcriptome. Here, we present a new method to augment regulation networks accumulated in the literature with transcriptome data gathered from time-series experiments to construct a sequential representation of transcription factor activity. We apply our method to a time-series RNA-Seq data set of Escherichia coli as it transitions from growth to stationary phase over five hours, and investigate activity in the gene regulation process by using the correlation between regulatory gene pairs to examine their activity on a dynamic network. We analyse the changes in metabolic activity of the pagP gene and associated transcription factors during phase transition, and visualize the sequential transcriptional activity to describe the change in metabolic pathway activity originating from the pagP transcription factor, phoP. We observe a shift from amino acid and nucleic acid metabolism to energy metabolism during the transition to stationary phase in E. coli.
Affiliation(s)
- Pui Shan Wong
  - Biotechnology Research Institute for Drug Discovery, National Institute of AIST, Tokyo, Japan
- Kosuke Tashiro
  - Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, Fukuoka, Japan
- Satoru Kuhara
  - Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, Fukuoka, Japan
- Sachiyo Aburatani
  - Biotechnology Research Institute for Drug Discovery, National Institute of AIST, Tokyo, Japan
  - Com. Bio Big Data Open Innovation Lab. (CBBD-OIL), National Institute of AIST, Tokyo, Japan
10
Papagiannakis A, Niebel B, Wit EC, Heinemann M. Autonomous Metabolic Oscillations Robustly Gate the Early and Late Cell Cycle. Mol Cell 2016; 65:285-295. [PMID: 27989441 DOI: 10.1016/j.molcel.2016.11.018] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Received: 06/01/2016] [Revised: 09/26/2016] [Accepted: 11/09/2016] [Indexed: 10/20/2022]
Abstract
Eukaryotic cell division is known to be controlled by the cyclin/cyclin dependent kinase (CDK) machinery. However, eukaryotes have evolved prior to CDKs, and cells can divide in the absence of major cyclin/CDK components. We hypothesized that an autonomous metabolic oscillator provides dynamic triggers for cell-cycle initiation and progression. Using microfluidics, cell-cycle reporters, and single-cell metabolite measurements, we found that metabolism of budding yeast is a CDK-independent oscillator that oscillates across different growth conditions, both in synchrony with and also in the absence of the cell cycle. Using environmental perturbations and dynamic single-protein depletion experiments, we found that the metabolic oscillator and the cell cycle form a system of coupled oscillators, with the metabolic oscillator separately gating and maintaining synchrony with the early and late cell cycle. Establishing metabolism as a dynamic component within the cell-cycle network opens new avenues for cell-cycle research and therapeutic interventions for proliferative disorders.
Affiliation(s)
- Alexandros Papagiannakis
  - Molecular Systems Biology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, 9747 AG Groningen, the Netherlands
- Bastian Niebel
  - Molecular Systems Biology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, 9747 AG Groningen, the Netherlands
- Ernst C Wit
  - Probability and Statistics, Johann Bernoulli Institute of Mathematics and Computer Science, University of Groningen, Nijenborgh 9, 9747 AG Groningen, the Netherlands
- Matthias Heinemann
  - Molecular Systems Biology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, 9747 AG Groningen, the Netherlands