1. Lee AJ, Reiter T, Doing G, Oh J, Hogan DA, Greene CS. Using genome-wide expression compendia to study microorganisms. Comput Struct Biotechnol J 2022; 20:4315-4324. [PMID: 36016717 PMCID: PMC9396250 DOI: 10.1016/j.csbj.2022.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 08/07/2022] [Accepted: 08/07/2022] [Indexed: 11/30/2022] Open
Abstract
A gene expression compendium is a heterogeneous collection of gene expression experiments assembled from data collected for diverse purposes. The widely varied experimental conditions and genetic backgrounds across samples create a tremendous opportunity for gaining a systems-level understanding of the transcriptional responses that influence phenotypes. Variety in experimental design is particularly important for studying microbes, where the transcriptional responses integrate many signals and demonstrate plasticity across strains, including responses to which nutrients are available and which microbes are present. Advances in high-throughput measurement technology have made it feasible to construct compendia for many microbes. In this review, we discuss how these compendia are constructed and analyzed to reveal transcriptional patterns.
Affiliation(s)
- Alexandra J. Lee
  - Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA, USA
- Taylor Reiter
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Denver, CO, USA
- Georgia Doing
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Julia Oh
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Deborah A. Hogan
  - Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth, Hanover, NH, USA
- Casey S. Greene
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Denver, CO, USA
2. Inference of Bacterial Small RNA Regulatory Networks and Integration with Transcription Factor-Driven Regulatory Networks. mSystems 2020; 5:5/3/e00057-20. [PMID: 32487739 PMCID: PMC8534726 DOI: 10.1128/msystems.00057-20] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/11/2022] Open
Abstract
Small noncoding RNAs (sRNAs) are key regulators of bacterial gene expression. Through complementary base pairing, sRNAs affect mRNA stability and translation efficiency. Here, we describe a network inference approach designed to identify sRNA-mediated regulation of transcript levels. We use existing transcriptional data sets and prior knowledge to infer sRNA regulons using our network inference tool, the Inferelator. This approach produces genome-wide gene regulatory networks that include contributions by both transcription factors and sRNAs. We show the benefits of estimating and incorporating sRNA activities into network inference pipelines using available experimental data. We also demonstrate how these estimated sRNA regulatory activities can be mined to identify the experimental conditions where sRNAs are most active. We uncover 45 novel experimentally supported sRNA-mRNA interactions in Escherichia coli, outperforming previous network-based efforts. Additionally, our pipeline complements sequence-based sRNA-mRNA interaction prediction methods by adding a data-driven filtering step. Finally, we show the general applicability of our approach by identifying 24 novel, experimentally supported, sRNA-mRNA interactions in Pseudomonas aeruginosa, Staphylococcus aureus, and Bacillus subtilis. Overall, our strategy generates novel insights into the functional context of sRNA regulation in multiple bacterial species.
IMPORTANCE: Individual bacterial genomes can have dozens of small noncoding RNAs with largely unexplored regulatory functions. Although bacterial sRNAs influence a wide range of biological processes, including antibiotic resistance and pathogenicity, our current understanding of sRNA-mediated regulation is far from complete. Most of the available information is restricted to a few well-studied bacterial species; and even in those species, only partial sets of sRNA targets have been characterized in detail. To close this information gap, we developed a computational strategy that takes advantage of available transcriptional data and knowledge about validated and putative sRNA-mRNA interactions for inferring expanded sRNA regulons. Our approach facilitates the identification of experimentally supported novel interactions while filtering out false-positive results. Due to its data-driven nature, our method prioritizes biologically relevant interactions among lists of candidate sRNA-target pairs predicted in silico from sequence analysis or derived from sRNA-mRNA binding experiments.
3. McClure RS, Wendler JP, Adkins JN, Swanstrom J, Baric R, Kaiser BLD, Oxford KL, Waters KM, McDermott JE. Unified feature association networks through integration of transcriptomic and proteomic data. PLoS Comput Biol 2019; 15:e1007241. [PMID: 31527878 PMCID: PMC6748406 DOI: 10.1371/journal.pcbi.1007241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/20/2018] [Accepted: 07/02/2019] [Indexed: 11/18/2022] Open
Abstract
High-throughput multi-omics studies and corresponding network analyses of multi-omic data have rapidly expanded their impact over the last 10 years. As biological features of different types (e.g., transcripts, proteins, metabolites) interact within cellular systems, the greatest amount of knowledge can be gained from networks that incorporate multiple types of -omic data. However, biological and technical sources of variation diminish the ability to detect cross-type associations, yielding networks dominated by communities composed of nodes of the same type. We describe here network building methods that can maximize edges between nodes of different data types, leading to integrated networks, that is, networks that have a large number of edges linking nodes of different -omic types (transcripts, proteins, lipids, etc.). We systematically rank several network inference methods and demonstrate that, in many cases, using a random forest method, GENIE3, produces the most integrated networks. This increase in integration does not come at the cost of accuracy, as GENIE3 produces networks of approximately the same quality as the other network inference methods tested here. Using GENIE3, we also infer networks representing antibody-mediated Dengue virus cell invasion and receptor-mediated Dengue virus invasion. A number of functional pathways showed centrality differences between the two networks, including genes responding to both GM-CSF and IL-4, which had a higher centrality value in the antibody-mediated vs. the receptor-mediated Dengue network. Because a biological system involves the interplay of many different types of molecules, incorporating multiple data types into networks will improve their use as models of biological systems. The methods explored here are some of the first to specifically highlight and address the challenges associated with how such multi-omic networks can be assembled and how the greatest number of interactions can be inferred from different data types. The resulting networks can lead to the discovery of new host response patterns and interactions during viral infection, generate new hypotheses of pathogenic mechanisms and confirm mechanisms of disease.
Affiliation(s)
- Ryan S. McClure
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Jason P. Wendler
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Joshua N. Adkins
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Jesica Swanstrom
  - Department of Microbiology and Immunology, School of Medicine, University of North Carolina, Chapel Hill, Chapel Hill, NC, United States of America
- Ralph Baric
  - Department of Microbiology and Immunology, School of Medicine, University of North Carolina, Chapel Hill, Chapel Hill, NC, United States of America
- Brooke L. Deatherage Kaiser
  - Signatures Science and Technology Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Kristie L. Oxford
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Katrina M. Waters
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
- Jason E. McDermott
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
  - Department of Molecular Microbiology and Immunology, Oregon Health & Sciences University, Portland, OR, United States of America
4. McClure RS, Overall CC, Hill EA, Song HS, Charania M, Bernstein HC, McDermott JE, Beliaev AS. Species-specific transcriptomic network inference of interspecies interactions. The ISME Journal 2018; 12:2011-2023. [PMID: 29795448 PMCID: PMC6052077 DOI: 10.1038/s41396-018-0145-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/17/2017] [Revised: 02/22/2018] [Accepted: 03/26/2018] [Indexed: 12/25/2022]
Abstract
The advent of high-throughput 'omics approaches coupled with computational analyses to reconstruct individual genomes from metagenomes provides a basis for species-resolved functional studies. Here, a mutual information approach was applied to build a gene association network of a commensal consortium, in which a unicellular cyanobacterium, Thermosynechococcus elongatus BP1, supported the heterotrophic growth of Meiothermus ruber strain A. Specifically, we used the context likelihood of relatedness (CLR) algorithm to generate a gene association network from 25 transcriptomic datasets representing distinct growth conditions. The resulting interspecies network revealed a number of linkages between genes in each species. While many of the linkages were supported by the existing knowledge of phototroph-heterotroph interactions and the metabolism of these two species, several new interactions were inferred as well. These include linkages between amino acid synthesis and uptake genes, as well as carbohydrate and vitamin metabolism, terpenoid metabolism and cell adhesion genes. Further topological examination and functional analysis of specific gene associations suggested that the interactions are likely to center around the exchange of energetically costly metabolites between T. elongatus and M. ruber. Both the approach and conclusions derived from this work are widely applicable to microbial communities for identification of the interactions between species and characterization of community functioning as a whole.
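As a rough illustration of the CLR scoring step named in this abstract, the following Python sketch estimates pairwise mutual information with a simple histogram estimator and then scores each gene pair against the background mutual-information distributions of both genes. The function names, the bin count, and the histogram estimator are assumptions made for illustration; they are not taken from the paper, and published CLR implementations may estimate mutual information differently.

```python
import numpy as np


def mutual_information(x, y, bins=5):
    """Histogram (plug-in) estimate of mutual information, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def clr_scores(expr, bins=5):
    """CLR-style scores for a genes x conditions expression matrix.

    Hypothetical helper: returns a symmetric genes x genes matrix in which
    each entry combines the z-scores of a pair's mutual information against
    the background of each of the two genes.
    """
    n = expr.shape[0]
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j], bins)

    # Background per gene: its MI values with all other genes (no diagonal).
    off = ~np.eye(n, dtype=bool)
    mean = np.array([mi[i, off[i]].mean() for i in range(n)])
    std = np.array([mi[i, off[i]].std() for i in range(n)])
    std[std == 0] = 1.0                    # guard against division by zero

    # z[i, j] is the z-score of MI(i, j) within gene i's background,
    # clipped at zero; z.T supplies the same quantity for gene j.
    z = np.maximum(0.0, (mi - mean[:, None]) / std[:, None])
    scores = np.sqrt(z ** 2 + z.T ** 2)
    np.fill_diagonal(scores, 0.0)
    return scores
```

A genes-by-conditions matrix (for example, one column per growth condition) would be passed as `expr`; high-scoring pairs are candidate edges, subject to a threshold that the analyst has to choose.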
Affiliation(s)
- Ryan S McClure
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- Christopher C Overall
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- Eric A Hill
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- Hyun-Seob Song
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- Moiz Charania
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- Hans C Bernstein
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, WA, USA
- Jason E McDermott
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
  - Department of Molecular Microbiology and Immunology, Oregon Health and Sciences University, Portland, OR, USA
- Alexander S Beliaev
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
  - Institute for Future Environments, Queensland University of Technology, Brisbane, Australia
  - Center for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, Australia
5. McClure RS, Overall CC, McDermott JE, Hill EA, Markillie LM, McCue LA, Taylor RC, Ludwig M, Bryant DA, Beliaev AS. Network analysis of transcriptomics expands regulatory landscapes in Synechococcus sp. PCC 7002. Nucleic Acids Res 2016; 44:8810-8825. [PMID: 27568004 PMCID: PMC5062996 DOI: 10.1093/nar/gkw737] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/09/2015] [Accepted: 08/05/2016] [Indexed: 12/29/2022] Open
Abstract
Cyanobacterial regulation of gene expression must contend with a genome organization that lacks apparent functional context, as the majority of cellular processes and metabolic pathways are encoded by genes found at disparate locations across the genome and relatively few transcription factors exist. In this study, global transcript abundance data from the model cyanobacterium Synechococcus sp. PCC 7002 grown under 42 different conditions were analyzed using Context-Likelihood of Relatedness (CLR). The resulting network, organized into 11 modules, provided insight into transcriptional network topology as well as grouping genes by function and linking their response to specific environmental variables. When used in conjunction with genome sequences, the network allowed identification and expansion of novel potential targets of both DNA binding proteins and sRNA regulators. These results offer a new perspective into the multi-level regulation that governs cellular adaptations of the fast-growing, physiologically robust cyanobacterium Synechococcus sp. PCC 7002 to changing environmental variables. This work also provides a methodological high-throughput approach to studying multi-scale regulatory mechanisms that operate in cyanobacteria. Finally, it provides valuable context for integrating systems-level data to enhance gene grouping based on annotated function, especially in organisms where traditional context analyses cannot be implemented due to lack of operon-based functional organization.
Affiliation(s)
- Ryan S McClure
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Christopher C Overall
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Jason E McDermott
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Eric A Hill
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Lye Meng Markillie
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Lee Ann McCue
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Ronald C Taylor
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Marcus Ludwig
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA 16802, USA
- Donald A Bryant
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA 16802, USA
  - Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT 59717, USA
- Alexander S Beliaev
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
6. Patenge N, Pappesch R, Khani A, Kreikemeyer B. Genome-wide analyses of small non-coding RNAs in streptococci. Front Genet 2015; 6:189. [PMID: 26042151 PMCID: PMC4438229 DOI: 10.3389/fgene.2015.00189] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/17/2015] [Accepted: 05/08/2015] [Indexed: 01/01/2023] Open
Abstract
Streptococci represent a diverse group of Gram-positive bacteria, which colonize a wide range of hosts among animals and humans. Streptococcal species occur as commensal as well as pathogenic organisms. Many of the pathogenic species can cause severe, invasive infections in their hosts, leading to high morbidity and mortality. The consequence is tremendous suffering among humans and livestock, in addition to a significant financial burden on the agricultural and healthcare sectors. An environmentally stimulated and tightly controlled expression of virulence factor genes is of fundamental importance for streptococcal pathogenicity. Bacterial small non-coding RNAs (sRNAs) modulate the expression of genes involved in stress response, sugar metabolism, surface composition, and other properties that are related to bacterial virulence. Even though the regulatory character is shared by this class of RNAs, variation on the molecular level results in a high diversity of functional mechanisms. Knowledge about the role of sRNAs in streptococci is still limited, but in recent years, genome-wide screens for sRNAs have been conducted in an increasing number of species. Bioinformatics prediction approaches have been employed, as well as expression analyses by classical array techniques or next-generation sequencing. This review gives an overview of whole-genome screens for sRNAs in streptococci, with a focus on describing the different methods and comparing their outcomes with respect to sRNA conservation among species, functional similarities, and relevance for streptococcal infection.
Affiliation(s)
- Nadja Patenge
  - Institute of Medical Microbiology, Virology, Hygiene and Bacteriology, Rostock University Medical Center, Rostock, Germany
- Roberto Pappesch
  - Institute of Medical Microbiology, Virology, Hygiene and Bacteriology, Rostock University Medical Center, Rostock, Germany
- Afsaneh Khani
  - Institute of Medical Microbiology, Virology, Hygiene and Bacteriology, Rostock University Medical Center, Rostock, Germany
- Bernd Kreikemeyer
  - Institute of Medical Microbiology, Virology, Hygiene and Bacteriology, Rostock University Medical Center, Rostock, Germany
7. Van Puyvelde S, Vanderleyden J, De Keersmaecker SCJ. Experimental approaches to identify small RNAs and their diverse roles in bacteria--what we have learnt in one decade of MicA research. Microbiologyopen 2015; 4:699-711. [PMID: 25974745 PMCID: PMC4618604 DOI: 10.1002/mbo3.263] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/10/2014] [Revised: 04/01/2015] [Accepted: 04/03/2015] [Indexed: 01/12/2023] Open
Abstract
The identification of small RNAs (sRNAs) and the characterization of their roles within regulatory networks now take a prominent place in deciphering complex bacterial phenotypes. Compared to the study of other components of bacterial cells, this is a relatively new but fast-growing research field. Although reports on new sRNAs appear regularly, some sRNAs have been the subject of research for much longer. One such sRNA is MicA, an sRNA best described for its role in outer membrane remodeling, but probably having a much broader function than anticipated. An overview of what we have learnt from MicA led to the conclusion that even for this well-described sRNA, we still do not have the overall picture. More generally, the story of MicA might become an experimental lead for unraveling the many sRNAs with unknown functions. In this review, three important topics in the sRNA field are covered, exemplified from the perspective of MicA: (i) identification of new sRNAs, (ii) target identification and unraveling the biological function, (iii) structural analysis. The complex mechanisms of action of MicA deliver some original insights in the sRNA field, including the existence of dimer formation and of simultaneous cis and trans regulation, and might further inspire the understanding of the function of other sRNAs.
Affiliation(s)
- Sandra Van Puyvelde
  - Centre of Microbial and Plant Genetics, KU Leuven, Kasteelpark Arenberg 20, Heverlee, Belgium
  - Department of Biomedical Sciences, Diagnostic Bacteriology Unit, Institute of Tropical Medicine, Nationalestraat 155, Antwerp, Belgium
- Jozef Vanderleyden
  - Centre of Microbial and Plant Genetics, KU Leuven, Kasteelpark Arenberg 20, Heverlee, Belgium
8. Song HS, McClure RS, Bernstein HC, Overall CC, Hill EA, Beliaev AS. Integrated in silico Analyses of Regulatory and Metabolic Networks of Synechococcus sp. PCC 7002 Reveal Relationships between Gene Centrality and Essentiality. Life (Basel) 2015; 5:1127-40. [PMID: 25826650 PMCID: PMC4500133 DOI: 10.3390/life5021127] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 02/13/2015] [Revised: 03/17/2015] [Accepted: 03/19/2015] [Indexed: 12/22/2022] Open
Abstract
Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as "topologically important." Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as "functionally important" genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
Affiliation(s)
- Hyun-Seob Song
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Ryan S McClure
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Hans C Bernstein
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Christopher C Overall
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Eric A Hill
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Alexander S Beliaev
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA