1
Novak JK, Gardner JG. Current models in bacterial hemicellulase-encoding gene regulation. Appl Microbiol Biotechnol 2024; 108:39. [PMID: 38175245 PMCID: PMC10766802 DOI: 10.1007/s00253-023-12977-4] [Received: 12/06/2023] [Revised: 12/06/2023] [Accepted: 12/07/2023] [Indexed: 01/05/2024]
Abstract
The discovery and characterization of bacterial carbohydrate-active enzymes is a fundamental component of biotechnology innovation, particularly for renewable fuels and chemicals; however, these studies have increasingly transitioned to exploring the complex regulation required for recalcitrant polysaccharide utilization. This pivot is largely due to the current need to engineer and optimize enzymes for maximal degradation in industrial or biomedical applications. Given the structural simplicity of a single cellulose polymer, and the relatively few enzyme classes required for complete bioconversion, the regulation of cellulases in bacteria has been thoroughly discussed in the literature. However, the diversity of hemicelluloses found in plant biomass and the multitude of carbohydrate-active enzymes required for their deconstruction have resulted in a less comprehensive understanding of bacterial hemicellulase-encoding gene regulation. Here we review the mechanisms of this process and common themes found in the transcriptomic response during plant biomass utilization. By comparing regulatory systems from both Gram-negative and Gram-positive bacteria, as well as drawing parallels to cellulase regulation, our goals are to highlight the shared and distinct features of bacterial hemicellulase-encoding gene regulation and provide a set of guiding questions to improve our understanding of bacterial lignocellulose utilization.
KEY POINTS:
• Canonical regulatory mechanisms for bacterial hemicellulase-encoding gene expression include hybrid two-component systems (HTCS), extracytoplasmic function (ECF)-σ/anti-σ systems, and carbon catabolite repression (CCR).
• Current transcriptomic approaches are increasingly being used to identify hemicellulase-encoding gene regulatory patterns coupled with computational predictions for transcriptional regulators.
• Future work should emphasize genetic approaches to improve systems biology tools available for model bacterial systems and emerging microbes with biotechnology potential. Specifically, optimization of Gram-positive systems will require integration of degradative and fermentative capabilities, while optimization of Gram-negative systems will require bolstering the potency of lignocellulolytic capabilities.
Affiliation(s)
- Jessica K Novak
  - Department of Biological Sciences, University of Maryland - Baltimore County, Baltimore, MD, USA
- Jeffrey G Gardner
  - Department of Biological Sciences, University of Maryland - Baltimore County, Baltimore, MD, USA

2
Dalldorf C, Rychel K, Szubin R, Hefner Y, Patel A, Zielinski DC, Palsson BO. The hallmarks of a tradeoff in transcriptomes that balances stress and growth functions. mSystems 2024; 9:e0030524. [PMID: 38829048 PMCID: PMC11264592 DOI: 10.1128/msystems.00305-24] [Received: 02/29/2024] [Accepted: 04/24/2024] [Indexed: 06/05/2024]
Abstract
Fast growth phenotypes are achieved through optimal transcriptomic allocation, in which cells must balance tradeoffs in resource allocation between diverse functions. One such balance between stress readiness and unbridled growth in E. coli has been termed the fear versus greed (f/g) tradeoff. Two specific RNA polymerase (RNAP) mutations observed in adaptation to fast growth have been previously shown to affect the f/g tradeoff, suggesting that genetic adaptations may be primed to control f/g resource allocation. Here, we conduct a greatly expanded study of the genetic control of the f/g tradeoff across diverse conditions. We introduced 12 RNA polymerase (RNAP) mutations commonly acquired during adaptive laboratory evolution (ALE) and obtained expression profiles of each. We found that these single RNAP mutation strains resulted in large shifts in the f/g tradeoff primarily in the RpoS regulon and ribosomal genes, likely through modifying RNAP-DNA interactions. Two of these mutations additionally caused condition-specific transcriptional adaptations. While this tradeoff was previously characterized by the RpoS regulon and ribosomal expression, we find that the GAD regulon plays an important role in stress readiness and ppGpp in translation activity, expanding the scope of the tradeoff. A phylogenetic analysis found the greed-related genes of the tradeoff present in numerous bacterial species. The results suggest that the f/g tradeoff represents a general principle of transcriptome allocation in bacteria where small genetic changes can result in large phenotypic adaptations to growth conditions. IMPORTANCE To increase growth, E. coli must raise ribosomal content at the expense of non-growth functions. Previous studies have linked RNAP mutations to this transcriptional shift and increased growth but were focused on only two mutations found in the protein's central region. RNAP mutations, however, commonly occur over a large structural range. To explore RNAP mutations' impact, we have introduced 12 RNAP mutations found in laboratory evolution experiments and obtained expression profiles of each. The mutations nearly universally increased growth rates by adjusting said tradeoff away from non-growth functions. In addition to this shift, a few caused condition-specific adaptations. We explored the prevalence of this tradeoff across phylogeny and found it to be a widespread and conserved trend among bacteria.
Affiliation(s)
- Kevin Rychel
  - Department of Bioengineering, University of California San Diego, La Jolla, USA
- Richard Szubin
  - Department of Bioengineering, University of California San Diego, La Jolla, USA
- Ying Hefner
  - Department of Bioengineering, University of California San Diego, La Jolla, USA
- Arjun Patel
  - Department of Bioengineering, University of California San Diego, La Jolla, USA
- Daniel C. Zielinski
  - Department of Bioengineering, University of California San Diego, La Jolla, USA
- Bernhard O. Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, USA
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, USA
  - Department of Pediatrics, University of California San Diego, La Jolla, California, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, California, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark

3
Phaneuf PV, Kim SH, Rychel K, Rode C, Beulig F, Palsson BO, Yang L. Meta-analysis Driven Strain Design for Mitigating Oxidative Stresses Important in Biomanufacturing. ACS Synth Biol 2024; 13:2045-2059. [PMID: 38934464 PMCID: PMC11264330 DOI: 10.1021/acssynbio.3c00572] [Received: 09/14/2023] [Revised: 06/11/2024] [Accepted: 06/11/2024] [Indexed: 06/28/2024]
Abstract
As the availability of data sets increases, meta-analysis leveraging aggregated and interoperable data types is proving valuable. This study leveraged a meta-analysis workflow to identify mutations that could improve robustness to reactive oxygen species (ROS) stresses using an industrially important melatonin production strain as an example. ROS stresses often occur during cultivation and negatively affect strain performance. Cellular response to ROS is also linked to the SOS response and resistance to pH fluctuations, which is important to strain robustness in large-scale biomanufacturing. This work integrated more than 7000 E. coli adaptive laboratory evolution (ALE) mutations across 59 experiments to statistically associate mutated genes to 2 ROS tolerance ALE conditions from 72 unique conditions. Mutant oxyR, fur, iscR, and ygfZ were significantly associated and hypothesized to contribute fitness in ROS stress. Across these genes, 259 total mutations were inspected in conjunction with transcriptomics from 46 iModulon experiments. Ten mutations were chosen for reintroduction based on mutation clustering and coinciding transcriptional changes as evidence of fitness impact. Strains with mutations reintroduced into oxyR, fur, iscR, and ygfZ exhibited increased tolerance to H2O2 and acid stress and reduced SOS response, all of which are related to ROS. Additionally, new evidence was generated toward understanding the function of ygfZ, an uncharacterized gene. This meta-analysis approach utilized aggregated and interoperable multiomics data sets to identify mutations conferring industrially relevant phenotypes with the least drawbacks, describing an approach for data-driven strain engineering to optimize microbial cell factories.
Affiliation(s)
- PV Phaneuf
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, Kongens Lyngby 2800, Denmark
- SH Kim
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, Kongens Lyngby 2800, Denmark
- K Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093-0412, United States
- C Rode
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, Kongens Lyngby 2800, Denmark
- F Beulig
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, Kongens Lyngby 2800, Denmark
- BO Palsson
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, Kongens Lyngby 2800, Denmark
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093-0412, United States
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, California 92093-0021, United States
  - Department of Pediatrics, University of California, San Diego, La Jolla, California 92093-0412, United States
- L Yang
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, Kongens Lyngby 2800, Denmark

4
Shin J, Zielinski DC, Palsson BO. Deciphering nutritional stress responses via knowledge-enriched transcriptomics for microbial engineering. Metab Eng 2024; 84:34-47. [PMID: 38825177 DOI: 10.1016/j.ymben.2024.05.007] [Received: 02/07/2024] [Revised: 03/27/2024] [Accepted: 05/28/2024] [Indexed: 06/04/2024]
Abstract
Understanding diverse bacterial nutritional requirements and responses is foundational in microbial research and biotechnology. In this study, we employed knowledge-enriched transcriptomic analytics to decipher complex stress responses of Vibrio natriegens to supplied nutrients, aiming to enhance microbial engineering efforts. We computed 64 independently modulated gene sets that comprise a quantitative basis for transcriptome dynamics across a comprehensive transcriptomics dataset containing a broad array of nutrient conditions. Our approach led to the i) identification of novel transporter systems for diverse substrates, ii) a detailed understanding of how trace elements affect metabolism and growth, and iii) extensive characterization of nutrient-induced stress responses, including osmotic stress, low glycolytic flux, proteostasis, and altered protein expression. By clarifying the relationship between the acetate-associated regulon and glycolytic flux status of various nutrients, we have showcased its vital role in directing optimal carbon source selection. Our findings offer deep insights into the transcriptional landscape of bacterial nutrition and underscore its significance in tailoring strain engineering strategies, thereby facilitating the development of more efficient and robust microbial systems for biotechnological applications.
Affiliation(s)
- Jongoh Shin
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Daniel C Zielinski
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Bernhard O Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, 2800, Denmark; Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA

5
Patel A, McGrosso D, Hefner Y, Campeau A, Sastry AV, Maurya S, Rychel K, Gonzalez DJ, Palsson BO. Proteome allocation is linked to transcriptional regulation through a modularized transcriptome. Nat Commun 2024; 15:5234. [PMID: 38898010 PMCID: PMC11187210 DOI: 10.1038/s41467-024-49231-y] [Received: 02/22/2023] [Accepted: 05/28/2024] [Indexed: 06/21/2024]
Abstract
It has proved challenging to quantitatively relate the proteome to the transcriptome on a per-gene basis. Recent advances in data analytics have enabled a biologically meaningful modularization of the bacterial transcriptome. We thus investigate whether matched datasets of transcriptomes and proteomes from bacteria under diverse conditions can be modularized in the same way to reveal novel relationships between their compositions. We find that (1) the modules of the proteome and the transcriptome are comprised of a similar list of gene products, (2) the modules in the proteome often represent combinations of modules from the transcriptome, (3) known transcriptional and post-translational regulation is reflected in differences between the two sets of modules, allowing for knowledge-mapping when interpreting module functions, and (4) through statistical modeling, absolute proteome allocation can be inferred from the transcriptome alone. Quantitative and knowledge-based relationships can thus be found at the genome-scale between the proteome and transcriptome in bacteria.
Affiliation(s)
- Arjun Patel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
- Dominic McGrosso
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA, 92093, USA
- Ying Hefner
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
- Anaamika Campeau
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA, 92093, USA
- Anand V Sastry
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
- Svetlana Maurya
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA, 92093, USA
- Kevin Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
- David J Gonzalez
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA, 92093, USA
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093, USA
- Bernhard O Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs, Lyngby, Denmark

6
Schumacher K, Braun D, Kleigrewe K, Jung K. Motility-activating mutations upstream of flhDC reduce acid shock survival of Escherichia coli. Microbiol Spectr 2024; 12:e0054424. [PMID: 38651876 PMCID: PMC11237407 DOI: 10.1128/spectrum.00544-24] [Received: 02/28/2024] [Accepted: 04/03/2024] [Indexed: 04/25/2024]
Abstract
Many neutralophilic bacterial species try to evade acid stress with an escape strategy, which is reflected in the increased expression of genes coding for flagellar components. Extremely acid-tolerant bacteria, such as Escherichia coli, survive the strong acid stress, e.g., in the stomach of vertebrates. Recently, we were able to show that the induction of motility genes in E. coli is strictly dependent on the degree of acid stress, i.e., they are induced under mild acid stress but not under severe acid stress. However, it was not known to what extent fine-tuned expression of motility genes is related to fitness and the ability to survive periods of acid shock. In this study, we demonstrate that the expression of FlhDC, the master regulator of flagellation, is inversely correlated with the acid shock survival of E. coli. We encountered this phenomenon when analyzing mutants from the Keio collection, in which the expression of flhDC was altered by an insertion sequence element. These results suggest a fitness trade-off between acid tolerance and motility. IMPORTANCE Escherichia coli is extremely acid-resistant, which is crucial for survival in the gastrointestinal tract of vertebrates. Recently, we systematically studied the response of E. coli to mild and severe acidic conditions using Ribo-Seq and RNA-Seq. We found that motility genes are induced at pH 5.8 but not at pH 4.4, indicating stress-dependent synthesis of flagellar components. In this study, we demonstrate that motility-activating mutations upstream of flhDC, encoding the master regulator of flagella genes, reduce the ability of E. coli to survive periods of acid shock. Furthermore, we show an inverse correlation between motility and acid survival using a chromosomal isopropyl β-D-thio-galactopyranoside (IPTG)-inducible flhDC promoter and by sampling differentially motile subpopulations from swim agar plates. These results reveal a previously undiscovered trade-off between motility and acid tolerance and suggest a differentiation of E. coli into motile and acid-tolerant subpopulations, driven by the integration of insertion sequence elements.
Affiliation(s)
- Kilian Schumacher
  - Faculty of Biology, Microbiology, Ludwig-Maximilians-Universität München, Martinsried, Germany
- Djanna Braun
  - Faculty of Biology, Microbiology, Ludwig-Maximilians-Universität München, Martinsried, Germany
- Karin Kleigrewe
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich, Freising, Germany
- Kirsten Jung
  - Faculty of Biology, Microbiology, Ludwig-Maximilians-Universität München, Martinsried, Germany

7
Wu S, Zhou H, Chen D, Lu Y, Li Y, Qiao J. Multi-omic analysis tools for microbial metabolites prediction. Brief Bioinform 2024; 25:bbae264. [PMID: 38859767 PMCID: PMC11165163 DOI: 10.1093/bib/bbae264] [Received: 02/03/2024] [Revised: 05/08/2024] [Indexed: 06/12/2024]
Abstract
How to resolve the metabolic dark matter of microorganisms has long been a challenging problem in discovering active molecules. Diverse omics tools have been developed to guide the discovery and characterization of various microbial metabolites, making it increasingly possible to predict the overall metabolites of individual strains. Combining multi-omic analysis tools effectively compensates for the shortcomings of current studies that focus only on a single omics layer or a broad class of metabolites. In this review, we systematically update, categorize, and sort the analysis tools for microbial metabolite prediction from the last five years, and call for multi-omic combinations to understand the metabolic nature of microbes. First, we provide a general survey of updated prediction databases, webservers, and software based on genomics, transcriptomics, proteomics, and metabolomics, respectively. Then, we discuss the necessity of integrating multi-omics data to predict the metabolites of different microbial strains and communities, and stress the value of combining other techniques, such as systems biology methods and data-driven algorithms. Finally, we identify key challenges and trends in developing multi-omic analysis tools for more comprehensive prediction of diverse microbial metabolites that contribute to human health and disease treatment.
Affiliation(s)
- Shengbo Wu
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
- Haonan Zhou
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Danlei Chen
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
- Yutong Lu
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
- Yanni Li
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Key Laboratory of Systems Bioengineering, Ministry of Education (Tianjin University), Tianjin 300072, China
- Jianjun Qiao
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
  - Key Laboratory of Systems Bioengineering, Ministry of Education (Tianjin University), Tianjin 300072, China
  - Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin University, Tianjin 300072, China

8
Kion-Crosby W, Barquist L. Network depth affects inference of gene sets from bacterial transcriptomes using denoising autoencoders. Bioinformatics Advances 2024; 4:vbae066. [PMID: 39027639 PMCID: PMC11256956 DOI: 10.1093/bioadv/vbae066] [Received: 02/01/2024] [Revised: 04/05/2024] [Accepted: 05/02/2024] [Indexed: 07/20/2024]
Abstract
Summary: The increasing number of publicly available bacterial gene expression data sets provides an unprecedented resource for the study of gene regulation in diverse conditions, but emphasizes the need for self-supervised methods for the automated generation of new hypotheses. One approach for inferring coordinated regulation from bacterial expression data is through neural networks known as denoising autoencoders (DAEs) which encode large datasets in a reduced bottleneck layer. We have generalized this application of DAEs to include deep networks and explore the effects of network architecture on gene set inference using deep learning. We developed a DAE-based pipeline to extract gene sets from transcriptomic data in Escherichia coli, validate our method by comparing inferred gene sets with known pathways, and have used this pipeline to explore how the choice of network architecture impacts gene set recovery. We find that increasing network depth leads the DAEs to explain gene expression in terms of fewer, more concisely defined gene sets, and that adjusting the width results in a tradeoff between generalizability and biological inference. Finally, leveraging our understanding of the impact of DAE architecture, we apply our pipeline to an independent uropathogenic E. coli dataset to identify genes uniquely induced during human colonization.
Availability and implementation: https://github.com/BarquistLab/DAE_architecture_exploration.
Affiliation(s)
- Willow Kion-Crosby
  - Helmholtz Institute for RNA-based Infection Research (HIRI)/Helmholtz Centre for Infection Research (HZI), 97080 Würzburg, Germany
  - Faculty of Medicine, University of Würzburg, 97080 Würzburg, Germany
- Lars Barquist
  - Helmholtz Institute for RNA-based Infection Research (HIRI)/Helmholtz Centre for Infection Research (HZI), 97080 Würzburg, Germany
  - Faculty of Medicine, University of Würzburg, 97080 Würzburg, Germany
  - Department of Biology, University of Toronto, Mississauga, ON L5L 1C6, Canada

9
Kim K, Choe D, Kang M, Cho SH, Cho S, Jeong KJ, Palsson B, Cho BK. Serial adaptive laboratory evolution enhances mixed carbon metabolic capacity of Escherichia coli. Metab Eng 2024; 83:160-171. [PMID: 38636729 DOI: 10.1016/j.ymben.2024.04.004] [Received: 01/12/2024] [Revised: 03/31/2024] [Accepted: 04/14/2024] [Indexed: 04/20/2024]
Abstract
Microbes have inherent capacities for utilizing various carbon sources; however, they often exhibit sub-par fitness due to low metabolic efficiency. To test whether a bacterial strain can optimally utilize multiple carbon sources, Escherichia coli was serially evolved in L-lactate and glycerol. This yielded two end-point strains that evolved first in L-lactate then in glycerol, and vice versa. The end-point strains displayed a universal growth advantage on single and mixed adaptive carbon sources, enabled by the concerted action of carbon source specialist and generalist mutants. The combination of just four variants of glpK, ppsA, ydcI, and rph-pyrE accounted for more than 80% of end-point strain fitness. In addition, machine learning analysis revealed a coordinated activity of transcriptional regulators imparting condition-specific regulation of gene expression. The effectiveness of the serial adaptive laboratory evolution (ALE) scheme in bioproduction applications was assessed under single and mixed-carbon culture conditions, in which the serial ALE strain exhibited superior productivity of acetoin compared to ancestral strains. Together, systems-level analysis elucidated the molecular basis of serial evolution, which holds potential utility in bioproduction applications.
Affiliation(s)
- Kangsan Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
- Donghui Choe
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Minjeong Kang
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
- Sang-Hyeok Cho
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
- Suhyung Cho
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
- Ki Jun Jeong
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; Graduate School of Engineering Biology, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
- Bernhard Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA; Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
- Byung-Kwan Cho
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea; Graduate School of Engineering Biology, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea

10
Josephs-Spaulding J, Rajput A, Hefner Y, Szubin R, Balasubramanian A, Li G, Zielinski DC, Jahn L, Sommer M, Phaneuf P, Palsson BO. Reconstructing the transcriptional regulatory network of probiotic L. reuteri is enabled by transcriptomics and machine learning. mSystems 2024; 9:e0125723. [PMID: 38349131 PMCID: PMC10949432 DOI: 10.1128/msystems.01257-23] [Received: 11/27/2023] [Accepted: 01/09/2024] [Indexed: 03/20/2024]
Abstract
Limosilactobacillus reuteri, a probiotic microbe instrumental to human health and sustainable food production, adapts to diverse environmental shifts via dynamic gene expression. We applied the independent component analysis (ICA) to 117 RNA-seq data sets to decode its transcriptional regulatory network (TRN), identifying 35 distinct signals that modulate specific gene sets. Our findings indicate that the ICA provides a qualitative advancement and captures nuanced relationships within gene clusters that other methods may miss. This study uncovers the fundamental properties of L. reuteri's TRN and deepens our understanding of its arginine metabolism and the co-regulation of riboflavin metabolism and fatty acid conversion. It also sheds light on conditions that regulate genes within a specific biosynthetic gene cluster and allows for the speculation of the potential role of isoprenoid biosynthesis in L. reuteri's adaptive response to environmental changes. By integrating transcriptomics and machine learning, we provide a system-level understanding of L. reuteri's response mechanism to environmental fluctuations, thus setting the stage for modeling the probiotic transcriptome for applications in microbial food production. IMPORTANCE We have studied Limosilactobacillus reuteri, a beneficial probiotic microbe that plays a significant role in our health and production of sustainable foods, a type of foods that are nutritionally dense and healthier and have low-carbon emissions compared to traditional foods. Similar to how humans adapt their lifestyles to different environments, this microbe adjusts its behavior by modulating the expression of genes. We applied machine learning to analyze large-scale data sets on how these genes behave across diverse conditions. From this, we identified 35 unique patterns demonstrating how L. reuteri adjusts its genes based on 50 unique environmental conditions (such as various sugars, salts, microbial cocultures, human milk, and fruit juice). This research helps us understand better how L. reuteri functions, especially in processes like breaking down certain nutrients and adapting to stressful changes. More importantly, with our findings, we become closer to using this knowledge to improve how we produce more sustainable and healthier foods with the help of microbes.
Affiliation(s)
- Jonathan Josephs-Spaulding
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Akanksha Rajput
  - Department of Bioengineering, University of California, San Diego, California, USA
- Ying Hefner
  - Department of Bioengineering, University of California, San Diego, California, USA
- Richard Szubin
  - Department of Bioengineering, University of California, San Diego, California, USA
- Gaoyuan Li
  - Department of Bioengineering, University of California, San Diego, California, USA
- Daniel C. Zielinski
  - Department of Bioengineering, University of California, San Diego, California, USA
- Leonie Jahn
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Morten Sommer
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Patrick Phaneuf
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Bernhard O. Palsson
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
  - Department of Bioengineering, University of California, San Diego, California, USA

11
Borchert AJ, Bleem AC, Lim HG, Rychel K, Dooley KD, Kellermyer ZA, Hodges TL, Palsson BO, Beckham GT. Machine learning analysis of RB-TnSeq fitness data predicts functional gene modules in Pseudomonas putida KT2440. mSystems 2024; 9:e0094223. [PMID: 38323821 PMCID: PMC10949508 DOI: 10.1128/msystems.00942-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 01/07/2024] [Indexed: 02/08/2024] Open
Abstract
There is growing interest in engineering Pseudomonas putida KT2440 as a microbial chassis for the conversion of renewable and waste-based feedstocks, and metabolic engineering of P. putida relies on the understanding of the functional relationships between genes. In this work, independent component analysis (ICA) was applied to a compendium of existing fitness data from randomly barcoded transposon insertion sequencing (RB-TnSeq) of P. putida KT2440 grown in 179 unique experimental conditions. ICA identified 84 independent groups of genes, which we call fModules ("functional modules"), where gene members displayed shared functional influence in a specific cellular process. This machine learning-based approach both successfully recapitulated previously characterized functional relationships and established hitherto unknown associations between genes. Selected gene members from fModules for hydroxycinnamate metabolism and stress resistance, acetyl coenzyme A assimilation, and nitrogen metabolism were validated with engineered mutants of P. putida. Additionally, functional gene clusters from ICA of RB-TnSeq data sets were compared with regulatory gene clusters from prior ICA of RNAseq data sets to draw connections between gene regulation and function. Because ICA profiles the functional role of several distinct gene networks simultaneously, it can reduce the time required to annotate gene function relative to manual curation of RB-TnSeq data sets. IMPORTANCE This study demonstrates a rapid, automated approach for elucidating functional modules within complex genetic networks. 
While Pseudomonas putida randomly barcoded transposon insertion sequencing data were used as a proof of concept, this approach is applicable to any organism with existing functional genomics data sets and may serve as a useful tool for many valuable applications, such as guiding metabolic engineering efforts in other microbes or understanding functional relationships between virulence-associated genes in pathogenic microbes. Furthermore, this work demonstrates that comparison of data obtained from independent component analysis of transcriptomics and gene fitness datasets can elucidate regulatory-functional relationships between genes, which may have utility in a variety of applications, such as metabolic modeling, strain engineering, or identification of antimicrobial drug targets.
Affiliation(s)
- Andrew J. Borchert
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
  - Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Alissa C. Bleem
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
  - Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
  - Agile BioFoundry, Emeryville, California, USA
- Hyun Gyu Lim
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
  - Joint BioEnergy Institute, Emeryville, California, USA
  - Department of Biological Engineering, Inha University, Incheon, Korea
- Kevin Rychel
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Keven D. Dooley
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
  - Agile BioFoundry, Emeryville, California, USA
- Zoe A. Kellermyer
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
  - Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Tracy L. Hodges
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
  - Agile BioFoundry, Emeryville, California, USA
- Bernhard O. Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
  - Joint BioEnergy Institute, Emeryville, California, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
  - Department of Pediatrics, University of California, San Diego, California, USA
- Gregg T. Beckham
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
  - Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
  - Agile BioFoundry, Emeryville, California, USA

12
Choe D, Olson CA, Szubin R, Yang H, Sung J, Feist AM, Palsson BO. Advancing the scale of synthetic biology via cross-species transfer of cellular functions enabled by iModulon engraftment. Nat Commun 2024; 15:2356. [PMID: 38490991 PMCID: PMC10943186 DOI: 10.1038/s41467-024-46486-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Accepted: 02/29/2024] [Indexed: 03/18/2024] Open
Abstract
Machine learning applied to large compendia of transcriptomic data has enabled the decomposition of bacterial transcriptomes to identify independently modulated sets of genes; such iModulons represent specific cellular functions. The identification of iModulons enables accurate identification of genes necessary and sufficient for cross-species transfer of cellular functions. We demonstrate cross-species transfer of: 1) the biotransformation of vanillate to protocatechuate, 2) a malonate catabolic pathway, 3) a catabolic pathway for 2,3-butanediol, and 4) an antimicrobial resistance to ampicillin found in multiple Pseudomonas species to Escherichia coli. iModulon-based engineering is a transformative strategy as it includes all genes comprising the transferred cellular function, including genes without functional annotation. Adaptive laboratory evolution was deployed to optimize the cellular function transferred, revealing mutations in the host. Combining big data analytics and laboratory evolution thus enhances the level of understanding of systems biology, and synthetic biology for strain design and development.
Affiliation(s)
- Donghui Choe
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Connor A Olson
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Richard Szubin
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Hannah Yang
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Jaemin Sung
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Adam M Feist
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Bernhard O Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark

13
Wytock TP, Motter AE. Cell reprogramming design by transfer learning of functional transcriptional networks. Proc Natl Acad Sci U S A 2024; 121:e2312942121. [PMID: 38437548 PMCID: PMC10945810 DOI: 10.1073/pnas.2312942121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 01/26/2024] [Indexed: 03/06/2024] Open
Abstract
Recent developments in synthetic biology, next-generation sequencing, and machine learning provide an unprecedented opportunity to rationally design new disease treatments based on measured responses to gene perturbations and drugs to reprogram cells. The main challenges to seizing this opportunity are the incomplete knowledge of the cellular network and the combinatorial explosion of possible interventions, both of which are insurmountable by experiments. To address these challenges, we develop a transfer learning approach to control cell behavior that is pre-trained on transcriptomic data associated with human cell fates, thereby generating a model of the network dynamics that can be transferred to specific reprogramming goals. The approach combines transcriptional responses to gene perturbations to minimize the difference between a given pair of initial and target transcriptional states. We demonstrate our approach's versatility by applying it to a microarray dataset comprising >9,000 microarrays across 54 cell types and 227 unique perturbations, and an RNASeq dataset consisting of >10,000 sequencing runs across 36 cell types and 138 perturbations. Our approach reproduces known reprogramming protocols with an AUROC of 0.91 while innovating over existing methods by pre-training an adaptable model that can be tailored to specific reprogramming transitions. We show that the number of gene perturbations required to steer from one fate to another increases with decreasing developmental relatedness and that fewer genes are needed to progress along developmental paths than to regress. These findings establish a proof-of-concept for our approach to computationally design control strategies and provide insights into how gene regulatory networks govern phenotype.
Affiliation(s)
- Thomas P. Wytock
  - Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208
  - Center for Network Dynamics, Northwestern University, Evanston, IL 60208
- Adilson E. Motter
  - Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208
  - Center for Network Dynamics, Northwestern University, Evanston, IL 60208
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208
  - National Institute for Theory and Mathematics in Biology, Evanston, IL 60208

14
Wytock TP, Motter AE. Cell reprogramming design by transfer learning of functional transcriptional networks. ARXIV 2024:arXiv:2403.04837v1. [PMID: 38495570 PMCID: PMC10942484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Recent developments in synthetic biology, next-generation sequencing, and machine learning provide an unprecedented opportunity to rationally design new disease treatments based on measured responses to gene perturbations and drugs to reprogram cell behavior. The main challenges to seizing this opportunity are the incomplete knowledge of the cellular network and the combinatorial explosion of possible interventions, both of which are insurmountable by experiments. To address these challenges, we develop a transfer learning approach to control cell behavior that is pre-trained on transcriptomic data associated with human cell fates to generate a model of the functional network dynamics that can be transferred to specific reprogramming goals. The approach additively combines transcriptional responses to gene perturbations (single-gene knockdowns and overexpressions) to minimize the transcriptional difference between a given pair of initial and target states. We demonstrate the flexibility of our approach by applying it to a microarray dataset comprising over 9,000 microarrays across 54 cell types and 227 unique perturbations, and an RNASeq dataset consisting of over 10,000 sequencing runs across 36 cell types and 138 perturbations. Our approach reproduces known reprogramming protocols with an average AUROC of 0.91 while innovating over existing methods by pre-training an adaptable model that can be tailored to specific reprogramming transitions. We show that the number of gene perturbations required to steer from one fate to another increases as the developmental relatedness decreases. We also show that fewer genes are needed to progress along developmental paths than to regress. Together, these findings establish a proof-of-concept for our approach to computationally design control strategies and demonstrate their ability to provide insights into the dynamics of gene regulatory networks.
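The additive step described in this abstract (summing single-gene perturbation responses so that an initial state moves toward a target state) can be illustrated with a small sketch. This is not the authors' transfer learning model: the greedy selection rule, the function names, and the toy response vectors below are all assumptions made purely for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_perturbations(initial, target, responses, max_steps=3):
    """Greedily pick perturbations whose additive responses shrink the distance.

    `responses` maps a hypothetical perturbation name to the expression change
    it is assumed to cause. Each perturbation is used at most once.
    """
    state = list(initial)
    remaining = dict(responses)
    chosen = []
    best = squared_distance(state, target)
    for _ in range(max_steps):
        candidate = None
        for name, delta in remaining.items():
            trial = [s + d for s, d in zip(state, delta)]
            dist = squared_distance(trial, target)
            if dist < best:  # keep only strict improvements
                best, candidate = dist, name
        if candidate is None:  # no remaining perturbation helps
            break
        delta = remaining.pop(candidate)
        state = [s + d for s, d in zip(state, delta)]
        chosen.append(candidate)
    return chosen, state, best


# Toy example: two-gene states and three made-up perturbations.
toy_responses = {"kd_A": [1, 0], "oe_B": [0, 1], "kd_C": [-1, 0]}
chosen, final_state, residual = greedy_perturbations([0, 0], [1, 1], toy_responses)
print(chosen, final_state, residual)
```

A greedy search is used here only because it is short and easy to follow; the abstract does not state which optimization the authors use.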
Affiliation(s)
- Thomas P Wytock
  - Department of Physics and Astronomy, Northwestern University, Evanston, Illinois 60208, USA
  - Center for Network Dynamics, Northwestern University, Evanston, Illinois 60208, USA
- Adilson E Motter
  - Department of Physics and Astronomy, Northwestern University, Evanston, Illinois 60208, USA
  - Center for Network Dynamics, Northwestern University, Evanston, Illinois 60208, USA
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, Illinois 60208, USA
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois 60208, USA
  - National Institute for Theory and Mathematics in Biology, Evanston, Illinois 60208, USA

15
Hoffman T, Kinne J, Cho KH. Pro-SMP finder-A systematic approach for discovering small membrane proteins in prokaryotes. PLoS One 2024; 19:e0299169. [PMID: 38422081 PMCID: PMC10903887 DOI: 10.1371/journal.pone.0299169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 02/05/2024] [Indexed: 03/02/2024] Open
Abstract
Prokaryotic chromosomes contain numerous small open reading frames (ORFs) of less than 200 bases. Since high-throughput proteomics methods often miss proteins containing fewer than 60 amino acids, it is difficult to discern whether they encode proteins. Recent studies have revealed that many small proteins are membrane proteins with a single membrane-anchoring α-helix. As membrane anchoring or transmembrane motifs are accurately identifiable with high confidence using computational algorithms like Phobius and TMHMM, small membrane proteins (SMPs) can be predicted with high accuracy. This study employed a systematic approach, utilizing well-verified algorithms such as Orfipy, Phobius, and Blast to identify SMPs in prokaryotic organisms. Our main search parameters targeted candidate SMPs with an open reading frame between 60-180 nucleotides, a membrane-anchoring or transmembrane region between 15 and 30 amino acids long, and sequence conservation among other microorganisms. Our findings indicate that each prokaryote possesses many SMPs, with some identified in the intergenic regions of currently annotated chromosomes. More extensively studied microorganisms, such as Escherichia coli and Bacillus subtilis, have more SMPs identified in their genomes compared to less studied microorganisms, suggesting the possibility of undiscovered SMPs in less studied microorganisms. In this study, we describe the common SMPs identified across various microorganisms and explore their biological roles. We have also developed a software pipeline and an accompanying online interface for discovering SMPs (http://cs.indstate.edu/pro-smp-finder). This resource aims to assist researchers in identifying new SMPs encoded in microbial genomes of interest.
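The first search parameter named in this abstract, an open reading frame of 60-180 nucleotides, can be sketched in a few lines. The published pipeline relies on Orfipy, Phobius, and Blast; the stand-alone scanner below is only a simplified stand-in. It assumes forward-strand scanning only, ATG as the sole start codon, and an ORF length that counts both the start and the stop codon. The function name `find_small_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(seq, min_nt=60, max_nt=180):
    """Return (start, end, orf) tuples for forward-strand ORFs in the size window.

    Assumptions (not taken from the paper): ATG is the only start codon, the
    reverse strand is ignored, and length includes the start and stop codons.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon to the first in-frame stop codon.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no stop codon left in this frame
            end = j + 3
            if min_nt <= end - i <= max_nt:
                hits.append((i, end, seq[i:end]))
            i = end  # resume scanning after the stop codon
    return hits


# Toy sequence: a 66-nt ORF (ATG + 20 codons + TAA) with short flanks.
toy = "CC" + "ATG" + "GCT" * 20 + "TAA" + "GG"
for start, end, orf in find_small_orfs(toy):
    print(start, end, len(orf))
```

Candidates found this way would still need the transmembrane-region and conservation filters described in the abstract, which this sketch does not attempt.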
Affiliation(s)
- Tara Hoffman
  - Department of Math and Computer Science, Indiana State University, Terre Haute, Indiana, United States of America
- Jeff Kinne
  - Department of Math and Computer Science, Indiana State University, Terre Haute, Indiana, United States of America
- Kyu Hong Cho
  - Department of Biology, Indiana State University, Terre Haute, Indiana, United States of America

16
Kim K, Choe D, Cho S, Palsson B, Cho BK. Reduction-to-synthesis: the dominant approach to genome-scale synthetic biology. Trends Biotechnol 2024:S0167-7799(24)00037-4. [PMID: 38423803 DOI: 10.1016/j.tibtech.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 02/09/2024] [Accepted: 02/12/2024] [Indexed: 03/02/2024]
Abstract
Advances in systems and synthetic biology have propelled the construction of reduced bacterial genomes. Genome reduction was initially focused on exploring properties of minimal genomes, but more recently it has been deployed as an engineering strategy to enhance strain performance. This review provides the latest updates on reduced genomes, focusing on dual-track approaches of top-down reduction and bottom-up synthesis for their construction. Using cases from studies that are based on established industrial workhorse strains, we discuss the construction of a series of synthetic phenotypes that are candidates for biotechnological applications. Finally, we address the possible uses of reduced genomes for biotechnological applications and the needed future research directions that may ultimately lead to the total synthesis of rationally designed genomes.
Affiliation(s)
- Kangsan Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
  - KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Donghui Choe
  - Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
- Suhyung Cho
  - KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Bernhard Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Kongens, Lyngby, Denmark
- Byung-Kwan Cho
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
  - KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
  - Graduate School of Engineering Biology, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea

17
Menon ND, Poudel S, Sastry AV, Rychel K, Szubin R, Dillon N, Tsunemoto H, Hirose Y, Nair BG, Kumar GB, Palsson BO, Nizet V. Independent component analysis reveals 49 independently modulated gene sets within the global transcriptional regulatory architecture of multidrug-resistant Acinetobacter baumannii. mSystems 2024; 9:e0060623. [PMID: 38189271 PMCID: PMC10878099 DOI: 10.1128/msystems.00606-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Accepted: 11/29/2023] [Indexed: 01/09/2024] Open
Abstract
Acinetobacter baumannii causes severe infections in humans, resists multiple antibiotics, and survives in stressful environmental conditions due to modulations of its complex transcriptional regulatory network (TRN). Unfortunately, our global understanding of the TRN in this emerging opportunistic pathogen is limited. Here, we apply independent component analysis, an unsupervised machine learning method, to a compendium of 139 RNA-seq data sets of three multidrug-resistant A. baumannii international clonal complex I strains (AB5075, AYE, and AB0057). This analysis allows us to define 49 independently modulated gene sets, which we call iModulons. Analysis of the identified A. baumannii iModulons reveals validating parallels to previously defined biological operons/regulons and provides a framework for defining unknown regulons. By utilizing the iModulons, we uncover potential mechanisms for a RpoS-independent general stress response, define global stress-virulence trade-offs, and identify conditions that may induce plasmid-borne multidrug resistance. The iModulons provide a model of the TRN that emphasizes the importance of transcriptional regulation of virulence phenotypes in A. baumannii. Furthermore, they suggest the possibility of future interventions to guide gene expression toward diminished pathogenic potential. IMPORTANCE The rise in hospital outbreaks of multidrug-resistant Acinetobacter baumannii infections underscores the urgent need for alternatives to traditional broad-spectrum antibiotic therapies. The success of A. baumannii as a significant nosocomial pathogen is largely attributed to its ability to resist antibiotics and survive environmental stressors. However, there is limited literature available on the global, complex regulatory circuitry that shapes these phenotypes. Computational tools that can assist in the elucidation of A. baumannii's transcriptional regulatory network architecture can provide much-needed context for a comprehensive understanding of pathogenesis and virulence, as well as for the development of targeted therapies that modulate these pathways.
Affiliation(s)
- Nitasha D. Menon
  - School of Biotechnology, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
  - Division of Host-Microbe Systems and Therapeutics, Department of Pediatrics, University of California, San Diego, La Jolla, California, USA
- Saugat Poudel
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Anand V. Sastry
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Kevin Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Richard Szubin
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Nicholas Dillon
  - Division of Host-Microbe Systems and Therapeutics, Department of Pediatrics, University of California, San Diego, La Jolla, California, USA
  - Department of Biological Sciences, University of Texas at Dallas, Dallas, Texas, USA
- Hannah Tsunemoto
  - Division of Biological Sciences, University of California, San Diego, La Jolla, California, USA
- Yujiro Hirose
  - Division of Host-Microbe Systems and Therapeutics, Department of Pediatrics, University of California, San Diego, La Jolla, California, USA
  - Department of Microbiology, Graduate School of Dentistry, Osaka University, Suita, Osaka, Japan
- Bipin G. Nair
  - School of Biotechnology, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
- Geetha B. Kumar
  - School of Biotechnology, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
- Bernhard O. Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Victor Nizet
  - Division of Host-Microbe Systems and Therapeutics, Department of Pediatrics, University of California, San Diego, La Jolla, California, USA
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA

18
Qiu S, Huang Y, Liang S, Zeng H, Yang A. Systematic elucidation of independently modulated genes in Lactiplantibacillus plantarum reveals a trade-off between secondary and primary metabolism. Microb Biotechnol 2024; 17:e14425. [PMID: 38393514 PMCID: PMC10886434 DOI: 10.1111/1751-7915.14425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 02/02/2024] [Indexed: 02/25/2024] Open
Abstract
Lactiplantibacillus plantarum is a probiotic bacterium widely used in food and health industries, but its gene regulatory information is limited in existing databases, which impedes the research of its physiology and its applications. To obtain a better understanding of the transcriptional regulatory network of L. plantarum, independent component analysis of its transcriptomes was used to derive 45 sets of independently modulated genes (iModulons). Those iModulons were annotated for associated transcription factors and functional pathways, and active iModulons in response to different growth conditions were identified and characterized in detail. Eventually, the analysis of iModulon activities reveals a trade-off between regulatory activities of secondary and primary metabolism in L. plantarum.
Affiliation(s)
- Sizhe Qiu
  - Department of Engineering Science, University of Oxford, Oxford, UK
  - School of Food and Health, Beijing Technology and Business University, Beijing, China
- Yidi Huang
  - School of Computer Science and Engineering, Beihang University, Beijing, China
- Shishun Liang
  - Department of Life Science, Imperial College London, London, UK
- Hong Zeng
  - School of Food and Health, Beijing Technology and Business University, Beijing, China
- Aidong Yang
  - Department of Engineering Science, University of Oxford, Oxford, UK

19
Huang Y, Wipat A, Bacardit J. Transcriptional biomarker discovery toward building a load stress reporting system for engineered Escherichia coli strains. Biotechnol Bioeng 2024; 121:355-365. [PMID: 37807718 PMCID: PMC10953381 DOI: 10.1002/bit.28567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 09/15/2023] [Accepted: 09/25/2023] [Indexed: 10/10/2023]
Abstract
Foreign proteins are produced by introducing synthetic constructs into host bacteria for biotechnology applications. This process can cause resource competition between synthetic circuits and host cells, placing a metabolic burden on the host cells which may result in load stress and detrimental physiological changes. Consequently, the host bacteria can experience slow growth, and the synthetic system may suffer from suboptimal function. To help in the detection of bacterial load stress, we developed machine-learning strategies to select a minimal number of genes that could serve as biomarkers for the design of load stress reporters. We identified pairs of biomarkers that showed discriminative capacity to detect the load stress states induced in 41 engineered Escherichia coli strains.
Affiliation(s)
- Yiming Huang
  - Interdisciplinary Computing and Complex BioSystems Group, Newcastle University, Newcastle upon Tyne, UK
- Anil Wipat
  - Interdisciplinary Computing and Complex BioSystems Group, Newcastle University, Newcastle upon Tyne, UK
- Jaume Bacardit
  - Interdisciplinary Computing and Complex BioSystems Group, Newcastle University, Newcastle upon Tyne, UK

20
Qiu S, Wan X, Liang Y, Lamoureux CR, Akbari A, Palsson BO, Zielinski DC. Inferred regulons are consistent with regulator binding sequences in E. coli. PLoS Comput Biol 2024; 20:e1011824. [PMID: 38252668 PMCID: PMC10833566 DOI: 10.1371/journal.pcbi.1011824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 02/01/2024] [Accepted: 01/12/2024] [Indexed: 01/24/2024] Open
Abstract
The transcriptional regulatory network (TRN) of E. coli consists of thousands of interactions between regulators and DNA sequences. Regulons are typically determined either from resource-intensive experimental measurement of functional binding sites, or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has been shown to be a powerful method for inferring bacterial regulons. However, it remains unclear to what extent regulons predicted by ICA structure have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in E. coli based on promoter sequence features. Models were constructed successfully (cross-validation AUROC ≥ 0.8) for 85% (40/47) of ICA-inferred E. coli regulons. We found that: 1) the presence of a high scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% of regulons (28/47); and 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons as well as genes in the E. coli core pan-regulon of Fur. This work demonstrates that the structure of ICA-inferred regulons largely can be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.
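The success criterion quoted in this abstract is an AUROC of at least 0.8. As a reminder of what that number measures, the sketch below computes AUROC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The function name and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels (1 = gene belongs to the regulon).
    scores: model scores in the same order, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores for four promoters.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
value = auroc(toy_labels, toy_scores)
print(value, value >= 0.8)
```

The pairwise form is quadratic in the number of examples, which is acceptable for a small illustration; rank-based implementations are the usual choice for large data sets.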
Affiliation(s)
- Sizhe Qiu
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America
- Xinlong Wan
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America
- Yueshan Liang
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America
- Cameron R. Lamoureux
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America
- Amir Akbari
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America
- Bernhard O. Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
- Daniel C. Zielinski
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, United States of America

21
Steach H, Viswanath S, He Y, Zhang X, Ivanova N, Hirn M, Perlmutter M, Krishnaswamy S. Inferring Metabolic States from Single Cell Transcriptomic Data via Geometric Deep Learning. bioRxiv 2023:2023.12.05.570153. [PMID: 38105974 PMCID: PMC10723270 DOI: 10.1101/2023.12.05.570153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The ability to measure gene expression at single-cell resolution has elevated our understanding of how biological features emerge from complex and interdependent networks at molecular, cellular, and tissue scales. As technologies have evolved that complement scRNAseq measurements with things like single-cell proteomic, epigenomic, and genomic information, it becomes increasingly apparent how much biology exists as a product of multimodal regulation. Biological processes such as transcription, translation, and post-translational or epigenetic modification impose both energetic and specific molecular demands on a cell and are therefore implicitly constrained by the metabolic state of the cell. While metabolomics is crucial for defining a holistic model of any biological process, the chemical heterogeneity of the metabolome makes it particularly difficult to measure, and technologies capable of doing this at single-cell resolution are far behind other multiomics modalities. To address these challenges, we present GEFMAP (Gene Expression-based Flux Mapping and Metabolic Pathway Prediction), a method based on geometric deep learning for predicting flux through reactions in a global metabolic network using transcriptomics data, which we ultimately apply to scRNAseq. GEFMAP leverages the natural graph structure of metabolic networks to learn both a biological objective for each cell and estimate a mass-balanced relative flux rate for each reaction in each cell using novel deep learning models.
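The "mass-balanced" constraint mentioned above can be illustrated with a toy check. This is only a sketch of the steady-state condition S·v = 0 on a made-up three-reaction pathway; it is not GEFMAP itself, and the stoichiometric matrix, flux vectors, and the helper name `net_production` are all assumptions for illustration.

```python
def net_production(S, v):
    """Return S @ v: the net production rate of each metabolite (one per row of S)."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


# Toy linear pathway (hypothetical): uptake -> A, A -> B, B -> export.
# Rows are metabolites A and B; columns are the three reactions.
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by A -> B
    [0, 1, -1],   # B: produced by A -> B, consumed by export
]

balanced = [2.0, 2.0, 2.0]     # every metabolite is consumed as fast as it is made
unbalanced = [2.0, 1.0, 1.0]   # uptake outpaces conversion, so A accumulates

print(net_production(S, balanced))    # [0.0, 0.0]
print(net_production(S, unbalanced))  # [1.0, 0.0]
```

A flux estimate is mass-balanced when every entry of the returned vector is zero (or within a small tolerance for real-valued predictions).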
|
22
|
Kulakowski S, Banerjee D, Scown CD, Mukhopadhyay A. Improving microbial bioproduction under low-oxygen conditions. Curr Opin Biotechnol 2023; 84:103016. [PMID: 37924688 DOI: 10.1016/j.copbio.2023.103016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 09/17/2023] [Accepted: 10/07/2023] [Indexed: 11/06/2023]
Abstract
Microbial bioconversion provides access to a wide range of sustainably produced chemicals and commodities. However, industrial-scale bioproduction process operations are preferred to be anaerobic due to the cost associated with oxygen transfer. Anaerobic bioconversion generally offers limited substrate utilization profiles, lower product yields, and reduced final product diversity compared with aerobic processes. Bioproduction under conditions of reduced oxygen can overcome the limitations of fully aerobic and anaerobic bioprocesses, but many microbial hosts are not developed for low-oxygen bioproduction. Here, we describe advances in microbial strain engineering involving the use of redox cofactor engineering, genome-scale metabolic modeling, and functional genomics to enable improved bioproduction processes under low oxygen and provide a viable path for scaling these bioproduction systems to industrial scales.
Affiliation(s)
- Shawn Kulakowski
  - Joint BioEnergy Institute, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Deepanwita Banerjee
  - Joint BioEnergy Institute, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Corinne D Scown
  - Joint BioEnergy Institute, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Energy Analysis and Environmental Impacts Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Aindrila Mukhopadhyay
  - Joint BioEnergy Institute, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
|
23
|
Hanke P, Parrello B, Vasieva O, Akins C, Chlenski P, Babnigg G, Henry C, Foflonker F, Brettin T, Antonopoulos D, Stevens R, Fonstein M. Engineering of increased L-Threonine production in bacteria by combinatorial cloning and machine learning. Metab Eng Commun 2023; 17:e00225. [PMID: 37435441 PMCID: PMC10331477 DOI: 10.1016/j.mec.2023.e00225] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/07/2023] [Revised: 06/02/2023] [Accepted: 06/03/2023] [Indexed: 07/13/2023]
Abstract
The goal of this study is to develop a general strategy for bacterial engineering using an integrated synthetic biology and machine learning (ML) approach. This strategy was developed in the context of increasing L-threonine production in Escherichia coli ATCC 21277. A set of 16 genes was initially selected based on metabolic pathway relevance to threonine biosynthesis and used for combinatorial cloning to construct a set of 385 strains to generate training data (i.e., a range of L-threonine titers linked to each of the specific gene combinations). Hybrid (regression/classification) deep learning (DL) models were developed and used to predict additional gene combinations in subsequent rounds of combinatorial cloning for increased L-threonine production based on the training data. As a result, E. coli strains built after just three rounds of iterative combinatorial cloning and model prediction generated higher L-threonine titers (from 2.7 g/L to 8.4 g/L) than those of patented L-threonine strains being used as controls (4-5 g/L). Interesting combinations of genes in L-threonine production included deletions of the tdh, metL, dapA, and dhaM genes as well as overexpression of the pntAB, ppc, and aspC genes. Mechanistic analysis of the metabolic system constraints for the best performing constructs offers ways to improve the models by adjusting weights for specific gene combinations. Graph theory analysis of pairwise gene modifications and corresponding levels of L-threonine production also suggests additional rules that can be incorporated into future ML models.
Affiliation(s)
- Paul Hanke
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
- Bruce Parrello
  - University of Chicago, 5801 S. Ellis Ave, Chicago, IL, 60637, USA
- Olga Vasieva
  - BSMI, 1818 Skokie Blvd., #201, Northbrook, IL, 60062, USA
- Chase Akins
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
- Philippe Chlenski
  - Department of Computer Science, Columbia University, New York, NY, 10027, USA
- Gyorgy Babnigg
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
- Chris Henry
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
- Fatima Foflonker
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
- Thomas Brettin
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
- Rick Stevens
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
  - University of Chicago, 5801 S. Ellis Ave, Chicago, IL, 60637, USA
- Michael Fonstein
  - Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL, 60439, USA
|
24
|
Miano A, Rychel K, Lezia A, Sastry A, Palsson B, Hasty J. High-resolution temporal profiling of E. coli transcriptional response. Nat Commun 2023; 14:7606. [PMID: 37993418 PMCID: PMC10665441 DOI: 10.1038/s41467-023-43173-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Accepted: 11/02/2023] [Indexed: 11/24/2023]
Abstract
Understanding how cells dynamically adapt to their environment is a primary focus of biology research. Temporal information about cellular behavior is often limited by both small numbers of data time-points and the methods used to analyze this data. Here, we apply unsupervised machine learning to a data set containing the activity of 1805 native promoters in E. coli measured every 10 minutes in a high-throughput microfluidic device via fluorescence time-lapse microscopy. Specifically, this data set reveals E. coli transcriptome dynamics when exposed to different heavy metal ions. We use a bioinformatics pipeline based on Independent Component Analysis (ICA) to generate insights and hypotheses from this data. We discovered three primary, time-dependent stages of promoter activation to heavy metal stress (fast, intermediate, and steady). Furthermore, we uncovered a global strategy E. coli uses to reallocate resources from stress-related promoters to growth-related promoters following exposure to heavy metal stress.
Affiliation(s)
- Arianna Miano
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
- Kevin Rychel
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
- Andrew Lezia
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
- Anand Sastry
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
- Bernhard Palsson
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
- Jeff Hasty
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
  - Department of Molecular Biology, School of Biological Sciences, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
  - Synthetic Biology Institute, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, USA
|
25
|
Parthiban S, Vijeesh T, Gayathri T, Shanmugaraj B, Sharma A, Sathishkumar R. Artificial intelligence-driven systems engineering for next-generation plant-derived biopharmaceuticals. Front Plant Sci 2023; 14:1252166. [PMID: 38034587 PMCID: PMC10684705 DOI: 10.3389/fpls.2023.1252166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 10/17/2023] [Indexed: 12/02/2023]
Abstract
Recombinant biopharmaceuticals, including antigens, antibodies, hormones, cytokines, single-chain variable fragments, and peptides, have been used as vaccines, diagnostics, and therapeutics. Plant molecular pharming is a robust platform that uses plants as an expression system to produce simple and complex recombinant biopharmaceuticals on a large scale. Plant systems have several advantages over other host systems, such as humanized expression, glycosylation, scalability, reduced risk of human or animal pathogenic contaminants, and rapid, cost-effective production. Despite these advantages, the expression of recombinant proteins in plant systems is hindered by factors such as non-human post-translational modifications, protein misfolding, conformational changes, and instability. Artificial intelligence (AI) plays a vital role in various fields of biotechnology, and in plant molecular pharming a significant increase in yield and stability can be achieved by applying AI-based approaches to overcome these hindrances. Current limitations of plant-based recombinant biopharmaceutical production can be circumvented with the aid of synthetic biology tools and AI algorithms in plant-based glycan engineering for protein folding, stability, viability, catalytic activity, and organelle targeting. AI models, including but not limited to neural networks, support vector machines, linear regression, Gaussian processes, and regressor ensembles, learn from training and experimental data sets to design and validate protein structures, thereby optimizing properties such as thermostability, catalytic activity, antibody affinity, and protein folding. This review focuses on integrating systems engineering approaches with AI-based machine learning and deep learning algorithms in protein engineering and host engineering to augment protein production in plant systems and meet the ever-expanding therapeutics market.
Affiliation(s)
- Subramanian Parthiban
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Thandarvalli Vijeesh
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Thashanamoorthi Gayathri
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Balamurugan Shanmugaraj
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Ashutosh Sharma
  - Tecnologico de Monterrey, School of Engineering and Sciences, Centre of Bioengineering, Queretaro, Mexico
- Ramalingam Sathishkumar
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
|
26
|
Zhao J, Sun X, Mao Z, Zheng Y, Geng Z, Zhang Y, Ma H, Wang Z. Independent component analysis of Corynebacterium glutamicum transcriptomes reveals its transcriptional regulatory network. Microbiol Res 2023; 276:127485. [PMID: 37683565 DOI: 10.1016/j.micres.2023.127485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 08/28/2023] [Accepted: 08/29/2023] [Indexed: 09/10/2023]
Abstract
Gene expression in bacteria is regulated by multiple transcription factors. Clarifying the regulation mechanism of gene expression is necessary to understand bacterial physiological activities. To further understand the structure of the transcriptional regulatory network of Corynebacterium glutamicum, we applied independent component analysis, an unsupervised machine learning algorithm, to the high-quality C. glutamicum gene expression profile which includes 263 samples from 29 independent projects. We obtained 87 robust independent regulatory modules (iModulons). These iModulons explain 76.7% of the variance in the expression profile and constitute the quantitative transcriptional regulatory network of C. glutamicum. By analyzing the constituent genes in iModulons, we identified potential targets for 20 transcription factors. We also captured the changes in iModulon activities under different growth rates and dissolved oxygen concentrations, demonstrating the ability of iModulons to comprehensively interpret transcriptional responses to environmental changes. In summary, this study provides a genome-scale quantitative transcriptional regulatory network for C. glutamicum and informs future research on complex changes in the transcriptome.
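The "explained variance" figure quoted above can be read as how well the iModulon decomposition reconstructs the expression matrix. The sketch below assumes the usual ICA factorization X ≈ M·A (gene weights M, condition activities A) and one common definition, 1 − SS_residual / SS_total, with X taken to be expressed relative to a reference condition. The exact formula used by the authors is not given in the abstract, the matrices are toy values, and `explained_variance` is a hypothetical helper name.

```python
def explained_variance(X, M, A):
    """Fraction of variance in X captured by the reconstruction M @ A.

    X: genes x samples expression matrix (assumed already centered).
    M: genes x components weights; A: components x samples activities.
    """
    n_genes, n_samples, k = len(X), len(X[0]), len(A)
    recon = [
        [sum(M[g][c] * A[c][s] for c in range(k)) for s in range(n_samples)]
        for g in range(n_genes)
    ]
    ss_res = sum(
        (X[g][s] - recon[g][s]) ** 2
        for g in range(n_genes)
        for s in range(n_samples)
    )
    ss_tot = sum(x ** 2 for row in X for x in row)
    return 1 - ss_res / ss_tot


# Toy data: two genes, two samples, one component that captures only gene 0.
X = [[2.0, 0.0], [0.0, 1.0]]
M = [[1.0], [0.0]]
A = [[2.0, 0.0]]
print(round(explained_variance(X, M, A), 3))  # 0.8
```

Here the single component reproduces gene 0 exactly and misses gene 1, leaving a residual of 1 out of a total sum of squares of 5.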
Affiliation(s)
- Jianxiao Zhao
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China; National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
- Xi Sun
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Zhitao Mao
  - Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China; National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
- Yangyang Zheng
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Zhouxiao Geng
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Yuhan Zhang
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Hongwu Ma
  - Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China; National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
- Zhiwen Wang
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
|
27
|
Lamoureux CR, Decker KT, Sastry AV, Rychel K, Gao Y, McConn J, Zielinski D, Palsson BO. A multi-scale expression and regulation knowledge base for Escherichia coli. Nucleic Acids Res 2023; 51:10176-10193. [PMID: 37713610 PMCID: PMC10602906 DOI: 10.1093/nar/gkad750] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/22/2023] [Revised: 08/02/2023] [Accepted: 09/05/2023] [Indexed: 09/17/2023]
Abstract
Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated global expression patterns. We used machine learning to extract 201 modules that account for 86% of known regulatory interactions, creating the regulatory component. With these modules, we identified two novel regulons and quantified systems-level regulatory responses. We also integrated 1675 curated, publicly-available transcriptomes into the resource. We demonstrated workflows for analyzing new data against this knowledge base via deconstruction of regulation during aerobic transition. This resource illuminates the E. coli transcriptome at scale and provides a blueprint for top-down transcriptomic analysis of non-model organisms.
Affiliation(s)
- Cameron R Lamoureux
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Katherine T Decker
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Anand V Sastry
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Kevin Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Ye Gao
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- John Luke McConn
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Daniel C Zielinski
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Bernhard O Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
|
28
|
Bajpe H, Rychel K, Lamoureux CR, Sastry AV, Palsson BO. Machine learning uncovers the Pseudomonas syringae transcriptome in microbial communities and during infection. mSystems 2023; 8:e0043723. [PMID: 37638727 PMCID: PMC10654099 DOI: 10.1128/msystems.00437-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 07/19/2023] [Indexed: 08/29/2023]
Abstract
IMPORTANCE Pseudomonas syringae pv. tomato DC3000 is a model plant pathogen that infects tomatoes and Arabidopsis thaliana. The current understanding of global transcriptional regulation in the pathogen is limited. Here, we applied iModulon analysis to a compendium of RNA-seq data to unravel its transcriptional regulatory network. We characterize each co-regulated gene set, revealing the activity of major regulators across diverse conditions. We provide new insights on the transcriptional dynamics in interactions with the plant immune system and with other bacterial species, such as AlgU-dependent regulation of flagellar genes during plant infection and downregulation of siderophore production in the presence of a siderophore cheater. This study demonstrates the novel application of iModulons in studying temporal dynamics during host-pathogen and microbe-microbe interactions, and reveals specific insights of interest.
Affiliation(s)
- Heera Bajpe
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Kevin Rychel
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Cameron R. Lamoureux
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Anand V. Sastry
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Bernhard O. Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
  - Department of Pediatrics, University of California San Diego, La Jolla, California, USA
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, California, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, California, USA
  - Novo Nordisk Foundation Center for Biosustainability, Kongens Lyngby, Denmark
|
29
|
Kim K, Kang M, Cho BK. Systems and synthetic biology-driven engineering of live bacterial therapeutics. Front Bioeng Biotechnol 2023; 11:1267378. [PMID: 37929193 PMCID: PMC10620806 DOI: 10.3389/fbioe.2023.1267378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 10/09/2023] [Indexed: 11/07/2023]
Abstract
The past decade has seen growing interest in bacterial engineering for therapeutically relevant applications. While early efforts focused on repurposing genetically tractable model strains, such as Escherichia coli, engineering gut commensals is gaining traction owing to their innate capacity to survive and stably propagate in the intestine for an extended duration. Although limited genetic tractability has been a major roadblock, recent advances in systems and synthetic biology have unlocked our ability to effectively harness native gut commensals for therapeutic and diagnostic purposes, ranging from the rational design of synthetic microbial consortia to the construction of synthetic cells that execute "sense-and-respond" logic operations that allow real-time detection and therapeutic payload delivery in response to specific signals in the intestine. In this review, we outline the current progress and latest updates on microbial therapeutics, with particular emphasis on gut commensal engineering driven by synthetic biology and systems understanding of their molecular phenotypes. Finally, the challenges and prospects of engineering gut commensals for therapeutic applications are discussed.
Affiliation(s)
- Kangsan Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Minjeong Kang
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Byung-Kwan Cho
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  - Graduate School of Engineering Biology, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
|
30
|
Rychel K, Tan J, Patel A, Lamoureux C, Hefner Y, Szubin R, Johnsen J, Mohamed ETT, Phaneuf PV, Anand A, Olson CA, Park JH, Sastry AV, Yang L, Feist AM, Palsson BO. Laboratory evolution, transcriptomics, and modeling reveal mechanisms of paraquat tolerance. Cell Rep 2023; 42:113105. [PMID: 37713311 PMCID: PMC10591938 DOI: 10.1016/j.celrep.2023.113105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 07/09/2023] [Accepted: 08/23/2023] [Indexed: 09/17/2023]
Abstract
Relationships between the genome, transcriptome, and metabolome underlie all evolved phenotypes. However, it has proved difficult to elucidate these relationships because of the high number of variables measured. A recently developed data analytic method for characterizing the transcriptome can simplify interpretation by grouping genes into independently modulated sets (iModulons). Here, we demonstrate how iModulons reveal deep understanding of the effects of causal mutations and metabolic rewiring. We use adaptive laboratory evolution to generate E. coli strains that tolerate high levels of the redox cycling compound paraquat, which produces reactive oxygen species (ROS). We combine resequencing, iModulons, and metabolic models to elucidate six interacting stress-tolerance mechanisms: (1) modification of transport, (2) activation of ROS stress responses, (3) use of ROS-sensitive iron regulation, (4) motility, (5) broad transcriptional reallocation toward growth, and (6) metabolic rewiring to decrease NADH production. This work thus demonstrates the power of iModulon knowledge mapping for evolution analysis.
Affiliation(s)
- Kevin Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Justin Tan
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Arjun Patel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Cameron Lamoureux
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Ying Hefner
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Richard Szubin
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Josefin Johnsen
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
- Elsayed Tharwat Tolba Mohamed
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
- Patrick V Phaneuf
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
- Amitesh Anand
  - Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai, Maharashtra, India
- Connor A Olson
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Joon Ho Park
  - Department of Chemical Engineering, Massachusetts Institute of Technology, 500 Main Street, Building 76, Cambridge, MA 02139, USA
- Anand V Sastry
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Laurence Yang
  - Department of Chemical Engineering, Queen's University, Kingston, ON K7L 3N6, Canada
- Adam M Feist
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
- Bernhard O Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
|
31
|
Han Y, Li W, Filko A, Li J, Zhang F. Genome-wide promoter responses to CRISPR perturbations of regulators reveal regulatory networks in Escherichia coli. Nat Commun 2023; 14:5757. [PMID: 37717013 PMCID: PMC10505187 DOI: 10.1038/s41467-023-41572-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/07/2022] [Accepted: 09/08/2023] [Indexed: 09/18/2023]
Abstract
Elucidating genome-scale regulatory networks requires a comprehensive collection of gene expression profiles, yet measuring gene expression responses for every transcription factor (TF)-gene pair in living prokaryotic cells remains challenging. Here, we develop pooled promoter responses to TF perturbation sequencing (PPTP-seq) via CRISPR interference to address this challenge. Using PPTP-seq, we systematically measure the activity of 1372 Escherichia coli promoters under single knockdown of 183 TF genes, illustrating more than 200,000 possible TF-gene responses in one experiment. We perform PPTP-seq for E. coli growing in three different media. The PPTP-seq data reveal robust steady-state promoter activities under most single TF knockdown conditions. PPTP-seq also enables the identification of, to the best of our knowledge, previously unknown TF autoregulatory responses and complex transcriptional control of one-carbon metabolism. We further find context-dependent promoter regulation by multiple TFs whose relative binding strengths determine promoter activities. Additionally, PPTP-seq reveals different promoter responses in different growth media, suggesting condition-specific gene regulation. Overall, PPTP-seq provides a powerful method to examine genome-wide transcriptional regulatory networks and can potentially be expanded to reveal gene expression responses to other genetic elements.
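The "more than 200,000" figure follows from the two counts given in the abstract. A quick check, assuming every promoter is paired with every TF knockdown (the abstract's "possible" responses; variable names are illustrative):

```python
n_promoters = 1372       # E. coli promoters measured, from the abstract
n_tf_knockdowns = 183    # single TF knockdowns, from the abstract

# Every promoter paired with every knockdown gives the space of possible responses.
possible_pairs = n_promoters * n_tf_knockdowns
print(possible_pairs)  # 251076
```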
Affiliation(s)
- Yichao Han
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
| | - Wanji Li
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
| | - Alden Filko
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
| | - Jingyao Li
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
| | - Fuzhong Zhang
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA.
- Division of Biological and Biomedical Sciences, Washington University in St. Louis, Saint Louis, Missouri, USA.
- Institute of Materials Science and Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA.
| |
|
32
|
Sandberg TE, Wise KS, Dalldorf C, Szubin R, Feist AM, Glass JI, Palsson BO. Adaptive evolution of a minimal organism with a synthetic genome. iScience 2023; 26:107500. [PMID: 37636038 PMCID: PMC10448532 DOI: 10.1016/j.isci.2023.107500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 02/28/2023] [Accepted: 07/26/2023] [Indexed: 08/29/2023]
Abstract
The bacterial strain JCVI-syn3.0 stands as the first example of a living organism with a minimized synthetic genome, derived from the Mycoplasma mycoides genome and chemically synthesized in vitro. Here, we report the experimental evolution of a syn3.0-derived strain. Ten independent replicates were evolved for several hundred generations, leading to growth rate improvements of >15%. Endpoint strains possessed an average of 8 mutations composed of indels and SNPs, with a pronounced C/G->A/T transversion bias. Multiple genes were repeated mutational targets across the independent lineages, including phase variable lipoprotein activation, 5 distinct nonsynonymous substitutions in the same membrane transporter protein, and inactivation of an uncharacterized gene. Transcriptomic analysis revealed an overall tradeoff reflected in upregulated ribosomal proteins and downregulated DNA and RNA related proteins during adaptation. This work establishes the suitability of synthetic, minimal strains for laboratory evolution, providing a means to optimize strain growth characteristics and elucidate gene functionality.
Affiliation(s)
- Troy E. Sandberg
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
| | - Kim S. Wise
- J. Craig Venter Institute, San Diego, La Jolla, CA, USA
| | - Christopher Dalldorf
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
| | - Richard Szubin
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
| | - Adam M. Feist
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kongens Lyngby, Denmark
| | - John I. Glass
- J. Craig Venter Institute, San Diego, La Jolla, CA, USA
| | - Bernhard O. Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, University of California San Diego, La Jolla, CA 92093, USA
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kongens Lyngby, Denmark
| |
|
33
|
Johnson MM, Hockenberry AJ, McGuffie MJ, Vieira LC, Wilke CO. Growth-dependent Gene Expression Variation Influences the Strength of Codon Usage Biases. Mol Biol Evol 2023; 40:msad189. [PMID: 37619989 PMCID: PMC10482319 DOI: 10.1093/molbev/msad189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 08/11/2023] [Indexed: 08/26/2023]
Abstract
The most highly expressed genes in microbial genomes tend to use a limited set of synonymous codons, often referred to as "preferred codons." The existence of preferred codons is commonly attributed to selection pressures on various aspects of protein translation including accuracy and/or speed. However, gene expression is condition-dependent and even within single-celled organisms transcript and protein abundances can vary depending on a variety of environmental and other factors. Here, we show that growth rate-dependent expression variation is an important constraint that significantly influences the evolution of gene sequences. Using large-scale transcriptomic and proteomic data sets in Escherichia coli and Saccharomyces cerevisiae, we confirm that codon usage biases are strongly associated with gene expression but highlight that this relationship is most pronounced when gene expression measurements are taken during rapid growth conditions. Specifically, genes whose relative expression increases during periods of rapid growth have stronger codon usage biases than comparably expressed genes whose expression decreases during rapid growth conditions. These findings highlight that gene expression measured in any particular condition tells only part of the story regarding the forces shaping the evolution of microbial gene sequences. More generally, our results imply that microbial physiology during rapid growth is critical for explaining long-term translational constraints.
Affiliation(s)
- Mackenzie M Johnson
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
| | - Adam J Hockenberry
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
| | - Matthew J McGuffie
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX, USA
| | - Luiz Carlos Vieira
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
| | - Claus O Wilke
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
| |
|
34
|
Gao Y, Poudel S, Seif Y, Shen Z, Palsson BO. Elucidating the CodY regulon in Staphylococcus aureus USA300 substrains TCH1516 and LAC. mSystems 2023; 8:e0027923. [PMID: 37310465 PMCID: PMC10470025 DOI: 10.1128/msystems.00279-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 04/25/2023] [Indexed: 06/14/2023]
Abstract
CodY is a conserved broad-acting transcription factor that regulates the expression of genes related to amino acid metabolism and virulence in Gram-positive bacteria. Here, we performed the first in vivo determination of CodY target genes using a novel CodY monoclonal antibody in methicillin-resistant Staphylococcus aureus (MRSA) USA300. Our results showed (i) the same 135 CodY promoter binding sites regulating the 165 target genes identified in two closely related virulent S. aureus USA300 TCH1516 and LAC strains; (ii) the differential binding intensity for the same target genes under the same conditions was due to sequence differences in the same CodY-binding site in the two strains; (iii) a CodY regulon comprising 72 target genes that are differentially regulated relative to a CodY deletion strain, representing genes that are mainly involved in amino acid transport and metabolism, inorganic ion transport and metabolism, transcription and translation, and virulence, all based on transcriptomic data; and (iv) CodY systematically regulated central metabolic flux to generate branched-chain amino acids (BCAAs) by mapping the CodY regulon onto a genome-scale metabolic model of S. aureus. Our study performed the first system-level analysis of CodY in two closely related USA300 TCH1516 and LAC strains, revealing new insights into the similarities and differences of CodY regulatory roles between the closely related strains. IMPORTANCE With the increasing availability of whole-genome sequences for many strains within the same pathogenic species, a comparative analysis of key regulators is needed to understand how the different strains uniquely coordinate metabolism and expression of virulence. To successfully infect the human host, Staphylococcus aureus USA300 relies on the transcription factor CodY to reorganize metabolism and express virulence factors. While CodY is a known key transcription factor, its target genes are not characterized on a genome-wide basis. 
We performed a comparative analysis to describe the transcriptional regulation of CodY between two dominant USA300 strains. This study motivates the characterization of common pathogenic strains and an evaluation of the possibility of developing specialized treatments for major strains circulating in the population.
Affiliation(s)
- Ye Gao
- Department of Biological Sciences, University of California San Diego, La Jolla, California, USA
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - Saugat Poudel
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - Yara Seif
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - Zeyang Shen
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - Bernhard O. Palsson
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Department of Pediatrics, University of California San Diego, La Jolla, California, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, California, USA
- Novo Nordisk Foundation Center for Biosustainability, Kongens Lyngby, Denmark
| |
|
35
|
Gao ZP, Gu WC, Li J, Qiu QT, Ma BG. Independent Component Analysis Reveals the Transcriptional Regulatory Modules in Bradyrhizobium diazoefficiens USDA110. Int J Mol Sci 2023; 24:12544. [PMID: 37628727 PMCID: PMC10454721 DOI: 10.3390/ijms241612544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 07/30/2023] [Accepted: 08/05/2023] [Indexed: 08/27/2023]
Abstract
The dynamic adaptation of bacteria to environmental changes is achieved through the coordinated expression of many genes, which constitutes a transcriptional regulatory network (TRN). Bradyrhizobium diazoefficiens USDA110 is an important model strain for the study of symbiotic nitrogen fixation (SNF), and its SNF ability largely depends on the TRN. In this study, independent component analysis was applied to 226 high-quality gene expression profiles of B. diazoefficiens USDA110 microarray datasets, from which 64 iModulons were identified. Using these iModulons and their condition-specific activity levels, we (1) provided new insights into the connection between the FixLJ-FixK2-FixK1 regulatory cascade and quorum sensing, (2) discovered the independence of the FixLJ-FixK2-FixK1 and NifA/RpoN regulatory cascades in response to oxygen, (3) identified the FixLJ-FixK2 cascade as a mediator connecting the FixK2-2 iModulon and the Phenylalanine iModulon, (4) described the differential activation of iModulons in B. diazoefficiens USDA110 under different environmental conditions, and (5) proposed a notion of active-TRN based on the changes in iModulon activity to better illustrate the relationship between gene regulation and environmental condition. In sum, this research offered an iModulon-based TRN for B. diazoefficiens USDA110, which formed a foundation for comprehensively understanding the intricate transcriptional regulation during SNF.
Affiliation(s)
- Bin-Guang Ma
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; (Z.-P.G.); (W.-C.G.); (J.L.); (Q.-T.Q.)
| |
|
36
|
Hoang MD, Riessner S, Oropeza Vargas JE, von den Eichen N, Heins AL. Influence of Varying Pre-Culture Conditions on the Level of Population Heterogeneity in Batch Cultures with an Escherichia coli Triple Reporter Strain. Microorganisms 2023; 11:1763. [PMID: 37512936 PMCID: PMC10384452 DOI: 10.3390/microorganisms11071763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 06/26/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023]
Abstract
When targeting robust, high-yielding bioprocesses, phenomena such as population heterogeneity have to be considered. Therefore, the influence of the conditions which the cells experience prior to the main culture should also be evaluated. Here, the influence of a pre-culture medium (complex vs. minimal medium), optical density for inoculation of the main culture (0.005, 0.02 and 0.0125) and harvest time points of the pre-culture in exponential growth phase (early, mid and late) on the level of population heterogeneity in batch cultures of the Escherichia coli triple reporter strain G7BL21(DE3) in stirred-tank bioreactors was studied. This strain allows monitoring the growth (rrnB-EmGFP), general stress response (rpoS-mStrawberry) and oxygen limitation (nar-TagRFP657) of single cells through the expression of fluorescent proteins. Data from batch cultivations with varying pre-culture conditions were analysed with principal component analysis. According to fluorescence data, the pre-culture medium had the largest impact on population heterogeneities during the bioprocess. While a minimal medium as a pre-culture medium elevated the differences in cellular growth behaviour in the subsequent batch process, a complex medium increased the general stress response and led to a higher population heterogeneity. The latter was promoted by an early harvest of the cells with low inoculation density. Seemingly, nar-operon expression acted independently of the pre-culture conditions.
Affiliation(s)
- Manh Dat Hoang
- Chair of Biochemical Engineering, TUM School of Engineering and Design, Technical University of Munich, 85748 Garching, Germany
| | - Sophi Riessner
- Chair of Biochemical Engineering, TUM School of Engineering and Design, Technical University of Munich, 85748 Garching, Germany
| | - Jose Enrique Oropeza Vargas
- Chair of Biochemical Engineering, TUM School of Engineering and Design, Technical University of Munich, 85748 Garching, Germany
| | - Nikolas von den Eichen
- Chair of Biochemical Engineering, TUM School of Engineering and Design, Technical University of Munich, 85748 Garching, Germany
| | - Anna-Lena Heins
- Chair of Biochemical Engineering, TUM School of Engineering and Design, Technical University of Munich, 85748 Garching, Germany
| |
|
37
|
Mohammad-Taheri S, Tewari V, Kapre R, Rahiminasab E, Sachs K, Tapley Hoyt C, Zucker J, Vitek O. Optimal adjustment sets for causal query estimation in partially observed biomolecular networks. Bioinformatics 2023; 39:i494-i503. [PMID: 37387179 PMCID: PMC10311316 DOI: 10.1093/bioinformatics/btad270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
Causal query estimation in biomolecular networks commonly selects a 'valid adjustment set', i.e. a subset of network variables that eliminates the bias of the estimator. The same query may have multiple valid adjustment sets, each with a different variance. When networks are partially observed, current methods use graph-based criteria to find an adjustment set that minimizes asymptotic variance. Unfortunately, many models that share the same graph topology, and therefore the same functional dependencies, may differ in the processes that generate the observational data. In these cases, the topology-based criteria fail to distinguish the variances of the adjustment sets. This deficiency can lead to sub-optimal adjustment sets and to mischaracterization of the effect of the intervention. We propose an approach for deriving 'optimal adjustment sets' that takes into account the nature of the data, bias and finite-sample variance of the estimator, and cost. It empirically learns the data generating processes from historical experimental data, and characterizes the properties of the estimators by simulation. We demonstrate the utility of the proposed approach in four biomolecular case studies with different topologies and different data generation processes. The implementation and reproducible case studies are at https://github.com/srtaheri/OptimalAdjustmentSet.
Affiliation(s)
- Sara Mohammad-Taheri
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Vartika Tewari
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Rohan Kapre
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Karen Sachs
- Next Generation Analytics, Palo Alto, California, USA
- Modulo Bio Inc, Los Altos, California, USA
- Answer ALS, New Orleans, LA, USA
| | - Charles Tapley Hoyt
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, Massachusetts, USA
| | - Jeremy Zucker
- Pacific Northwest National Laboratory, Richland, Washington 99354, USA
| | - Olga Vitek
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| |
|
38
|
Shin J, Rychel K, Palsson BO. Systems biology of competency in Vibrio natriegens is revealed by applying novel data analytics to the transcriptome. Cell Rep 2023; 42:112619. [PMID: 37285268 DOI: 10.1016/j.celrep.2023.112619] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/07/2022] [Revised: 04/27/2023] [Accepted: 05/22/2023] [Indexed: 06/09/2023]
Abstract
Vibrio natriegens regulates natural competence through the TfoX and QstR transcription factors, which are involved in external DNA capture and transport. However, the extensive genetic and transcriptional regulatory basis for competency remains unknown. We used a machine-learning approach to decompose Vibrio natriegens's transcriptome into 45 groups of independently modulated sets of genes (iModulons). Our findings show that competency is associated with the repression of two housekeeping iModulons (iron metabolism and translation) and the activation of six iModulons, including TfoX and QstR, a novel iModulon of unknown function, and three housekeeping iModulons (representing motility, polycations, and reactive oxygen species [ROS] responses). Phenotypic screening of 83 gene deletion strains demonstrates that loss of iModulon function reduces or eliminates competency. This database-iModulon-discovery cycle unveils the transcriptomic basis for competency and its relationship to housekeeping functions. These results provide the genetic basis for systems biology of competency in this organism.
Affiliation(s)
- Jongoh Shin
- Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | - Kevin Rychel
- Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | - Bernhard O Palsson
- Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark; Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
| |
|
39
|
Rodionova IA, Lim HG, Rodionov DA, Hutchison Y, Dalldorf C, Gao Y, Monk J, Palsson BO. CyuR is a Dual Regulator for L-Cysteine Dependent Antimicrobial Resistance in Escherichia coli. bioRxiv 2023:2023.05.16.541025. [PMID: 37292663 PMCID: PMC10245726 DOI: 10.1101/2023.05.16.541025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Hydrogen sulfide (H₂S), mainly produced from L-cysteine (Cys), renders bacteria highly resistant to oxidative stress. This mitigation of oxidative stress was suggested to be an important survival mechanism to achieve antimicrobial resistance (AMR) in many pathogenic bacteria. CyuR (known as DecR or YbaO) is a recently characterized Cys-dependent transcription regulator, responsible for the activation of the cyuAP operon and generation of hydrogen sulfide from Cys. Despite its potential importance, the regulatory network of CyuR remains poorly understood. In this study, we investigated the roles of the CyuR regulon in a Cys-dependent AMR mechanism in E. coli strains. We found: 1) Cys metabolism has a significant role in AMR and its effect is conserved in many E. coli strains, including clinical isolates; 2) CyuR negatively controls the expression of mdlAB encoding a transporter that exports antibiotics such as cefazolin and vancomycin; 3) CyuR binds to a DNA sequence motif 'GAAwAAATTGTxGxxATTTsyCC' in the absence of Cys, confirmed by an in vitro binding assay; and 4) CyuR may regulate 25 additional genes as suggested by in silico motif scanning and transcriptome sequencing. Collectively, our findings expanded the understanding of the biological roles of CyuR relevant to antibiotic resistance associated with Cys.
|
40
|
Bettridge K, Harris FE, Yehya N, Xiao J. RNAP Promoter Search and Transcription Kinetics in Live E. coli Cells. J Phys Chem B 2023; 127:3816-3828. [PMID: 37098218 PMCID: PMC11212508 DOI: 10.1021/acs.jpcb.2c09142] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 04/27/2023]
Abstract
Bacterial transcription has been studied extensively in vitro, which has provided detailed molecular mechanisms of transcription. The in vivo cellular environment, however, may impose different rules on transcription than the homogeneous and well-controlled in vitro environment. How an RNA polymerase (RNAP) molecule searches rapidly through vast nonspecific chromosomal DNA in the three-dimensional nucleoid space and identifies a specific promoter sequence remains elusive. Transcription kinetics in vivo could also be impacted by specific cellular environments including nucleoid organization and nutrient availability. In this work, we investigated the promoter search dynamics and transcription kinetics of RNAP in live E. coli cells. Using single-molecule tracking (SMT) and fluorescence recovery after photobleaching (FRAP) across different genetic, drug inhibition, and growth conditions, we observed that RNAP's promoter search is facilitated by nonspecific DNA interactions and is largely independent of nucleoid organization, growth condition, transcription activity, or promoter class. RNAP's transcription kinetics, however, are sensitive to these conditions and mainly modulated at the levels of actively engaged RNAP and the promoter escape rate. Our work establishes a foundation for further mechanistic studies of bacterial transcription in live cells.
Affiliation(s)
- Kelsey Bettridge
- Department of Biophysics and Biophysical Chemistry, Johns Hopkins School of Medicine, Baltimore, Maryland 21287-0010, United States
| | - Frances E Harris
- Department of Biophysics and Biophysical Chemistry, Johns Hopkins School of Medicine, Baltimore, Maryland 21287-0010, United States
| | - Nicolás Yehya
- Department of Biophysics and Biophysical Chemistry, Johns Hopkins School of Medicine, Baltimore, Maryland 21287-0010, United States
| | - Jie Xiao
- Department of Biophysics and Biophysical Chemistry, Johns Hopkins School of Medicine, Baltimore, Maryland 21287-0010, United States
| |
|
41
|
Kwon MS, Adidjaja JJ, Kim HU. Predicting the effects of cultivation condition on gene regulation in Escherichia coli by using deep learning. Comput Struct Biotechnol J 2023; 21:2613-2620. [PMID: 38213890 PMCID: PMC10781998 DOI: 10.1016/j.csbj.2023.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 04/02/2023] [Accepted: 04/12/2023] [Indexed: 01/13/2024]
Abstract
A cell's physiology is affected by cultivation conditions to varying degrees, including carbon sources and inorganic nutrients in the growth medium, and the presence or absence of aeration. When examining the effects of cultivation conditions on the cell, the cell's transcriptional response is often examined first among other phenotypes (e.g., proteome and metabolome). In this regard, we developed DeepMGR, a deep learning model that predicts the effects of culture media on gene regulation in Escherichia coli. DeepMGR specifically classifies the direction of gene regulation (i.e., upregulation, no regulation, or downregulation) for an input gene in comparison with M9 minimal medium with glucose as a control condition. For this classification task, DeepMGR uses a feedforward neural network to process: i) DNA sequence of a target gene, ii) presence or absence of aeration and trace elements, and iii) concentration and structural information (SMILES) of up to ten nutrients. The complete DeepMGR showed accuracy of 0.867 and F1 score of 0.703 for a test set from the gold standard dataset. DeepMGR was further subjected to simulation studies for validation where regulation directions for groups of homologous genes were predicted, and the DeepMGR results were compared with the literature with focus on carbon sources that upregulate specific genes. DeepMGR will be useful for designing experiments to understand gene regulation, especially in the context of metabolic engineering.
Affiliation(s)
- Mun Su Kwon
- Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Joshua Julio Adidjaja
- Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Hyun Uk Kim
- Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea
| |
|
42
|
Dalldorf C, Rychel K, Szubin R, Hefner Y, Patel A, Zielinski DC, Palsson BO. The hallmarks of a tradeoff in transcriptomes that balances stress and growth functions. Research Square 2023:rs.3.rs-2729651. [PMID: 37090546 PMCID: PMC10120744 DOI: 10.21203/rs.3.rs-2729651/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Fit phenotypes are achieved through optimal transcriptomic allocation. Here, we performed a high-resolution, multi-scale study of the transcriptomic tradeoff between two key fitness phenotypes, stress response (fear) and growth (greed), in Escherichia coli. We introduced twelve RNA polymerase (RNAP) mutations commonly acquired during adaptive laboratory evolution (ALE) and found that single mutations resulted in large shifts in the fear vs. greed tradeoff, likely through destabilizing the rpoB-rpoC interface. RpoS and GAD regulons drive the fear response while ribosomal proteins and the ppGpp regulon underlie greed. Growth rate selection pressure during ALE results in endpoint strains that often have RNAP mutations, with synergistic mutations reflective of particular conditions. A phylogenetic analysis found the tradeoff in numerous bacteria species. The results suggest that the fear vs. greed tradeoff represents a general principle of transcriptome allocation in bacteria where small genetic changes can result in large phenotypic adaptations to growth conditions.
Affiliation(s)
- Christopher Dalldorf
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
| | - Kevin Rychel
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
| | - Richard Szubin
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
| | - Ying Hefner
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
| | - Arjun Patel
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
| | - Daniel C. Zielinski
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
| | - Bernhard O. Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, USA
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, USA
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, University of California San Diego, La Jolla, CA 92093, USA
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kongens Lyngby, Denmark
| |
|
43
|
Takano S, Takahashi H, Yama Y, Miyazaki R, Furusawa C, Tsuru S. Inference of transcriptome signatures of Escherichia coli in long-term stationary phase. Sci Rep 2023; 13:5647. [PMID: 37024648 PMCID: PMC10079935 DOI: 10.1038/s41598-023-32525-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 03/29/2023] [Indexed: 04/08/2023]
Abstract
"Non-growing" is a dominant life form of microorganisms in nature, where available nutrients and resources are limited. In laboratory culture systems, Escherichia coli can survive for years under starvation, denoted as long-term stationary phase, where a small fraction of cells manages to survive by recycling resources released from nonviable cells. Although the physiology by which viable cells in long-term stationary phase adapt to prolonged starvation is of great interest, their genome-wide response has not been fully understood. In this study, we analyzed transcriptional profiles of cells exposed to the supernatant of 30-day long-term stationary phase culture and found that their transcriptome profiles displayed several similar responses to those of cells in the 16-h short-term stationary phase. Nevertheless, our results revealed that cells in long-term stationary phase supernatant exhibit higher expressions of stress-response genes such as phage shock proteins (psp), and lower expressions of growth-related genes such as ribosomal proteins than those in the short-term stationary phase. We confirmed that the mutant lacking the psp operon showed lower survival and growth rate in the long-term stationary phase culture. This study identified transcriptional responses for stress-resistant physiology in the long-term stationary phase environment.
Affiliation(s)
- Sotaro Takano
  - Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
  - International Center for Materials Nanoarchitectonics (NIMS), Research Center for Macromolecules and Biomaterials, Tsukuba, Japan
- Hiromi Takahashi
  - Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan
- Yoshie Yama
  - Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan
- Ryo Miyazaki
  - Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
  - Computational Bio Big Data Open Innovation Laboratory (CBBD-OIL), AIST, Tokyo, Japan
- Chikara Furusawa
  - Graduate School of Science, Universal Biology Institute, The University of Tokyo, Tokyo, Japan
  - Center for Biosystem Dynamics Research, RIKEN, Kobe, Japan
- Saburo Tsuru
  - Graduate School of Science, Universal Biology Institute, The University of Tokyo, Tokyo, Japan
44
Bang I, Lee SM, Park S, Park JY, Nong LK, Gao Y, Palsson BO, Kim D. Deep-learning optimized DEOCSU suite provides an iterable pipeline for accurate ChIP-exo peak calling. Brief Bioinform 2023; 24:7005164. [PMID: 36702751 DOI: 10.1093/bib/bbad024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/23/2022] [Revised: 01/02/2023] [Accepted: 01/08/2023] [Indexed: 01/28/2023]
Abstract
Recognizing binding sites of DNA-binding proteins is a key factor for elucidating transcriptional regulation in organisms. ChIP-exo enables researchers to delineate genome-wide binding landscapes of DNA-binding proteins with near single base-pair resolution. However, the peak calling step hinders ChIP-exo application since the published algorithms tend to generate false-positive and false-negative predictions. Here, we report the development of DEOCSU (DEep-learning Optimized ChIP-exo peak calling SUite), a novel machine learning-based ChIP-exo peak calling suite. DEOCSU entails the deep convolutional neural network model which was trained with curated ChIP-exo peak data to distinguish the visualized data of bona fide peaks from false ones. Performance validation of the trained deep-learning model indicated its high accuracy, high precision and high recall of over 95%. Applying the new suite to both in-house and publicly available ChIP-exo datasets obtained from bacteria, eukaryotes and archaea revealed an accurate prediction of peaks containing canonical motifs, highlighting the versatility and efficiency of DEOCSU. Furthermore, DEOCSU can be executed on a cloud computing platform or the local environment. With visualization software included in the suite, adjustable options such as the threshold of peak probability, and iterable updating of the pre-trained model, DEOCSU can be optimized for users' specific needs.
Affiliation(s)
- Ina Bang
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Sang-Mok Lee
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Seojoung Park
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Joon Young Park
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Linh Khanh Nong
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Ye Gao
  - Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
- Bernhard O Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
  - Department of Pediatrics, University of California San Diego, La Jolla CA 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Donghyuk Kim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
45
Laboratory evolution reveals general and specific tolerance mechanisms for commodity chemicals. Metab Eng 2023; 76:179-192. [PMID: 36738854 DOI: 10.1016/j.ymben.2023.01.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/11/2022] [Revised: 01/06/2023] [Accepted: 01/30/2023] [Indexed: 02/05/2023]
Abstract
Although strain tolerance to high product concentrations is a barrier to the economically viable biomanufacturing of industrial chemicals, chemical tolerance mechanisms are often unknown. To reveal tolerance mechanisms, an automated platform was utilized to evolve Escherichia coli to grow optimally in the presence of 11 industrial chemicals (1,2-propanediol, 2,3-butanediol, glutarate, adipate, putrescine, hexamethylenediamine, butanol, isobutyrate, coumarate, octanoate, hexanoate), reaching tolerance at concentrations 60%-400% higher than initial toxic levels. Sequencing genomes of 223 isolates from 89 populations, reverse engineering, and cross-compound tolerance profiling were employed to uncover tolerance mechanisms. We show that: 1) cells are tolerized via frequent mutation of membrane transporters or cell wall-associated proteins (e.g., ProV, KgtP, SapB, NagA, NagC, MreB), transcription and translation machineries (e.g., RpoA, RpoB, RpoC, RpsA, RpsG, NusA, Rho), stress signaling proteins (e.g., RelA, SspA, SpoT, YobF), and for certain chemicals, regulators and enzymes in metabolism (e.g., MetJ, NadR, GudD, PurT); 2) osmotic stress plays a significant role in tolerance when chemical concentrations exceed a general threshold and mutated genes frequently overlap with those enabling chemical tolerance in membrane transporters and cell wall-associated proteins; 3) tolerization to a specific chemical generally improves tolerance to structurally similar compounds whereas a tradeoff can occur on dissimilar chemicals, and 4) using pre-tolerized starting isolates can hugely enhance the subsequent production of chemicals when a production pathway is inserted in many, but not all, evolved tolerized host strains, underpinning the need for evolving multiple parallel populations. Taken as a whole, this study provides a comprehensive genotype-phenotype map based on identified mutations and growth phenotypes for 223 chemical tolerant isolates.
46
Patel A, McGrosso D, Hefner Y, Campeau A, Sastry AV, Maurya S, Rychel K, Gonzalez DJ, Palsson BO. Proteome allocation is linked to transcriptional regulation through a modularized transcriptome. bioRxiv 2023:2023.02.20.529291. [PMID: 36865326 PMCID: PMC9980150 DOI: 10.1101/2023.02.20.529291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
Abstract
It has proved challenging to quantitatively relate the proteome to the transcriptome on a per-gene basis. Recent advances in data analytics have enabled a biologically meaningful modularization of the bacterial transcriptome. We thus investigated whether matched datasets of transcriptomes and proteomes from bacteria under diverse conditions could be modularized in the same way to reveal novel relationships between their compositions. We found that: 1) the modules of the proteome and the transcriptome are comprised of a similar list of gene products, 2) the modules in the proteome often represent combinations of modules from the transcriptome, 3) known transcriptional and post-translational regulation is reflected in differences between two sets of modules, allowing for knowledge-mapping when interpreting module functions, and 4) through statistical modeling, absolute proteome allocation can be inferred from the transcriptome alone. Quantitative and knowledge-based relationships can thus be found at the genome-scale between the proteome and transcriptome in bacteria.
Affiliation(s)
- Arjun Patel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Dominic McGrosso
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
- Ying Hefner
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Anaamika Campeau
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
- Anand V. Sastry
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Svetlana Maurya
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA
- Kevin Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- David J Gonzalez
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA
- Bernhard O. Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kgs. Lyngby, Denmark
47
Grézal G, Spohn R, Méhi O, Dunai A, Lázár V, Bálint B, Nagy I, Pál C, Papp B. Plasticity and Stereotypic Rewiring of the Transcriptome Upon Bacterial Evolution of Antibiotic Resistance. Mol Biol Evol 2023; 40:7013728. [PMID: 36718533 PMCID: PMC9927579 DOI: 10.1093/molbev/msad020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 12/01/2022] [Accepted: 01/04/2023] [Indexed: 02/01/2023]
Abstract
Bacterial evolution of antibiotic resistance frequently has deleterious side effects on microbial growth, virulence, and susceptibility to other antimicrobial agents. However, it is unclear how these trade-offs could be utilized for manipulating antibiotic resistance in the clinic, not least because the underlying molecular mechanisms are poorly understood. Using laboratory evolution, we demonstrate that clinically relevant resistance mutations in Escherichia coli constitutively rewire a large fraction of the transcriptome in a repeatable and stereotypic manner. Strikingly, lineages adapted to functionally distinct antibiotics and having no resistance mutations in common show a wide range of parallel gene expression changes that alter oxidative stress response, iron homeostasis, and the composition of the bacterial outer membrane and cell surface. These common physiological alterations are associated with changes in cell morphology and enhanced sensitivity to antimicrobial peptides. Finally, the constitutive transcriptomic changes induced by resistance mutations are largely distinct from those induced by antibiotic stresses in the wild type. This indicates a limited role for genetic assimilation of the induced antibiotic stress response during resistance evolution. Our work suggests that diverse resistance mutations converge on similar global transcriptomic states that shape genetic susceptibility to antimicrobial compounds.
Affiliation(s)
- Gábor Grézal
  - HCEMM-BRC Metabolic Systems Biology Lab, Szeged, Hungary
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
- Réka Spohn
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
- Orsolya Méhi
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
  - HCEMM-BRC Translational Microbiology Research Lab, Szeged, Hungary
- Anett Dunai
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
- Viktória Lázár
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
  - HCEMM-BRC Pharmacodynamic Drug Interaction Research Group, Szeged, Hungary
- Balázs Bálint
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
  - SeqOmics Biotechnology Ltd., Mórahalom, Hungary
- István Nagy
  - SeqOmics Biotechnology Ltd., Mórahalom, Hungary
  - Sequencing Platform, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
- Csaba Pál
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
  - National Laboratory of Biotechnology, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
48
Lee H, Im H, Hwang SH, Ko D, Choi SH. Two novel genes identified by large-scale transcriptomic analysis are essential for biofilm and rugose colony development of Vibrio vulnificus. PLoS Pathog 2023; 19:e1011064. [PMID: 36656902 PMCID: PMC9888727 DOI: 10.1371/journal.ppat.1011064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 01/31/2023] [Accepted: 12/13/2022] [Indexed: 01/20/2023]
Abstract
Many pathogenic bacteria form biofilms to survive under environmental stresses and host immune defenses. Differential expression (DE) analysis of the genes in biofilm and planktonic cells under a single condition, however, has limitations to identify the genes essential for biofilm formation. Independent component analysis (ICA), a machine learning algorithm, was adopted to comprehensively identify the biofilm genes of Vibrio vulnificus, a fulminating human pathogen, in this study. ICA analyzed the large-scale transcriptome data of V. vulnificus cells under various biofilm and planktonic conditions and then identified a total of 72 sets of independently co-regulated genes, iModulons. Among the three iModulons specifically activated in biofilm cells, BrpT-iModulon mainly consisted of known genes of the regulon of BrpT, a transcriptional regulator controlling biofilm formation of V. vulnificus. Interestingly, the BrpT-iModulon additionally contained two novel genes, VV1_3061 and VV2_1694, designated as cabH and brpN, respectively. cabH and brpN were shared in other Vibrio species and not yet identified by DE analyses. Genetic and biochemical analyses revealed that cabH and brpN are directly up-regulated by BrpT. The deletion of cabH and brpN impaired the robust biofilm and rugose colony formation. CabH, structurally similar to the previously known calcium-binding matrix protein CabA, was essential for attachment to the surface. BrpN, carrying an acyltransferase-3 domain as observed in BrpL, played an important role in exopolysaccharide production. Altogether, ICA identified two novel genes, cabH and brpN, which are regulated by BrpT and essential for the development of robust biofilms and rugose colonies of V. vulnificus.
Affiliation(s)
- Hojun Lee
  - National Research Laboratory of Molecular Microbiology and Toxicology, Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
  - Center for Food and Bioconvergence, and Research Institute of Agriculture and Life Science, Seoul National University, Seoul, Republic of Korea
- Hanhyeok Im
  - National Research Laboratory of Molecular Microbiology and Toxicology, Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
  - Center for Food and Bioconvergence, and Research Institute of Agriculture and Life Science, Seoul National University, Seoul, Republic of Korea
- Seung-Ho Hwang
  - National Research Laboratory of Molecular Microbiology and Toxicology, Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
  - Center for Food and Bioconvergence, and Research Institute of Agriculture and Life Science, Seoul National University, Seoul, Republic of Korea
- Duhyun Ko
  - National Research Laboratory of Molecular Microbiology and Toxicology, Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
  - Center for Food and Bioconvergence, and Research Institute of Agriculture and Life Science, Seoul National University, Seoul, Republic of Korea
- Sang Ho Choi
  - National Research Laboratory of Molecular Microbiology and Toxicology, Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
  - Center for Food and Bioconvergence, and Research Institute of Agriculture and Life Science, Seoul National University, Seoul, Republic of Korea
49
Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT. The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 2023; 24:bbac529. [PMID: 36617463 PMCID: PMC9851315 DOI: 10.1093/bib/bbac529] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/11/2022] [Revised: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 01/10/2023]
Abstract
DNA and RNA sequencing technologies have revolutionized biology and biomedical sciences, sequencing full genomes and transcriptomes at very high speeds and reasonably low costs. RNA sequencing (RNA-Seq) enables transcript identification and quantification, but once sequencing has concluded researchers can be easily overwhelmed with questions such as how to go from raw data to differential expression (DE), pathway analysis and interpretation. Several pipelines and procedures have been developed to this effect. Even though there is no unique way to perform RNA-Seq analysis, it usually follows these steps: 1) raw reads quality check, 2) alignment of reads to a reference genome, 3) aligned reads' summarization according to an annotation file, 4) DE analysis and 5) gene set analysis and/or functional enrichment analysis. Each step requires researchers to make decisions, and the wide variety of options and resulting large volumes of data often lead to interpretation challenges. There also seems to be insufficient guidance on how best to obtain relevant information and derive actionable knowledge from transcription experiments. In this paper, we explain RNA-Seq steps in detail and outline differences and similarities of different popular options, as well as advantages and disadvantages. We also discuss non-coding RNA analysis, multi-omics, meta-transcriptomics and the use of artificial intelligence methods complementing the arsenal of tools available to researchers. Lastly, we perform a complete analysis from raw reads to DE and functional enrichment analysis, visually illustrating how results are not absolute truths and how algorithmic decisions can greatly impact results and interpretation.
Affiliation(s)
- Jiung-Wen Chen
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- Lisa Shrestha
  - Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- George Green
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- André Leier
  - Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
  - Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Tatiana T Marquez-Lago
  - Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
  - Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
  - Department of Microbiology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
50
Tjaden B. Escherichia coli transcriptome assembly from a compendium of RNA-seq data sets. RNA Biol 2023; 20:77-84. [PMID: 36920168 PMCID: PMC10392735 DOI: 10.1080/15476286.2023.2189331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 01/09/2023] [Accepted: 01/27/2023] [Indexed: 03/16/2023]
Abstract
Owing to the complexities of bacterial RNA biology, the transcriptomes of even the best studied bacteria are not fully understood. To help elucidate the transcriptional landscape of E. coli, we compiled a compendium of 3,376 RNA-seq data sets composed of more than 7 trillion sequenced bases, which we evaluate with a transcript assembly pipeline. We report expression profiles for all annotated E. coli genes as well as 5,071 other transcripts. Additionally, we observe hundreds of instances of co-transcribed genes that are novel with respect to existing operon databases. By integrating data from a large number of sequencing experiments corresponding to a wide range of conditions, we are able to obtain a comprehensive view of the E. coli transcriptome.
Affiliation(s)
- Brian Tjaden
  - Department of Computer Science, Wellesley College, Wellesley, MA, USA