1
Zilberzwige-Tal S, Fontanarrosa P, Bychenko D, Dorfan Y, Gazit E, Myers CJ. Investigating and Modeling the Factors That Affect Genetic Circuit Performance. ACS Synth Biol 2023; 12:3189-3204. [PMID: 37916512 PMCID: PMC10661042 DOI: 10.1021/acssynbio.3c00151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Indexed: 11/03/2023]
Abstract
Over the past 2 decades, synthetic biology has yielded ever more complex genetic circuits that are able to perform sophisticated functions in response to specific signals. Yet, genetic circuits are not immediately transferable to an outside-the-lab setting where their performance is highly compromised. We propose introducing a broader test step to the design-build-test-learn workflow to include factors that might contribute to unexpected genetic circuit performance. As a proof of concept, we have designed and evaluated a genetic circuit in various temperatures, inducer concentrations, nonsterilized soil exposure, and bacterial growth stages. We determined that the circuit's performance is dramatically altered when these factors differ from the optimal lab conditions. We observed significant changes in the time for signal detection as well as signal intensity when the genetic circuit was tested under nonoptimal lab conditions. As a learning effort, we then proceeded to generate model predictions in untested conditions, which is currently lacking in synthetic biology application design. Furthermore, broader test and learn steps uncovered a negative correlation between the time it takes for a gate to turn ON and the bacterial growth phases. As the synthetic biology discipline transitions from proof-of-concept genetic programs to appropriate and safe application implementations, more emphasis on test and learn steps (i.e., characterizing parts and circuits for a broad range of conditions) will provide missing insights on genetic circuit behavior outside the lab.
Affiliation(s)
- Shai Zilberzwige-Tal
  - The Shmunis School of Biomedicine and Cancer Research, Life Sciences Faculty, Tel Aviv University, Tel Aviv-Yafo 6997801, Israel
- Pedro Fontanarrosa
  - Department of Electrical, Computer, and Energy Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Darya Bychenko
  - The Shmunis School of Biomedicine and Cancer Research, Life Sciences Faculty, Tel Aviv University, Tel Aviv-Yafo 6997801, Israel
- Yuval Dorfan
  - Department of Electrical, Computer, and Energy Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
  - Bio-engineering, Electrical Engineering Faculty, Holon Institute of Technology (HIT), Holon 5810201, Israel
  - Alagene Ltd., Innovation Center, Reichman University, Herzliya 7670608, Israel
- Ehud Gazit
  - The Shmunis School of Biomedicine and Cancer Research, Life Sciences Faculty, Tel Aviv University, Tel Aviv-Yafo 6997801, Israel
- Chris J. Myers
  - Department of Electrical, Computer, and Energy Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
2
Kwon MS, Adidjaja JJ, Kim HU. Predicting the effects of cultivation condition on gene regulation in Escherichia coli by using deep learning. Comput Struct Biotechnol J 2023; 21:2613-2620. [PMID: 38213890 PMCID: PMC10781998 DOI: 10.1016/j.csbj.2023.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 04/02/2023] [Accepted: 04/12/2023] [Indexed: 01/13/2024]
Abstract
A cell's physiology is affected to varying degrees by cultivation conditions, including carbon sources and inorganic nutrients in the growth medium, and the presence or absence of aeration. When examining the effects of cultivation conditions on the cell, the cell's transcriptional response is often examined first among other phenotypes (e.g., proteome and metabolome). In this regard, we developed DeepMGR, a deep learning model that predicts the effects of culture media on gene regulation in Escherichia coli. DeepMGR specifically classifies the direction of gene regulation (i.e., upregulation, no regulation, or downregulation) for an input gene in comparison with M9 minimal medium with glucose as a control condition. For this classification task, DeepMGR uses a feedforward neural network to process: i) the DNA sequence of a target gene, ii) the presence or absence of aeration and trace elements, and iii) concentration and structural information (SMILES) of up to ten nutrients. The complete DeepMGR showed an accuracy of 0.867 and an F1 score of 0.703 for a test set from the gold standard dataset. DeepMGR was further subjected to simulation studies for validation, where regulation directions for groups of homologous genes were predicted, and the DeepMGR results were compared with the literature with a focus on carbon sources that upregulate specific genes. DeepMGR will be useful for designing experiments to understand gene regulation, especially in the context of metabolic engineering.
Affiliation(s)
- Mun Su Kwon
  - Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Joshua Julio Adidjaja
  - Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Hyun Uk Kim
  - Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
  - BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea
3
Garcia BJ, Urrutia J, Zheng G, Becker D, Corbet C, Maschhoff P, Cristofaro A, Gaffney N, Vaughn M, Saxena U, Chen YP, Gordon DB, Eslami M. A toolkit for enhanced reproducibility of RNASeq analysis for synthetic biologists. Synth Biol (Oxf) 2022; 7:ysac012. [PMID: 36035514 PMCID: PMC9408027 DOI: 10.1093/synbio/ysac012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Revised: 06/17/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022]
Abstract
Sequencing technologies, in particular RNASeq, have become critical tools in the design, build, test and learn cycle of synthetic biology. They provide a better understanding of synthetic designs, and they help identify ways to improve and select designs. While these data are beneficial to design, their collection and analysis is a complex, multistep process that has implications on both discovery and reproducibility of experiments. Additionally, tool parameters, experimental metadata, normalization of data and standardization of file formats present challenges that are computationally intensive. This calls for high-throughput pipelines expressly designed to handle the combinatorial and longitudinal nature of synthetic biology. In this paper, we present a pipeline to maximize the analytical reproducibility of RNASeq for synthetic biologists. We also explore the impact of reproducibility on the validation of machine learning models. We present the design of a pipeline that combines traditional RNASeq data processing tools with structured metadata tracking to allow for the exploration of the combinatorial design in a high-throughput and reproducible manner. We then demonstrate utility via two different experiments: a control comparison experiment and a machine learning model experiment. The first experiment compares datasets collected from identical biological controls across multiple days for two different organisms. It shows that a reproducible experimental protocol for one organism does not guarantee reproducibility in another. The second experiment quantifies the differences in experimental runs from multiple perspectives. It shows that the lack of reproducibility from these different perspectives can place an upper bound on the validation of machine learning models trained on RNASeq data.
Affiliation(s)
- Benjamin J Garcia
  - Department of Biological Engineering, Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- Joshua Urrutia
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX, USA
- Alexander Cristofaro
  - Department of Biological Engineering, Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- Niall Gaffney
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX, USA
- Matthew Vaughn
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX, USA
- Uma Saxena
  - Department of Biological Engineering, Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- D Benjamin Gordon
  - Department of Biological Engineering, Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
4
Ghadami A, Epureanu BI. Data-driven prediction in dynamical systems: recent developments. Philos Trans A Math Phys Eng Sci 2022; 380:20210213. [PMID: 35719077 PMCID: PMC9207538 DOI: 10.1098/rsta.2021.0213] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 05/07/2023]
Abstract
In recent years, we have witnessed a significant shift toward ever-more complex and ever-larger-scale systems in the majority of the grand societal challenges tackled in applied sciences. The need to comprehend and predict the dynamics of complex systems has spurred developments in large-scale simulations and a multitude of methods across several disciplines. The goals of understanding and prediction in complex dynamical systems, however, have been hindered by high dimensionality, complexity and chaotic behaviours. Recent advances in data-driven techniques and machine-learning approaches have revolutionized how we model and analyse complex systems. The integration of these techniques with dynamical systems theory opens up opportunities to tackle previously unattainable challenges in modelling and prediction of dynamical systems. While data-driven prediction methods have made great strides in recent years, it is still necessary to develop new techniques to improve their applicability to a wider range of complex systems in science and engineering. This focus issue shares recent developments in the field of complex dynamical systems with emphasis on data-driven, data-assisted and artificial intelligence-based discovery of dynamical systems. This article is part of the theme issue 'Data-driven prediction in dynamical systems'.
Affiliation(s)
- Amin Ghadami
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
- Bogdan I. Epureanu
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA