1
Jiang R, Yuan S, Zhou Y, Wei Y, Li F, Wang M, Chen B, Yu H. Strategies to overcome the challenges of low or no expression of heterologous proteins in Escherichia coli. Biotechnol Adv 2024; 75:108417. [PMID: 39038691 DOI: 10.1016/j.biotechadv.2024.108417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 07/18/2024] [Accepted: 07/19/2024] [Indexed: 07/24/2024]
Abstract
Protein expression is a critical process in diverse biological systems. For Escherichia coli, a widely employed microbial host in industrial catalysis and healthcare, researchers often face significant challenges in constructing recombinant expression systems. To maximize the potential of E. coli expression systems, it is essential to address problems regarding the low or absent production of certain target proteins. This article presents viable solutions to the main factors that hinder heterologous protein expression in E. coli, which include protein toxicity, the intrinsic influence of gene sequences, and mRNA structure. These strategies include specialized approaches for managing toxic protein expression, addressing issues related to mRNA structure and codon bias, advanced codon optimization methodologies that consider multiple factors, and emerging optimization techniques facilitated by big data and machine learning.
Affiliation(s)
- Ruizhao Jiang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China
- Shuting Yuan
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China
- Yilong Zhou
- Tanwei College, Tsinghua University, Beijing 100084, China
- Yuwen Wei
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China
- Fulong Li
- Beijing Evolyzer Co., Ltd., 100176, China
- Bo Chen
- Beijing Evolyzer Co., Ltd., 100176, China
- Huimin Yu
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China; Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China.
2
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS Omega 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being explored with increasing frequency. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or recommending optimal experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements such as the design of novel biological components, optimal experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of artificial neural networks (ANNs). I have divided this review into three sections. In the first section, I describe the predictive potential and basics of ML along with its many applications in synthetic biology, especially in engineering cells, protein activity, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe the challenges that hinder progress in ML/DL and synthetic biology, along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
- Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
3
Liu J, Cui L, Shi X, Yan J, Wang Y, Ni Y, He J, Wang X. Generation of DNAzyme in Bacterial Cells by a Bacterial Retron System. ACS Synth Biol 2024; 13:300-309. [PMID: 38171507 DOI: 10.1021/acssynbio.3c00509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
DNAzymes are catalytically active single-stranded DNAs. Among them, DNAzyme 10-23 (Dz 10-23) consists of a catalytic core and a substrate-binding arm and reduces gene expression through sequence-specific mRNA cleavage. However, the in vivo application of Dz 10-23 depends on exogenous delivery, as it is not synthesized and stabilized in vivo, which limits its application. As a unique reverse transcription system, the bacterial retron system can synthesize single-stranded DNA in vivo using ncRNA msr/msd as a template. The objective of this work is to reduce target gene expression using Dz 10-23 generated in vivo by the retron system. In this regard, we successfully generated Dz 10-23 by cloning the Dz 10-23 coding sequence into the retron msd gene and tested its ability to reduce specific gene expression by examining the mRNA levels of cfp, which encodes cyan fluorescent protein, and of other functional genes such as mreB and ftsZ. We found that Dz had different repressive effects when targeting different mRNA regions, and in general, the repressive effect was stronger when targeting downstream regions of mRNAs. Our results also suggested that the reduction effect was due to cleavage of the substrate mRNA by Dz 10-23 rather than the antisense effect of the substrate-binding arm. Therefore, this study not only provided a retron-based method for the intracellular generation of Dz 10-23 but also demonstrated that Dz 10-23 could reduce gene expression by cleaving target mRNAs in cells. We believe that this new strategy would have great potential in the regulation of gene expression.
Affiliation(s)
- Jie Liu
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Lina Cui
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Xinyu Shi
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Jiahao Yan
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Yifei Wang
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Yuyang Ni
- College of Life Sciences, Shangrao Normal University, Shangrao 334001, PR China
- Jin He
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Xun Wang
- National Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, PR China
4
Khandia R, Pandey MK, Rzhepakovsky IV, Khan AA, Alexiou A. Synonymous Codon Variant Analysis for Autophagic Genes Dysregulated in Neurodegeneration. Mol Neurobiol 2023; 60:2252-2267. [PMID: 36637744 DOI: 10.1007/s12035-022-03081-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/03/2022] [Accepted: 09/27/2022] [Indexed: 01/14/2023]
Abstract
Neurodegenerative disorders are often a culmination of the accumulation of abnormally folded proteins and defective organelles. Autophagy is a process of removing these defective proteins, organelles, and harmful substances from the body, and it works to maintain homeostasis. If autophagic removal of defective proteins is interfered with, neuronal health is affected. Some of the autophagic genes are specifically found to be associated with neurodegenerative phenotypes. Non-functional or mutated gene copies, or copies carrying silent mutations (often termed synonymous variants), might explain this. However, synonymous variants, although they code for identical proteins, can differ in translation rate, stability, and gene expression profile. Hence, it would be interesting to study the pattern of synonymous variant usage. In this study, synonymous variant usage is examined in various transcripts of the autophagic genes ATG5, ATG7, ATG8A, ATG16, and ATG17/FIP200, which are reported to cause neurodegeneration if dysregulated. These genes were analyzed for their synonymous variant usage; nucleotide composition; any possible nucleotide skew in a gene; physical properties of the autophagic proteins, including GRAVY and AROMA; hydropathicity; instability index; frequency of acidic, basic, and neutral amino acids; and gene expression level. The study will help in understanding the various evolutionary forces acting on these genes and the possible augmentation of a gene that shows unusual behavior.
Affiliation(s)
- Rekha Khandia
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, 462026, India.
- Megha Katare Pandey
- Department of Translational Medicine, All India Institute of Medical Sciences, Bhopal, 462020, India
- Azmat Ali Khan
- Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, 11451, Saudi Arabia.
- Athanasios Alexiou
- Novel Global Community Educational Foundation, Hebersham, Australia
- AFNP Med, Wien, Austria
5
Höllerer S, Jeschek M. Ultradeep characterisation of translational sequence determinants refutes rare-codon hypothesis and unveils quadruplet base pairing of initiator tRNA and transcript. Nucleic Acids Res 2023; 51:2377-2396. [PMID: 36727459 PMCID: PMC10018350 DOI: 10.1093/nar/gkad040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 12/05/2022] [Accepted: 01/13/2023] [Indexed: 02/03/2023] Open
Abstract
Translation is a key determinant of gene expression and an important biotechnological engineering target. In bacteria, 5'-untranslated region (5'-UTR) and coding sequence (CDS) are well-known mRNA parts controlling translation and thus cellular protein levels. However, the complex interaction of 5'-UTR and CDS has so far only been studied for few sequences leading to non-generalisable and partly contradictory conclusions. Herein, we systematically assess the dynamic translation from over 1.2 million 5'-UTR-CDS pairs in Escherichia coli to investigate their collective effect using a new method for ultradeep sequence-function mapping. This allows us to disentangle and precisely quantify effects of various sequence determinants of translation. We find that 5'-UTR and CDS individually account for 53% and 20% of variance in translation, respectively, and show conclusively that, contrary to a common hypothesis, tRNA abundance does not explain expression changes between CDSs with different synonymous codons. Moreover, the obtained large-scale data provide clear experimental evidence for a base-pairing interaction between initiator tRNA and mRNA beyond the anticodon-codon interaction, an effect that is often masked for individual sequences and therefore inaccessible to low-throughput approaches. Our study highlights the indispensability of ultradeep sequence-function mapping to accurately determine the contribution of parts and phenomena involved in gene regulation.
Affiliation(s)
- Simon Höllerer
- Department of Biosystems Science and Engineering, Swiss Federal Institute of Technology – ETH Zurich, Basel CH-4058, Switzerland
- Markus Jeschek
- To whom correspondence should be addressed. Tel: +49 941 943 3161; Fax: +49 941 943 2403;
6
Fages-Lartaud M, Mueller Y, Elie F, Courtade G, Hohmann-Marriott MF. Standard Intein Gene Expression Ramps (SIGER) for Protein-Independent Expression Control. ACS Synth Biol 2023; 12:1058-1071. [PMID: 36920366 PMCID: PMC10127266 DOI: 10.1021/acssynbio.2c00530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
Coordination of multigene expression is one of the key challenges of metabolic engineering for the development of cell factories. Constraints on translation initiation and early ribosome kinetics of mRNA are imposed by features of the 5'UTR in combination with the start of the gene, referred to as the "gene ramp", such as rare codons and mRNA secondary structures. These features strongly influence the translation yield and protein quality by regulating the ribosome distribution on mRNA strands. The utilization of genetic expression sequences, such as promoters and 5'UTRs in combination with different target genes, leads to a wide variety of gene ramp compositions with irregular translation rates, leading to unpredictable levels of protein yield and quality. Here, we present the Standard Intein Gene Expression Ramp (SIGER) system for controlling protein expression. The SIGER system makes use of inteins to decouple the translation initiation features from the gene of a target protein. We generated sequence-specific gene expression sequences for two inteins (DnaB and DnaX) that display defined levels of protein expression. Additionally, we used inteins that possess the ability to release the C-terminal fusion protein in vivo to avoid the impairment of protein functionality by the fused intein. Overall, our results show that SIGER systems are unique tools to mitigate the undesirable effects of gene ramp variation and to control the relative ratios of enzymes involved in molecular pathways. As a proof of concept of the potential of the system, we also used a SIGER system to express two difficult-to-produce proteins, GumM and CBM73.
Affiliation(s)
- Maxime Fages-Lartaud
- Department of Biotechnology and Food Science, Norwegian University of Science and Technology, Trondheim N-7491, Norway
- Yasmin Mueller
- Department of Biotechnology and Food Science, Norwegian University of Science and Technology, Trondheim N-7491, Norway
- Florence Elie
- Department of Biotechnology and Food Science, Norwegian University of Science and Technology, Trondheim N-7491, Norway
- Gaston Courtade
- Department of Biotechnology and Food Science, Norwegian University of Science and Technology, Trondheim N-7491, Norway
- Martin Frank Hohmann-Marriott
- Department of Biotechnology and Food Science, Norwegian University of Science and Technology, Trondheim N-7491, Norway; United Scientists CORE (Limited), Dunedin 9016, Aotearoa, New Zealand
7
Wang X, N MPA, Jeon HJ, He J, Lim HM. Identification of a Rho-Dependent Termination Site In Vivo Using Synthetic Small RNA. Microbiol Spectr 2023; 11:e0395022. [PMID: 36651730 PMCID: PMC9927376 DOI: 10.1128/spectrum.03950-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 01/02/2023] [Indexed: 01/19/2023] Open
Abstract
Rho promotes Rho-dependent termination (RDT) at the Rho-dependent terminator, producing a variable-length region without secondary structure at the 3' end of mRNA. Determining the exact RDT site in vivo is challenging, because the 3' end of mRNA is rapidly removed after RDT by 3'-to-5' exonuclease processing. Here, we applied synthetic small RNA (sysRNA) to identify the RDT region in vivo by exploiting its complementary base-pairing ability to target mRNA. Through the combined analyses of rapid amplification of cDNA 3' ends, primer extension, and capillary electrophoresis, we could precisely map and quantify mRNA 3' ends. We found that complementary double-stranded RNA (dsRNA) formed between sysRNA and mRNA was efficiently cleaved by RNase III in the middle of the dsRNA region. The formation of dsRNA appeared to protect the cleaved RNA 3' ends from rapid degradation by 3'-to-5' exonuclease, thereby stabilizing the mRNA 3' end. We further verified that the signal intensity at the 3' end was positively correlated with the amount of mRNA. By constructing a series of sysRNAs with close target sites and comparing the difference in signal intensity at the 3' end of wild-type and Rho-impaired strains, we finally identified a region of increased mRNA expression within the 21-bp range, which was determined as the RDT region. Our results demonstrated the ability to use sysRNA as a novel tool to identify RDT regions in vivo and expand the range of applications of sysRNA.
IMPORTANCE sysRNA, which was formerly widely employed, has steadily lost popularity as more novel techniques for suppressing gene expression come into existence because of issues such as unstable inhibition effect and low inhibition efficiency. However, it remains an interesting topic as a regulatory tool due to its ease of design and low metabolic burden on cells. Here, for the first time, we discovered a new method to identify RDT regions in vivo using sysRNA. This new feature is important because since the discovery of the Rho protein in 1969, specific identification of RDT sites in vivo has been difficult due to the rapid processing of RNA 3' ends by exonucleases, and sysRNA might provide a new approach to address this challenge.
Affiliation(s)
- Xun Wang
- State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, People’s Republic of China
- Monford Paul Abishek N
- Department of Biological Sciences, College of Biological Sciences and Biotechnology, Chungnam National University, Daejeon, Republic of Korea
- Heung Jin Jeon
- Department of Biological Sciences, College of Biological Sciences and Biotechnology, Chungnam National University, Daejeon, Republic of Korea
- Infection Control Convergence Research Center, Chungnam National University College of Medicine, Daejeon, Republic of Korea
- Jin He
- State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, People’s Republic of China
- Heon M. Lim
- Department of Biological Sciences, College of Biological Sciences and Biotechnology, Chungnam National University, Daejeon, Republic of Korea
8
Beardall WA, Stan GB, Dunlop MJ. Deep Learning Concepts and Applications for Synthetic Biology. GEN Biotechnology 2022; 1:360-371. [PMID: 36061221 PMCID: PMC9428732 DOI: 10.1089/genbio.2022.0017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/27/2022] [Accepted: 07/14/2022] [Indexed: 12/24/2022]
Abstract
Synthetic biology has a natural synergy with deep learning. It can be used to generate large data sets to train models, for example by using DNA synthesis, and deep learning models can be used to inform design, such as by generating novel parts or suggesting optimal experiments to conduct. Recently, research at the interface of engineering biology and deep learning has highlighted this potential through successes including the design of novel biological parts, protein structure prediction, automated analysis of microscopy data, optimal experimental design, and biomolecular implementations of artificial neural networks. In this review, we present an overview of synthetic biology-relevant classes of data and deep learning architectures. We also highlight emerging studies in synthetic biology that capitalize on deep learning to enable novel understanding and design, and discuss challenges and future opportunities in this space.
Affiliation(s)
- William A.V. Beardall
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Imperial College Centre of Excellence in Synthetic Biology, Imperial College London, London, United Kingdom
- Guy-Bart Stan
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Imperial College Centre of Excellence in Synthetic Biology, Imperial College London, London, United Kingdom
- Mary J. Dunlop
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Biological Design Center, Boston University, Boston, Massachusetts, USA