1
Zelenka NR, Di Cara N, Sharma K, Sarvaharman S, Ghataora JS, Parmeggiani F, Nivala J, Abdallah ZS, Marucci L, Gorochowski TE. Data hazards in synthetic biology. Synth Biol (Oxf) 2024; 9:ysae010. [PMID: 38973982 PMCID: PMC11227101 DOI: 10.1093/synbio/ysae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 05/17/2024] [Accepted: 06/19/2024] [Indexed: 07/09/2024] Open
Abstract
Data science is playing an increasingly important role in the design and analysis of engineered biology. This has been fueled by the development of high-throughput methods like massively parallel reporter assays, data-rich microscopy techniques, computational protein structure prediction and design, and the development of whole-cell models able to generate huge volumes of data. Although the ability to apply data-centric analyses in these contexts is appealing and increasingly simple to do, it comes with potential risks. For example, how might biases in the underlying data affect the validity of a result and what might the environmental impact of large-scale data analyses be? Here, we present a community-developed framework for assessing data hazards to help address these concerns and demonstrate its application to two synthetic biology case studies. We show the diversity of considerations that arise in common types of bioengineering projects and provide some guidelines and mitigating steps. Understanding potential issues and dangers when working with data and proactively addressing them will be essential for ensuring the appropriate use of emerging data-intensive AI methods and help increase the trustworthiness of their applications in synthetic biology.
Affiliation(s)
- Natalie R Zelenka
  - Jean Golding Institute, University of Bristol, Bristol, UK
  - BrisEngBio, University of Bristol, Bristol, UK
- Nina Di Cara
  - School of Psychological Science, University of Bristol, Bristol, UK
- Kieren Sharma
  - School of Engineering Mathematics and Technology, University of Bristol, Bristol, UK
- Jasdeep S Ghataora
  - BrisEngBio, University of Bristol, Bristol, UK
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Fabio Parmeggiani
  - BrisEngBio, University of Bristol, Bristol, UK
  - School of Biochemistry, University of Bristol, Bristol, UK
  - School of Pharmacy and Pharmaceutical Sciences, Cardiff University, Cardiff, UK
- Jeff Nivala
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Zahraa S Abdallah
  - School of Engineering Mathematics and Technology, University of Bristol, Bristol, UK
- Lucia Marucci
  - BrisEngBio, University of Bristol, Bristol, UK
  - School of Engineering Mathematics and Technology, University of Bristol, Bristol, UK
- Thomas E Gorochowski
  - BrisEngBio, University of Bristol, Bristol, UK
  - School of Biological Sciences, University of Bristol, Bristol, UK
2
Gilliot PA, Gorochowski TE. Transfer learning for cross-context prediction of protein expression from 5'UTR sequence. Nucleic Acids Res 2024:gkae491. [PMID: 38864396 DOI: 10.1093/nar/gkae491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 04/28/2024] [Accepted: 05/28/2024] [Indexed: 06/13/2024] Open
Abstract
Model-guided DNA sequence design can accelerate the reprogramming of living cells. It allows us to engineer more complex biological systems by removing the need to physically assemble and test each potential design. While mechanistic models of gene expression have seen some success in supporting this goal, data-centric, deep learning-based approaches often provide more accurate predictions. This accuracy, however, comes at a cost - a lack of generalization across genetic and experimental contexts that has limited their wider use outside the context in which they were trained. Here, we address this issue by demonstrating how a simple transfer learning procedure can effectively tune a pre-trained deep learning model to predict protein translation rate from 5' untranslated region (5'UTR) sequence for diverse contexts in Escherichia coli using a small number of new measurements. This allows for important model features learnt from expensive massively parallel reporter assays to be easily transferred to new settings. By releasing our trained deep learning model and complementary calibration procedure, this study acts as a starting point for continually refined model-based sequence design that builds on previous knowledge and future experimental efforts.
Affiliation(s)
- Pierre-Aurélien Gilliot
  - School of Biological Sciences, University of Bristol, 24 Tyndall Avenue, Bristol BS8 1TQ, UK
- Thomas E Gorochowski
  - School of Biological Sciences, University of Bristol, 24 Tyndall Avenue, Bristol BS8 1TQ, UK
  - BrisEngBio, School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
3
Papkou A, Garcia-Pastor L, Escudero JA, Wagner A. A rugged yet easily navigable fitness landscape. Science 2023; 382:eadh3860. [PMID: 37995212 DOI: 10.1126/science.adh3860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 09/29/2023] [Indexed: 11/25/2023]
Abstract
Fitness landscape theory predicts that rugged landscapes with multiple peaks impair Darwinian evolution, but experimental evidence is limited. In this study, we used genome editing to map the fitness of >260,000 genotypes of the key metabolic enzyme dihydrofolate reductase in the presence of the antibiotic trimethoprim, which targets this enzyme. The resulting landscape is highly rugged and harbors 514 fitness peaks. However, its highest peaks are accessible to evolving populations via abundant fitness-increasing paths. Different peaks share large basins of attraction that render the outcome of adaptive evolution highly contingent on chance events. Our work shows that ruggedness need not be an obstacle to Darwinian evolution but can reduce its predictability. If true in general, the complexity of optimization problems on realistic landscapes may require reappraisal.
Affiliation(s)
- Andrei Papkou
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Lucia Garcia-Pastor
  - Departamento de Sanidad Animal and VISAVET Health Surveillance Centre, Universidad Complutense de Madrid, Madrid, Spain
- José Antonio Escudero
  - Departamento de Sanidad Animal and VISAVET Health Surveillance Centre, Universidad Complutense de Madrid, Madrid, Spain
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - The Santa Fe Institute, Santa Fe, NM, USA
4
Höllerer S, Jeschek M. Ultradeep characterisation of translational sequence determinants refutes rare-codon hypothesis and unveils quadruplet base pairing of initiator tRNA and transcript. Nucleic Acids Res 2023; 51:2377-2396. [PMID: 36727459 PMCID: PMC10018350 DOI: 10.1093/nar/gkad040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 12/05/2022] [Accepted: 01/13/2023] [Indexed: 02/03/2023] Open
Abstract
Translation is a key determinant of gene expression and an important biotechnological engineering target. In bacteria, 5'-untranslated region (5'-UTR) and coding sequence (CDS) are well-known mRNA parts controlling translation and thus cellular protein levels. However, the complex interaction of 5'-UTR and CDS has so far only been studied for few sequences leading to non-generalisable and partly contradictory conclusions. Herein, we systematically assess the dynamic translation from over 1.2 million 5'-UTR-CDS pairs in Escherichia coli to investigate their collective effect using a new method for ultradeep sequence-function mapping. This allows us to disentangle and precisely quantify effects of various sequence determinants of translation. We find that 5'-UTR and CDS individually account for 53% and 20% of variance in translation, respectively, and show conclusively that, contrary to a common hypothesis, tRNA abundance does not explain expression changes between CDSs with different synonymous codons. Moreover, the obtained large-scale data provide clear experimental evidence for a base-pairing interaction between initiator tRNA and mRNA beyond the anticodon-codon interaction, an effect that is often masked for individual sequences and therefore inaccessible to low-throughput approaches. Our study highlights the indispensability of ultradeep sequence-function mapping to accurately determine the contribution of parts and phenomena involved in gene regulation.
Affiliation(s)
- Simon Höllerer
  - Department of Biosystems Science and Engineering, Swiss Federal Institute of Technology – ETH Zurich, Basel CH-4058, Switzerland
- Markus Jeschek
  - To whom correspondence should be addressed. Tel: +49 941 943 3161; Fax: +49 941 943 2403
5
Gilliot PA, Gorochowski TE. Design and Analysis of Massively Parallel Reporter Assays Using FORECAST. Methods Mol Biol 2023; 2553:41-56. [PMID: 36227538 DOI: 10.1007/978-1-0716-2617-7_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Machine learning is revolutionizing molecular biology and bioengineering by providing powerful insights and predictions. Massively parallel reporter assays (MPRAs) have emerged as a particularly valuable class of high-throughput technique to support such algorithms. MPRAs enable the simultaneous characterization of thousands or even millions of genetic constructs and provide the large amounts of data needed to train models. However, while the scale of this approach is impressive, the design of effective MPRA experiments is challenging due to the many factors that can be varied and the difficulty in predicting how these will impact the quality and quantity of data obtained. Here, we present a computational tool called FORECAST, which can simulate MPRA experiments based on fluorescence-activated cell sorting and subsequent sequencing (commonly referred to as Flow-seq or Sort-seq experiments), as well as carry out rigorous statistical estimation of construct performance from this type of experimental data. FORECAST can be used to develop workflows to aid the design of MPRA experiments and reanalyze existing MPRA data sets.
6
Liu Y, Wu Z, Wu D, Gao N, Lin J. Reconstitution of Multi-Protein Complexes through Ribozyme-Assisted Polycistronic Co-Expression. ACS Synth Biol 2022; 12:136-143. [PMID: 36512506 PMCID: PMC9872166 DOI: 10.1021/acssynbio.2c00416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
In living cells, proteins often exert their functions by interacting with other proteins forming protein complexes. Obtaining homogeneous samples of protein complexes with correct fold and stoichiometry is critical for its biochemical and biophysical characterization as well as functional investigation. Here, we developed a Ribozyme-Assisted Polycistronic co-expression system (pRAP) for heterologous co-production and in vivo assembly of multi-subunit complexes. In the pRAP system, a polycistronic mRNA transcript is co-transcriptionally converted into individual mono-cistrons in vivo. Each cistron can initiate translation with comparable efficiency, resulting in balanced production for all subunits, thus permitting faithful protein complex assembly. With pRAP polycistronic co-expression, we have successfully reconstituted large functional multi-subunit complexes involved in mammalian translation initiation. Our invention provides a valuable tool for studying the molecular mechanisms of biological processes.
Affiliation(s)
- Yan Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Zhongshan Hospital, Fudan University, Shanghai 200438, China
- Zihan Wu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Zhongshan Hospital, Fudan University, Shanghai 200438, China
- Damu Wu
  - State Key Laboratory of Membrane Biology, Peking-Tsinghua Joint Center for Life Sciences, School of Life Sciences, Peking University, Beijing 100871, China
- Ning Gao
  - State Key Laboratory of Membrane Biology, Peking-Tsinghua Joint Center for Life Sciences, School of Life Sciences, Peking University, Beijing 100871, China
- Jinzhong Lin
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Zhongshan Hospital, Fudan University, Shanghai 200438, China. Tel.: +86-21-31246764
7
Duan Y, Zhang X, Zhai W, Zhang J, Zhang X, Xu G, Li H, Deng Z, Shi J, Xu Z. Deciphering the Rules of Ribosome Binding Site Differentiation in Context Dependence. ACS Synth Biol 2022; 11:2726-2740. [PMID: 35877551 DOI: 10.1021/acssynbio.2c00139] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
The ribosome binding site (RBS) is a crucial element regulating translation. However, the activity of RBS is poorly predictable, because it is strongly affected by the local possible secondary structure, that is, context dependence. By the Flowseq technique, over 20 000 RBS variants were sorted and sequenced, and the translation of multiple genes under the same RBS was quantitatively characterized to evaluate the context dependence of each RBS variant in E. coli. Two regions, (-7 to -2) and (-17 to -12), of RBS were predicted with a higher possibility to pair with each other to slow down the translation initiation. Associations between phenotypes and the intrinsic factors suspected to affect translation efficiency and context dependence of the RBS, including nucleotide bias at each position, free energy, and conservation, were disentangled. The results showed that translation efficiency was influenced more significantly by conservation of the SD region (-16 to -8), while an AC-rich spacer region (-7 to -1) was associated with low context dependence. We confirmed these characteristics using a series of synthesized RBSs. The average correlation between multiple reporters was significantly higher for RBSs with an AC-rich spacer (0.714) compared with a GU-rich spacer (0.286). Overall, we proposed general design criteria to improve programmability and minimize context dependence of RBS. The characteristics unraveled here can be adapted to other bacteria for fine-tuning target-gene expression.
Affiliation(s)
- Yanting Duan
  - Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
- Xiaojuan Zhang
  - Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
- Weiji Zhai
  - Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
- Jinpeng Zhang
  - Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
- Xiaomei Zhang
  - School of Life Science and Health Engineering, Jiangnan University, Wuxi 214122, China
  - Jiangsu Engineering Research Center for Bioactive Products Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi 214122, China
- Guoqiang Xu
  - Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
- Hui Li
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
- Zhaohong Deng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
- Jinsong Shi
  - School of Life Science and Health Engineering, Jiangnan University, Wuxi 214122, China
  - Jiangsu Engineering Research Center for Bioactive Products Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi 214122, China
- Zhenghong Xu
  - Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China
8
Translation initiation site of mRNA is selected through dynamic interaction with the ribosome. Proc Natl Acad Sci U S A 2022; 119:e2118099119. [PMID: 35605125 DOI: 10.1073/pnas.2118099119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Significance: Ribosomes translate the genetic codes of messenger RNA (mRNA) to make proteins. Translation must begin at the correct initiation site; otherwise, abnormal proteins will be produced. Here, we show that a short ribosome-specific sequence in the upstream followed by an unstructured downstream sequence is a favorable initiation site. Those mRNAs lacking either of these two characteristics do not associate tightly with the ribosome. Initiator transfer RNA (tRNA) and initiation factors facilitate the binding. However, when the downstream site forms structures, initiation factor 3 triggers the dissociation of the accommodated initiator tRNA and the subsequent disassembly of the ribosome-mRNA complex. Thus, initiation factors help the ribosome distinguish unfavorable structured sequences that may not act as the mRNA translation initiation site.
9
Gonzalez Somermeyer L, Fleiss A, Mishin AS, Bozhanova NG, Igolkina AA, Meiler J, Alaball Pujol ME, Putintseva EV, Sarkisyan KS, Kondrashov FA. Heterogeneity of the GFP fitness landscape and data-driven protein design. eLife 2022; 11:75842. [PMID: 35510622 PMCID: PMC9119679 DOI: 10.7554/elife.75842] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/25/2021] [Accepted: 03/25/2022] [Indexed: 11/24/2022] Open
Abstract
Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited due to scarceness of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design - instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work paves insights for practical application of fitness landscape heterogeneity in protein engineering.
Affiliation(s)
- Aubin Fleiss
  - Synthetic Biology Group, MRC London Institute of Medical Sciences, London, United Kingdom
  - Institute of Clinical Sciences, Faculty of Medicine and Imperial College Centre for Synthetic Biology, Imperial College London, London, United Kingdom
- Alexander S Mishin
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russian Federation
- Nina G Bozhanova
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, United States
- Anna A Igolkina
  - Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter, Vienna, Austria
- Jens Meiler
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, United States
  - Institute for Drug Discovery, Medical School, Leipzig University, Leipzig, Germany
- Maria-Elisenda Alaball Pujol
  - Synthetic Biology Group, MRC London Institute of Medical Sciences, London, United Kingdom
  - Institute of Clinical Sciences, Faculty of Medicine and Imperial College Centre for Synthetic Biology, Imperial College London, London, United Kingdom
- Karen S Sarkisyan
  - Synthetic Biology Group, MRC London Institute of Medical Sciences, London, United Kingdom
  - Institute of Clinical Sciences, Faculty of Medicine and Imperial College Centre for Synthetic Biology, Imperial College London, London, United Kingdom
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russian Federation
- Fyodor A Kondrashov
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
  - Evolutionary and Synthetic Biology Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
10
Relation Between the Number of Peaks and the Number of Reciprocal Sign Epistatic Interactions. Bull Math Biol 2022; 84:74. [PMID: 35713756 PMCID: PMC9205815 DOI: 10.1007/s11538-022-01029-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 05/15/2022] [Indexed: 01/25/2023]
Abstract
Empirical essays of fitness landscapes suggest that they may be rugged, that is having multiple fitness peaks. Such fitness landscapes, those that have multiple peaks, necessarily have special local structures, called reciprocal sign epistasis (Poelwijk et al. in J Theor Biol 272:141-144, 2011). Here, we investigate the quantitative relationship between the number of fitness peaks and the number of reciprocal sign epistatic interactions. Previously, it has been shown (Poelwijk et al. in J Theor Biol 272:141-144, 2011) that pairwise reciprocal sign epistasis is a necessary but not sufficient condition for the existence of multiple peaks. Applying discrete Morse theory, which to our knowledge has never been used in this context, we extend this result by giving the minimal number of reciprocal sign epistatic interactions required to create a given number of peaks.
11
Benedict AB, Chamberlain JD, Calvopina DG, Griffitts JS. Translation initiation from sequence variants of the bacteriophage T7 g10RBS in Escherichia coli and Agrobacterium fabrum. Mol Biol Rep 2021; 49:833-838. [PMID: 34743270 PMCID: PMC8748333 DOI: 10.1007/s11033-021-06891-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2021] [Accepted: 10/27/2021] [Indexed: 11/01/2022]
Abstract
BACKGROUND The bacteriophage T7 gene 10 ribosome binding site (g10RBS) has long been used for robust expression of recombinant proteins in Escherichia coli. This RBS consists of a Shine-Dalgarno (SD) sequence augmented by an upstream translational "enhancer" (Enh) element, supporting protein production at many times the level seen with simple synthetic SD-containing sequences. The objective of this study was to dissect the g10RBS to identify simpler derivatives that exhibit much of the original translation efficiency. METHODS AND RESULTS Twenty derivatives of g10RBS were tested using multiple promoter/reporter gene contexts. We have identified one derivative (which we call "CON_G") that maintains 100% activity in E. coli and is 33% shorter. Further minimization of CON_G results in variants that lose only modest amounts of activity. Certain nucleotide substitutions in the spacer region between the SD sequence and initiation codon show strong decreases in translation. When testing these 20 derivatives in the alphaproteobacterium Agrobacterium fabrum, most supported strong reporter protein expression that was not dependent on the Enh. CONCLUSIONS The g10RBS derivatives tested in this study display a range of observed activity, including a minimized version (CON_G) that retains 100% activity in E. coli while being 33% shorter. This high activity is evident in two different promoter/reporter sequence contexts. The array of RBS sequences presented here may be useful to researchers in need of fine-tuned expression of recombinant proteins of interest.
Affiliation(s)
- Alex B Benedict
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT, 84602, USA
- Joshua D Chamberlain
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT, 84602, USA
- Diana G Calvopina
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT, 84602, USA
- Joel S Griffitts
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT, 84602, USA
12
Song S, Zhang J. Unbiased inference of the fitness landscape ruggedness from imprecise fitness estimates. Evolution 2021; 75:2658-2671. [PMID: 34554581 DOI: 10.1111/evo.14363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/22/2020] [Accepted: 09/14/2021] [Indexed: 01/17/2023]
Abstract
Fitness landscapes map genotypes to their corresponding fitness under given environments and allow explaining and predicting evolutionary trajectories. Of particular interest is the landscape ruggedness or the unevenness of the landscape, because it impacts many aspects of evolution such as the likelihood that a population is trapped in a local fitness peak. Although the ruggedness has been inferred from a number of empirically mapped fitness landscapes, it is unclear to what extent this inference is affected by fitness estimation error, which is inevitable in the experimental determination of fitness landscapes. Here, we address this question by simulating fitness landscapes under various theoretical models, with or without fitness estimation error. We find that all eight examined measures of landscape ruggedness are overestimated due to imprecise fitness quantification, but different measures are affected to different degrees. We devise a method to use replicate fitness measures to correct this bias and show that our method performs well under realistic conditions. We conclude that previously reported fitness landscape ruggedness is likely upward biased owing to the negligence of fitness estimation error and advise that future fitness landscape mapping should include at least three biological replicates to permit an unbiased inference of the ruggedness.
Affiliation(s)
- Siliang Song
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, 48109
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, 48109
13
Abstract
Bacterial protein synthesis rates have evolved to maintain preferred stoichiometries at striking precision, from the components of protein complexes to constituents of entire pathways. Setting relative protein production rates to be well within a factor of two requires concerted tuning of transcription, RNA turnover, and translation, allowing many potential regulatory strategies to achieve the preferred output. The last decade has seen a greatly expanded capacity for precise interrogation of each step of the central dogma genome-wide. Here, we summarize how these technologies have shaped the current understanding of diverse bacterial regulatory architectures underpinning stoichiometric protein synthesis. We focus on the emerging expanded view of bacterial operons, which encode diverse primary and secondary mRNA structures for tuning protein stoichiometry. Emphasis is placed on how quantitative tuning is achieved. We discuss the challenges and open questions in the application of quantitative, genome-wide methodologies to the problem of precise protein production. Expected final online publication date for the Annual Review of Microbiology, Volume 75 is October 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- James C Taggart
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Jean-Benoît Lalanne
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Current affiliation: Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Gene-Wei Li
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
14
He MY, Lin YJ, Kao YL, Kuo P, Grauffel C, Lim C, Cheng YS, Chou HHD. Sensitive and Specific Cadmium Biosensor Developed by Reconfiguring Metal Transport and Leveraging Natural Gene Repositories. ACS Sens 2021; 6:995-1002. [PMID: 33444502 DOI: 10.1021/acssensors.0c02204] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 02/08/2023]
Abstract
Whole-cell biosensors are useful for monitoring heavy metal toxicity in public health and ecosystems, but their development has been hindered by intrinsic trade-offs between sensitivity and specificity. Here, we demonstrated an effective engineering solution by building a sensitive, specific, and high-response biosensor for carcinogenic cadmium ions. We genetically programmed the metal transport system of Escherichia coli to enrich intracellular cadmium ions and deprive interfering metal species. We then selected 16 cadmium-sensing transcription factors from the GenBank database and tested their reactivity to 14 metal ions in the engineered E. coli using the expression of the green fluorescent protein as the readout. The resulting cadmium biosensor was highly specific and showed a detection limit of 3 nM, a linear increase in fluorescent intensities from 0 to 200 nM, and a maximal 777-fold signal change. Using this whole-cell biosensor, a smartphone, and low-tech equipment, we developed a simple assay capable of measuring cadmium ions at the same concentration range in irrigation water and human urine. This method is user-friendly and cost-effective, making it affordable to screen large amounts of samples for cadmium toxicity in agriculture and medicine. Moreover, our work highlights natural gene repositories as a treasure chest for bioengineering.
Affiliation(s)
- Mei-Ying He
  - Department of Life Science, National Taiwan University, Taipei 106, Taiwan
- Yu-Jen Lin
  - Department of Life Science, National Taiwan University, Taipei 106, Taiwan
- Yi-Ling Kao
  - Department of Life Science, National Taiwan University, Taipei 106, Taiwan
- Pu Kuo
  - Department of Life Science, National Taiwan University, Taipei 106, Taiwan
- Cédric Grauffel
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Carmay Lim
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Yi-Sheng Cheng
  - Department of Life Science, National Taiwan University, Taipei 106, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei 106, Taiwan
- Hsin-Hung David Chou
  - Department of Life Science, National Taiwan University, Taipei 106, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei 106, Taiwan
15
Wen JD, Kuo ST, Chou HHD. The diversity of Shine-Dalgarno sequences sheds light on the evolution of translation initiation. RNA Biol 2020; 18:1489-1500. [PMID: 33349119 DOI: 10.1080/15476286.2020.1861406] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022] Open
Abstract
Shine-Dalgarno (SD) sequences, the core element of prokaryotic ribosome-binding sites, facilitate mRNA translation by base-pair interaction with the anti-SD (aSD) sequence of 16S rRNA. In contrast to this paradigm, an inspection of thousands of prokaryotic species unravels tremendous SD sequence diversity both within and between genomes, whereas aSD sequences remain largely static. The pattern has led many to suggest unidentified mechanisms for translation initiation. Here we review known translation-initiation pathways in prokaryotes. Moreover, we seek to understand the cause and consequence of SD diversity through surveying recent advances in biochemistry, genomics, and high-throughput genetics. These findings collectively show: (1) SD:aSD base pairing is beneficial but nonessential to translation initiation. (2) The 5' untranslated region of mRNA evolves dynamically and correlates with organismal phylogeny and ecological niches. (3) Ribosomes have evolved distinct usage of translation-initiation pathways in different species. We propose a model portraying the SD diversity shaped by optimization of gene expression, adaptation to environments and growth demands, and the species-specific prerequisite of ribosomes to initiate translation. The model highlights the coevolution of ribosomes and mRNA features, leading to functional customization of the translation apparatus in each organism.
Affiliation(s)
- Jin-Der Wen
  - Institute of Molecular and Cellular Biology, National Taiwan University, Taipei, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
- Syue-Ting Kuo
  - Department of Life Science, National Taiwan University, Taipei, Taiwan
- Hsin-Hung David Chou
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
  - Department of Life Science, National Taiwan University, Taipei, Taiwan