1
Li H, Lv Y, Teng Z, Guo R, Jiang L. Shigella senses the environmental cue leucine to promote its virulence gene expression in the colon. J Mol Biol 2024:168798. [PMID: 39303765 DOI: 10.1016/j.jmb.2024.168798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 09/14/2024] [Accepted: 09/16/2024] [Indexed: 09/22/2024]
Abstract
Shigella is a foodborne enteropathogenic bacterium that causes severe bacillary dysentery in humans. Shigella primarily colonizes the human colon and causes disease via invasion of colon epithelial cells. However, the signal regulatory mechanisms associated with its colonization and pathogenesis in the colon remain poorly defined. Here, we report a leucine-mediated regulatory mechanism that promotes Shigella virulence gene expression and invasion of colon epithelial cells. Shigella responds to leucine, which is highly abundant in the colon, via the leucine-responsive regulator Lrp, and the binding of Lrp to leucine induces the expression of a newly identified small RNA, SsrV. SsrV then activates the expression of virF and downstream invasion-related virulence genes by increasing the protein level of the LysR-type transcription regulator LrhA, thereby enabling Shigella invasion of colon epithelial cells. Shigella lacking ssrV displays impaired invasion ability. Collectively, these findings suggest that Shigella employs a leucine-responsive environmental activation mechanism to establish colonization and pathogenicity.
Affiliation(s)
- Huiying Li
- Department of Biochemistry and Molecular Biology, School of Clinical and Basic Medical Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250062, China; National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin 300457, China
- Yongyao Lv
- Department of Biochemistry and Molecular Biology, School of Clinical and Basic Medical Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250062, China
- Zhiqi Teng
- Department of Biochemistry and Molecular Biology, School of Clinical and Basic Medical Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250062, China
- Rui Guo
- Shandong Center for Food and Drug Evaluation & Inspection, Jinan 250014, China
- Lingyan Jiang
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin 300457, China.
2
Xie P, Xu Y, Tang J, Wu S, Gao H. Multifaceted regulation of siderophore synthesis by multiple regulatory systems in Shewanella oneidensis. Commun Biol 2024; 7:498. [PMID: 38664541 PMCID: PMC11045786 DOI: 10.1038/s42003-024-06193-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] Open
Abstract
Siderophore-dependent iron uptake is a mechanism by which microorganisms scavenge and utilize iron for their survival, growth, and many specialized activities, such as pathogenicity. The siderophore biosynthetic system PubABC in Shewanella can synthesize a series of distinct siderophores, yet how it is regulated in response to iron availability remains largely unexplored. Here, by whole genome screening we identify TCS components histidine kinase (HK) BarA and response regulator (RR) SsoR as positive regulators of siderophore biosynthesis. While BarA partners with UvrY to mediate expression of pubABC post-transcriptionally via the Csr regulatory cascade, SsoR is an atypical orphan RR of the OmpR/PhoB subfamily that activates transcription in a phosphorylation-independent manner. By combining structural analysis and molecular dynamics simulations, we observe conformational changes in OmpR/PhoB-like RRs that illustrate the impact of phosphorylation on dynamic properties, and that SsoR is locked in the 'phosphorylated' state found in phosphorylation-dependent counterparts of the same subfamily. Furthermore, we show that iron homeostasis global regulator Fur, in addition to mediating transcription of its own regulon, acts as the sensor of iron starvation to increase SsoR production when needed. Overall, this study delineates an intricate, multi-tiered transcriptional and post-transcriptional regulatory network that governs siderophore biosynthesis.
Affiliation(s)
- Peilu Xie
- Institute of Microbiology and College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Yuanyou Xu
- Institute of Microbiology and College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Jiaxin Tang
- Institute of Microbiology and College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Shihua Wu
- Institute of Microbiology and College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China.
- Haichun Gao
- Institute of Microbiology and College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China.
3
Ligeti B, Szepesi-Nagy I, Bodnár B, Ligeti-Nagy N, Juhász J. ProkBERT family: genomic language models for microbiome applications. Front Microbiol 2024; 14:1331233. [PMID: 38282738 PMCID: PMC10810988 DOI: 10.3389/fmicb.2023.1331233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 12/11/2023] [Indexed: 01/30/2024] Open
Abstract
Background: In the evolving landscape of microbiology and microbiome analysis, the integration of machine learning is crucial for understanding complex microbial interactions, and predicting and recognizing novel functionalities within extensive datasets. However, the effectiveness of these methods in microbiology faces challenges due to the complex and heterogeneous nature of microbial data, further complicated by low signal-to-noise ratios, context-dependency, and a significant shortage of appropriately labeled datasets. This study introduces the ProkBERT model family, a collection of large language models designed for genomic tasks. It provides a generalizable sequence representation for nucleotide sequences, learned from unlabeled genome data. This approach helps overcome the above-mentioned limitations in the field, thereby improving our understanding of microbial ecosystems and their impact on health and disease. Methods: ProkBERT models are based on transfer learning and self-supervised methodologies, enabling them to use the abundant yet complex microbial data effectively. The introduction of the novel Local Context-Aware (LCA) tokenization technique marks a significant advancement, allowing ProkBERT to overcome the contextual limitations of traditional transformer models. This methodology not only retains rich local context but also demonstrates remarkable adaptability across various bioinformatics tasks. Results: In practical applications such as promoter prediction and phage identification, the ProkBERT models show superior performance. For promoter prediction tasks, the top-performing model achieved a Matthews Correlation Coefficient (MCC) of 0.74 for E. coli and 0.62 in mixed-species contexts. In phage identification, ProkBERT models consistently outperformed established tools like VirSorter2 and DeepVirFinder, achieving an MCC of 0.85. These results underscore the models' exceptional accuracy and generalizability in both supervised and unsupervised tasks. Conclusions: The ProkBERT model family is a compact yet powerful tool in the field of microbiology and bioinformatics. Its capacity for rapid, accurate analyses and its adaptability across a spectrum of tasks mark a significant advancement in machine learning applications in microbiology. The models are available on GitHub (https://github.com/nbrg-ppcu/prokbert) and HuggingFace (https://huggingface.co/nerualbioinfo), providing an accessible tool for the community.
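The abstract reports its results as Matthews Correlation Coefficients. As a reminder of what that metric measures, here is a minimal sketch that computes MCC from confusion-matrix counts. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Compute MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention
    when one row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a promoter classifier (not from the paper).
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
print(f"MCC = {mcc:.3f}")
```

MCC ranges from -1 to 1 and, unlike plain accuracy, stays informative when the positive and negative classes are unbalanced.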
Affiliation(s)
- Balázs Ligeti
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- István Szepesi-Nagy
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- Babett Bodnár
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- Noémi Ligeti-Nagy
- Language Technology Research Group, HUN-REN Hungarian Research Centre for Linguistics, Budapest, Hungary
- János Juhász
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- Institute of Medical Microbiology, Semmelweis University, Budapest, Hungary
4
Guan J, Zhou W, Guo J, Zheng L, Lu G, Hua F, Liu M, Ji X, Sun Y, Zhu L, Guo X. A Wohlfahrtiimonas chitiniclastica with a novel type of blaVEB-1-carrying plasmid isolated from a zebra in China. Front Microbiol 2023; 14:1276314. [PMID: 38029080 PMCID: PMC10656743 DOI: 10.3389/fmicb.2023.1276314] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/11/2023] [Accepted: 10/23/2023] [Indexed: 12/01/2023] Open
Abstract
Background: Wohlfahrtiimonas chitiniclastica is an emerging fly-borne zoonotic pathogen that causes infections in immunocompromised patients and some animals. Herein, we report W. chitiniclastica strain BM-Y, isolated from a dead zebra in China. Methods: Complete genome sequencing of BM-Y showed that this isolate carried one chromosome and one novel type of blaVEB-1-carrying plasmid. Detailed genetic dissection was applied to this plasmid to display the genetic environment of blaVEB-1. Results: Three novel insertion sequence (IS) elements, namely ISWoch1, ISWoch2, and ISWoch3, were found in this plasmid. aadB, aacA1, and gcuG were located downstream of blaVEB-1, composing a gene cassette array blaVEB-1-aadB-aacA1-gcuG bracketed by an intact ISWoch1 and a truncated one, which was named the blaVEB-1 region. The 5'-RACE experiments revealed that the transcription start site of the blaVEB-1 region was located in the intact ISWoch1 and that this IS provided a strong promoter for the blaVEB-1 region. Conclusion: The spread of the blaVEB-1-carrying plasmid might enhance the ability of W. chitiniclastica to survive under drug selection pressure and make infections caused by blaVEB-1-carrying W. chitiniclastica harder to treat. To the best of our knowledge, this is the first report of the genetic characterization of a novel blaVEB-1-carrying plasmid with new ISs from W. chitiniclastica.
Affiliation(s)
- Jiayao Guan
- College of Veterinary Medicine, Jilin Agricultural University, Changchun, China
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Wei Zhou
- Center for Animal Disease Control and Prevention of Ordos, Ordos, China
- Jingyi Guo
- The Second Hospital of Jilin University, Jilin University, Changchun, China
- Lin Zheng
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Gejin Lu
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Fuyou Hua
- Shenzhen Safari Park, Shenzhen, China
- Mingwei Liu
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Xue Ji
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Yang Sun
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Lingwei Zhu
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
- Xuejun Guo
- Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, China
5
van Lent P, Schmitz J, Abeel T. Simulated Design-Build-Test-Learn Cycles for Consistent Comparison of Machine Learning Methods in Metabolic Engineering. ACS Synth Biol 2023; 12:2588-2599. [PMID: 37616156 PMCID: PMC10510747 DOI: 10.1021/acssynbio.3c00186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Indexed: 08/25/2023]
Abstract
Combinatorial pathway optimization is an important tool in metabolic flux optimization. Simultaneous optimization of a large number of pathway genes often leads to combinatorial explosions. Strain optimization is therefore often performed using iterative design-build-test-learn (DBTL) cycles. The aim of these cycles is to develop a product strain iteratively, each time incorporating what was learned in the previous cycle. Machine learning methods provide a potentially powerful tool to learn from data and propose new designs for the next DBTL cycle. However, because there is no framework for consistently testing the performance of machine learning methods over multiple DBTL cycles, evaluating the effectiveness of these methods remains a challenge. In this work, we propose a mechanistic kinetic model-based framework to test and optimize machine learning for iterative combinatorial pathway optimization. Using this framework, we show that gradient boosting and random forest models outperform the other tested methods in the low-data regime. We demonstrate that these methods are robust to training set biases and experimental noise. Finally, we introduce an algorithm for recommending new designs using machine learning model predictions. We show that when the number of strains to be built is limited, starting with a large initial DBTL cycle is favorable over building the same number of strains for every cycle.
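To make the "combinatorial explosion" mentioned above concrete, the following sketch counts how many strain designs exist when each pathway gene can take one of several expression levels. The choice of five levels per gene and the function name are illustrative assumptions; the paper's actual design space is not described in this abstract.

```python
import math
from itertools import product


def design_space_size(levels_per_gene: list[int]) -> int:
    """Number of distinct designs when gene i has levels_per_gene[i] options."""
    return math.prod(levels_per_gene)


# Assumed: every gene can be tuned to one of 5 expression levels.
for n_genes in (3, 6, 10):
    print(n_genes, "genes ->", design_space_size([5] * n_genes), "designs")

# For a small pathway the full design list can still be enumerated.
designs = list(product(range(5), repeat=3))
print(len(designs), "enumerated designs for 3 genes")
```

The count grows exponentially with the number of genes, which is why only a small sample of designs can be built per cycle and why a model is needed to pick the next batch.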
Affiliation(s)
- Paul van Lent
- Delft Bioinformatics Lab, Delft University of Technology Van Mourik, Delft 2628 XE, The Netherlands
- Joep Schmitz
- Department of Science and Research, Joep Schmitz - dsm-firmenich, Science & Research, P.O. Box 1, 2600 MA Delft, The Netherlands
- Thomas Abeel
- Delft Bioinformatics Lab, Delft University of Technology Van Mourik, Delft 2628 XE, The Netherlands
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
6
Xue F, Ma X, Luo C, Li D, Shi G, Li Y. Construction of a bacteriophage-derived recombinase system in Bacillus licheniformis for gene deletion. AMB Express 2023; 13:89. [PMID: 37633871 PMCID: PMC10460339 DOI: 10.1186/s13568-023-01589-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Accepted: 07/29/2023] [Indexed: 08/28/2023] Open
Abstract
Bacillus licheniformis and its related strains have found extensive applications in diverse industries, agriculture, and medicine. However, the current breeding methods for this strain primarily rely on natural screening and traditional mutagenesis. The limited availability of efficient genetic engineering tools, particularly recombination techniques, has hindered further advancements in its applications. In this study, we conducted a comprehensive investigation to identify and characterize a recombinase, RecT, derived from a Bacillus phage. Remarkably, the recombinase exhibited a 105-fold enhancement in the recombination efficiency of the strain. To facilitate genome editing, we developed a system based on the conditional expression of RecT using a rhamnose-inducible promoter (Prha). The efficacy of this system was evaluated by deleting the amyL gene, which encodes an α-amylase. Our findings revealed that the induction time and concentration of rhamnose, along with the generation time of the strain, significantly influenced the editing efficiency. Optimal conditions for genome editing were determined as follows: the wild-type strain was initially transformed with the genome editing plasmid, followed by cultivation and induction with 1.5% rhamnose for 8 h. Subsequently, the strain was further cultured for an additional 24 h, equivalent to approximately three generations. Consequently, the recombination efficiency reached an impressive 16.67%. This study represents a significant advancement in enhancing the recombination efficiency of B. licheniformis through the utilization of a RecT-based recombination system. Moreover, it provides a highly effective genome editing tool for genetic engineering applications in this strain.
Affiliation(s)
- Fang Xue
- Key Laboratory of Chinese Cigar Fermentation, Cigar Technology Innovation Center of China Tobacco, Tobacco Sichuan Industrial Co., Ltd, Chengdu, 610000, P. R. China
- Xufan Ma
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu, 214122, P. R. China
- Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu, 214122, P. R. China
- Cheng Luo
- Key Laboratory of Chinese Cigar Fermentation, Cigar Technology Innovation Center of China Tobacco, Tobacco Sichuan Industrial Co., Ltd, Chengdu, 610000, P. R. China
- Dongliang Li
- Key Laboratory of Chinese Cigar Fermentation, Cigar Technology Innovation Center of China Tobacco, Tobacco Sichuan Industrial Co., Ltd, Chengdu, 610000, P. R. China
- Guiyang Shi
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu, 214122, P. R. China
- Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu, 214122, P. R. China
- Youran Li
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu, 214122, P. R. China.
- Jiangsu Provincial Engineering Research Center for Bioactive Product Processing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu, 214122, P. R. China.
7
Zaytsev K, Fedorov A, Korotkov E. Classification of Promoter Sequences from Human Genome. Int J Mol Sci 2023; 24:12561. [PMID: 37628742 PMCID: PMC10454140 DOI: 10.3390/ijms241612561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 07/28/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023] Open
Abstract
We have developed a new method for promoter sequence classification based on a genetic algorithm and the MAHDS sequence alignment method. We have created four classes of human promoters, combining 17,310 sequences out of the 29,598 present in the EPD database. We searched the human genome for potential promoter sequences (PPSs) using dynamic programming and position weight matrices representing each of the promoter sequence classes. A total of 3,065,317 potential promoter sequences were found. Only 1,241,206 of them were located in unannotated parts of the human genome. All remaining PPSs intersected with true promoters, transposable elements, or interspersed repeats. We found a strong intersection between PPSs and Alu elements as well as transcript start sites. The number of false positive PPSs is estimated to be 3 × 10⁻⁸ per nucleotide, which is several orders of magnitude lower than for any other promoter prediction method. The developed method can be used to search for PPSs in various eukaryotic genomes.
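The search described above relies on position weight matrices. The sketch below shows the basic idea with a deliberately simplified, gapless sliding-window scan; it is not the dynamic-programming alignment or the MAHDS method used by the authors. The toy training sequences, pseudocount, uniform background, and threshold are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(aligned, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (list of per-position dicts) from equal-length sequences."""
    pwm = []
    for i in range(len(aligned[0])):
        column = [seq[i] for seq in aligned]
        total = len(column) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window whose PWM score reaches the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy motif instances and target sequence (hypothetical data).
pwm = build_pwm(["TATAAT", "TATAAT", "TACAAT", "TATATT"])
hits = scan("GGCGTATAATGCGC", pwm, threshold=5.0)
print(hits)
```

A gapless scan like this cannot handle insertions or deletions inside the motif, which is one reason a dynamic-programming formulation is used for longer, more variable promoter classes.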
Affiliation(s)
- Konstantin Zaytsev
- Bach Institute of Biochemistry, Federal Research Center of Biotechnology of the Russian Academy of Sciences, 119071 Moscow, Russia
- Alexey Fedorov
- Bach Institute of Biochemistry, Federal Research Center of Biotechnology of the Russian Academy of Sciences, 119071 Moscow, Russia
- Eugene Korotkov
- Institute of Bioengineering, Federal Research Center of Biotechnology of the Russian Academy of Sciences, 119071 Moscow, Russia
8
Yang J, Son Y, Kang M, Park W. AamA-mediated epigenetic control of genome-wide gene expression and phenotypic traits in Acinetobacter baumannii ATCC 17978. Microb Genom 2023; 9:mgen001093. [PMID: 37589545 PMCID: PMC10483419 DOI: 10.1099/mgen.0.001093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2022] [Accepted: 08/03/2023] [Indexed: 08/18/2023] Open
Abstract
Individual deletions of three genes encoding orphan DNA methyltransferases resulted in a growth defect only in the aamA (encoding Acinetobacter Adenine Methylase A) mutant of A. baumannii strain ATCC 17978. Our single-molecule real-time sequencing-based methylome analysis revealed multiple AamA-mediated DNA methylation sites and proposed a potential consensus target motif (TTTRAATTYAAA). Loss of Dam led to modulation of genome-wide gene expression, and several Dam-target sites including the promoter region of the trmD operon (rpsP, rimM, trmD, and rplS) were identified through our methylome and transcriptome analyses. AamA methylation also appeared to control the expression of many genes linked to membrane functions (lolAB, lpxO), replication (dnaA) and protein synthesis (trmD operon) in the strain ATCC 17978. Interestingly, cellular resistance to several antibiotics and ethidium bromide through the function of efflux pumps diminished in the absence of the aamA gene, and complementation with the aamA gene restored the wild-type phenotypes. Other tested phenotypic traits such as outer-membrane vesicle production, biofilm formation and virulence were also affected in the aamA mutant. Collectively, our data indicate that epigenetic regulation through AamA-mediated DNA methylation of novel target sites, mostly in regulatory regions, could contribute significantly to changes in multiple phenotypic traits in A. baumannii ATCC 17978.
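The target motif TTTRAATTYAAA uses IUPAC degenerate codes (R = A or G, Y = C or T). A short sketch of how such a motif could be located in a sequence is given below; the helper names and the toy sequence are hypothetical, and only the forward strand is searched.

```python
import re

# Subset of IUPAC nucleotide codes needed for this motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[c] for c in motif.upper())
    return re.compile(f"(?=({body}))")


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every motif occurrence."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


# Toy sequence with two motif instances (hypothetical, not genomic data).
genome = "CCG" + "TTTAAATTCAAA" + "GG" + "TTTGAATTTAAA" + "CC"
positions = find_motif(genome, "TTTRAATTYAAA")
print(positions)
```

For a real methylome analysis the candidate positions would then be compared against the modification calls from the sequencing data, which this sketch does not attempt.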
Affiliation(s)
- Jihye Yang
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul, Republic of Korea
- Yongjun Son
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul, Republic of Korea
- Mingyeong Kang
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul, Republic of Korea
- Woojun Park
- Laboratory of Molecular Environmental Microbiology, Department of Environmental Science and Ecological Engineering, Korea University, Seoul, Republic of Korea
9
Huttanus HM, Triola EKH, Velasquez-Guzman JC, Shin SM, Granja-Travez RS, Singh A, Dale T, Jha RK. Targeted mutagenesis and high-throughput screening of diversified gene and promoter libraries for isolating gain-of-function mutations. Front Bioeng Biotechnol 2023; 11:1202388. [PMID: 37545889 PMCID: PMC10400447 DOI: 10.3389/fbioe.2023.1202388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Accepted: 06/25/2023] [Indexed: 08/08/2023] Open
Abstract
Targeted mutagenesis of a promoter or gene is essential for attaining new functions in microbial and protein engineering efforts. In the burgeoning field of synthetic biology, heterologous genes are expressed in new host organisms. Similarly, natural or designed proteins are mutagenized at targeted positions and screened for gain-of-function mutations. Here, we describe methods to attain complete randomization or controlled mutations in promoters or genes. Combinatorial libraries of hundreds of thousands to tens of millions of variants can be created using commercially synthesized oligonucleotides, simply by performing two rounds of polymerase chain reactions. With a suitably engineered reporter in a whole cell, these libraries can be screened rapidly by performing fluorescence-activated cell sorting (FACS). Within a few rounds of positive and negative sorting based on the response from the reporter, the library can rapidly converge to a few optimal or extremely rare variants with desired phenotypes. Library construction, transformation and sequence verification take 6-9 days and require only basic molecular biology lab experience. Screening the library by FACS takes 3-5 days and requires training for the specific cytometer used. Further steps after sorting, including colony picking, sequencing, verification, and characterization of individual clones, may take longer, depending on the number of clones and the required experiments.
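The library sizes quoted above follow directly from how many positions are randomized. As a rough illustration, the sketch below computes the theoretical diversity for fully randomized nucleotide positions; the function name and the chosen position counts are assumptions, not values from the protocol.

```python
def library_size(n_positions: int, options_per_position: int = 4) -> int:
    """Theoretical number of variants when each position is independently randomized."""
    return options_per_position ** n_positions


# Fully randomized nucleotides (N = A/C/G/T) at 9, 10 and 12 positions.
for n in (9, 10, 12):
    print(f"{n} randomized positions -> {library_size(n):,} variants")
```

Nine randomized positions already give 262,144 variants and twelve give about 16.8 million, which matches the "hundreds of thousands to tens of millions" range. For codon-level randomization with NNK codons, `options_per_position` would be 32 per codon instead of 4 per nucleotide.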
Affiliation(s)
- Herbert M. Huttanus
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Agile BioFoundry, Emeryville, CA, United States
- Ellin-Kristina H. Triola
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Agile BioFoundry, Emeryville, CA, United States
- Jeanette C. Velasquez-Guzman
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Agile BioFoundry, Emeryville, CA, United States
- Sang-Min Shin
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- BOTTLE Consortium, Golden, CO, United States
- Rommel S. Granja-Travez
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- BOTTLE Consortium, Golden, CO, United States
- Anmoldeep Singh
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Taraka Dale
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Agile BioFoundry, Emeryville, CA, United States
- BOTTLE Consortium, Golden, CO, United States
- Ramesh K. Jha
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Agile BioFoundry, Emeryville, CA, United States
- BOTTLE Consortium, Golden, CO, United States
10
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023] Open
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Martin H Rau
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Benjamín J Sánchez
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Kristian Jensen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Ahmad A Zeidan
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
11
Ni CE, Doan DP, Chiu YJ, Huang YH. TSSUNet-MB - ab initio identification of σ70 promoter transcription start sites in Escherichia coli using deep multitask learning. Comput Biol Chem 2023; 105:107904. [PMID: 37327560 DOI: 10.1016/j.compbiolchem.2023.107904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Revised: 03/22/2023] [Accepted: 06/09/2023] [Indexed: 06/18/2023]
Abstract
MOTIVATION: Computational promoter prediction (CPP) tools designed to classify prokaryotic promoter regions usually assume that a transcription start site (TSS) is located at a predefined position within each promoter region. Such CPP tools are sensitive to any positional shifting of the TSS in a windowed region, and they are unsuitable for determining the boundaries of prokaryotic promoters. RESULTS: TSSUNet-MB is a deep learning model developed to identify the TSSs of σ70 promoters. Mononucleotide and bendability features were used to encode input sequences. TSSUNet-MB outperforms other CPP tools when assessed using sequences obtained from the neighborhood of real promoters. TSSUNet-MB achieved a sensitivity of 0.839 and a specificity of 0.768 on sliding sequences, while other CPP tools cannot keep both sensitivity and specificity in a comparable range. Furthermore, TSSUNet-MB can precisely predict the TSS position of σ70 promoter-containing regions with a 10-base accuracy of 77.6%. By leveraging the sliding window scanning approach, we further computed the confidence score of each predicted TSS, which allows TSS locations to be identified more accurately. Our results suggest that TSSUNet-MB is a robust tool for finding σ70 promoters and identifying TSSs.
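The "10-base accuracy" figure is not defined in this abstract. One plausible reading, assumed here, is the fraction of predicted TSS positions that fall within 10 bases of the annotated TSS. The sketch below implements that assumed definition; the function name and the example coordinates are hypothetical.

```python
def within_k_accuracy(predicted, annotated, k=10):
    """Fraction of predictions lying within k bases of the paired annotated position.

    Assumes predicted[i] and annotated[i] refer to the same promoter region.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, annotated) if abs(p - a) <= k)
    return hits / len(predicted)


# Hypothetical TSS coordinates for four promoter regions.
pred_tss = [100, 212, 305, 498]
true_tss = [103, 200, 330, 498]
accuracy = within_k_accuracy(pred_tss, true_tss, k=10)
print(f"10-base accuracy: {accuracy:.2f}")
```

A tolerance-based metric like this rewards near misses, which suits TSS prediction because annotated start sites themselves often carry a few bases of uncertainty.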
Affiliation(s)
- Chung-En Ni
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Duy-Phuong Doan
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Yen-Jung Chiu
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Yen-Hua Huang
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan; Center for Systems and Synthetic Biology, National Yang Ming Chiao Tung University, Taipei, Taiwan.
|
12
|
Barbero-Aparicio JA, Olivares-Gil A, Díez-Pastor JF, García-Osorio C. Deep learning and support vector machines for transcription start site identification. PeerJ Comput Sci 2023; 9:e1340. [PMID: 37346545 PMCID: PMC10280436 DOI: 10.7717/peerj-cs.1340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 03/21/2023] [Indexed: 06/23/2023]
Abstract
Recognizing transcription start sites is key to gene identification. Several approaches have been employed in related problems such as detecting translation initiation sites or promoters, many of the most recent ones based on machine learning. Deep learning methods have proven exceptionally effective for this task, but their use in transcription start site identification has not yet been explored in depth. Also, the very few existing works neither compare their methods to support vector machines (SVMs), the most established technique in this area of study, nor provide the curated dataset used in the study. The small number of published papers on this specific problem could be explained by this lack of datasets. Given that both support vector machines and deep neural networks have been applied in related problems with remarkable results, we compared their performance in transcription start site prediction, concluding that SVMs are computationally much slower and that deep learning methods, especially long short-term memory neural networks (LSTMs), are better suited than SVMs to working with sequences. For this purpose, we used the reference human genome GRCh38. Additionally, we studied two different aspects related to data processing: the proper way to generate training examples and the imbalanced nature of the data. Furthermore, the generalization performance of the models studied was also tested using the mouse genome, where the LSTM neural network stood out from the rest of the algorithms. To sum up, this article provides an analysis of the best architecture choices in transcription start site identification, as well as a method to generate transcription start site datasets, including negative instances, for any species available in Ensembl. We found that deep learning methods are better suited than SVMs to solving this problem, being more efficient and better adapted to long sequences and large amounts of data. We also created a transcription start site (TSS) dataset large enough to be used in deep learning experiments.
Affiliation(s)
| | - Alicia Olivares-Gil
- Departamento de Ingeniería Informática, Universidad de Burgos, Burgos, Spain
| | - José F. Díez-Pastor
- Departamento de Ingeniería Informática, Universidad de Burgos, Burgos, Spain
| | - César García-Osorio
- Departamento de Ingeniería Informática, Universidad de Burgos, Burgos, Spain
|
13
|
Patel VK, Das A, Kumari R, Kajla S. Recent progress and challenges in CRISPR-Cas9 engineered algae and cyanobacteria. ALGAL RES 2023. [DOI: 10.1016/j.algal.2023.103068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
|
14
|
Database of Potential Promoter Sequences in the Capsicum annuum Genome. BIOLOGY 2022; 11:biology11081117. [PMID: 35892972 PMCID: PMC9332048 DOI: 10.3390/biology11081117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 07/19/2022] [Accepted: 07/23/2022] [Indexed: 11/16/2022]
Abstract
In this study, we used a mathematical method for the multiple alignment of highly divergent sequences (MAHDS) to create a database of potential promoter sequences (PPSs) in the Capsicum annuum genome. To search for PPSs, 20 statistically significant classes of sequences located in the range from −499 to +100 nucleotides near the annotated genes were calculated. For each class, a position–weight matrix (PWM) was computed and then used to identify PPSs in the C. annuum genome. In total, 825,136 PPSs were detected, with a false positive rate of 0.13%. The PPSs obtained with the MAHDS method were tested using TSSFinder, which detects transcription start sites. The databank of the found PPSs provides their coordinates in chromosomes, the alignment of each PPS with the PWM, and the level of statistical significance as a normal distribution argument, and can be used in genetic engineering and biotechnology.
|
15
|
PromoterLCNN: A Light CNN-Based Promoter Prediction and Classification Model. Genes (Basel) 2022; 13:genes13071126. [PMID: 35885909 PMCID: PMC9325283 DOI: 10.3390/genes13071126] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/23/2022] [Revised: 06/15/2022] [Accepted: 06/20/2022] [Indexed: 01/01/2023] Open
Abstract
Promoter identification is a fundamental step in understanding bacterial gene regulation mechanisms. However, accurate and fast classification of bacterial promoters continues to be challenging. New methods based on deep convolutional networks have been applied to identify and classify bacterial promoters recognized by sigma (σ) factors and RNA polymerase subunits which increase affinity to specific DNA sequences to modulate transcription and respond to nutritional or environmental changes. This work presents a new multiclass promoter prediction model by using convolutional neural networks (CNNs), denoted as PromoterLCNN, which classifies Escherichia coli promoters into subclasses σ70, σ24, σ32, σ38, σ28, and σ54. We present a light, fast, and simple two-stage multiclass CNN architecture for promoter identification and classification. Training and testing were performed on a benchmark dataset, part of RegulonDB. Comparative performance of PromoterLCNN against other CNN-based classifiers using four parameters (Acc, Sn, Sp, MCC) resulted in similar or better performance than those that commonly use cascade architecture, reducing time by approximately 30–90% for training, prediction, and hyperparameter optimization without compromising classification quality.
|
16
|
Wei PJ, Pang ZZ, Jiang LJ, Tan D, Su Y, Zheng CH. Promoter Prediction in Nannochloropsis Based on Densely Connected Convolutional Neural Networks. Methods 2022; 204:38-46. [DOI: 10.1016/j.ymeth.2022.03.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2022] [Revised: 03/03/2022] [Accepted: 03/28/2022] [Indexed: 10/18/2022] Open
|
17
|
Tietze L, Mangold A, Hoff MW, Lale R. Identification and Cross-Characterisation of Artificial Promoters and 5' Untranslated Regions in Vibrio natriegens. Front Bioeng Biotechnol 2022; 10:826142. [PMID: 35155395 PMCID: PMC8830501 DOI: 10.3389/fbioe.2022.826142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Accepted: 01/07/2022] [Indexed: 11/13/2022] Open
Abstract
Vibrio natriegens has recently gained attention as a novel fast-growing bacterium in synthetic biology applications. Currently, a limited set of genetic elements optimised for Escherichia coli is used in V. natriegens due to the lack of DNA parts characterised in this novel host. In this study, we report the identification and cross-characterisation of artificial promoters and 5' untranslated regions (artificial regulatory sequences, ARESs) that lead to production of fluorescent proteins with a wide range of expression levels. We identify and cross-characterise 52 constructs in V. natriegens and E. coli. Furthermore, we report the DNA sequence and motif analysis of the ARESs using various algorithms. With this study, we expand the pool of characterised genetic DNA parts that can be used for different biotechnological applications using V. natriegens as a host microorganism.
Affiliation(s)
| | | | | | - Rahmi Lale
- Department of Biotechnology and Food Science, Faculty of Natural Sciences, Norwegian University of Science and Technology, Trondheim, Norway
|
18
|
Lagator M, Sarikas S, Steinrueck M, Toledo-Aparicio D, Bollback JP, Guet CC, Tkačik G. Predicting bacterial promoter function and evolution from random sequences. eLife 2022; 11:64543. [PMID: 35080492 PMCID: PMC8791639 DOI: 10.7554/elife.64543] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/02/2020] [Accepted: 01/09/2022] [Indexed: 12/12/2022] Open
Abstract
Predicting function from sequence is a central problem of biology. Currently, this is possible only locally in a narrow mutational neighborhood around a wildtype sequence rather than globally from any sequence. Using random mutant libraries, we developed a biophysical model that accounts for multiple features of σ70 binding bacterial promoters to predict constitutive gene expression levels from any sequence. We experimentally and theoretically estimated that 10–20% of random sequences lead to expression and ~80% of non-expressing sequences are one mutation away from a functional promoter. The potential for generating expression from random sequences is so pervasive that selection acts against σ70-RNA polymerase binding sites even within inter-genic, promoter-containing regions. This pervasiveness of σ70-binding sites implies that emergence of promoters is not the limiting step in gene regulatory evolution. Ultimately, the inclusion of novel features of promoter function into a mechanistic model enabled not only more accurate predictions of gene expression levels, but also identified that promoters evolve more rapidly than previously thought.
Affiliation(s)
- Mato Lagator
- School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom; Institute of Science and Technology Austria, Klosterneuburg, Austria
| | - Srdjan Sarikas
- Institute of Science and Technology Austria, Klosterneuburg, Austria; Center for Physiology and Pharmacology, Medical University of Vienna, Klosterneuburg, Austria
| | | | | | - Jonathan P Bollback
- Institute of Integrative Biology, Functional and Comparative Genomics, University of Liverpool, Liverpool, United Kingdom
| | - Calin C Guet
- Institute of Science and Technology Austria, Klosterneuburg, Austria
| | - Gašper Tkačik
- Institute of Science and Technology Austria, Klosterneuburg, Austria
|
19
|
Bonidia RP, Domingues DS, Sanches DS, de Carvalho ACPLF. MathFeature: feature extraction package for DNA, RNA and protein sequences based on mathematical descriptors. Brief Bioinform 2022; 23:bbab434. [PMID: 34750626 PMCID: PMC8769707 DOI: 10.1093/bib/bbab434] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/29/2021] [Revised: 09/18/2021] [Accepted: 09/20/2021] [Indexed: 12/24/2022] Open
Abstract
One of the main challenges in applying machine learning algorithms to biological sequence data is how to represent a sequence as a numeric input vector. Feature extraction techniques capable of extracting numerical information from biological sequences have been reported in the literature. However, many of these techniques, such as mathematical descriptors, are not available in existing packages. This paper presents a new package, MathFeature, which implements mathematical descriptors able to extract relevant numerical information from biological sequences, i.e. DNA, RNA and proteins (prediction of structural features along the primary sequence of amino acids). MathFeature makes available 20 numerical feature extraction descriptors based on approaches found in the literature, e.g. multiple numeric mappings, genomic signal processing, chaos game theory, entropy and complex networks. MathFeature also allows the extraction of alternative features, complementing the existing packages. To ensure that our descriptors are robust and to assess their relevance, experimental results are presented in nine case studies. According to these results, the features extracted by MathFeature showed high performance (accuracy of 0.6350-0.9897), both when applying only mathematical descriptors and when hybridizing them with well-known descriptors from the literature. Finally, through MathFeature, we outperformed several studies on eight benchmark datasets, exemplifying the robustness and viability of the proposed package. MathFeature has advanced the area by bringing descriptors not available in other packages, as well as allowing non-experts to use feature extraction techniques.
Affiliation(s)
- Robson P Bonidia
- Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos 13566-590, Brazil
| | - Douglas S Domingues
- Group of Genomics and Transcriptomes in Plants, Institute of Biosciences, São Paulo State University (UNESP), Rio Claro 13506-900, Brazil
| | - Danilo S Sanches
- Department of Computer Science, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio 86300-000, Brazil
| | - André C P L F de Carvalho
- Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos 13566-590, Brazil
|
20
|
Bacillimidazoles A-F, Imidazolium-Containing Compounds Isolated from a Marine Bacillus. Mar Drugs 2022; 20:md20010043. [PMID: 35049898 PMCID: PMC8779896 DOI: 10.3390/md20010043] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/06/2021] [Revised: 12/22/2021] [Accepted: 12/22/2021] [Indexed: 01/11/2023] Open
Abstract
Chemical investigations of a marine sponge-associated Bacillus revealed six new imidazolium-containing compounds, bacillimidazoles A-F (1-6). Previous reports of related imidazolium-containing natural products are rare. Initially unveiled by timsTOF (trapped ion mobility spectrometry) MS data, extensive HRMS and 1D and 2D NMR analyses enabled the structural elucidation of 1-6. In addition, a plausible biosynthetic pathway to bacillimidazoles is proposed based on isotopic labeling experiments and invokes the highly reactive glycolytic adduct 2,3-butanedione. Combined, the results of structure elucidation efforts, isotopic labeling studies and bioinformatics suggest that 1-6 result from a fascinating intersection of primary and secondary metabolic pathways in Bacillus sp. WMMC1349. Antimicrobial assays revealed that, of 1-6, only compound six displayed discernible antibacterial activity, despite the close structural similarities shared by all six natural products.
|
21
|
Chevez-Guardado R, Peña-Castillo L. Promotech: a general tool for bacterial promoter recognition. Genome Biol 2021; 22:318. [PMID: 34789306 PMCID: PMC8597233 DOI: 10.1186/s13059-021-02514-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/31/2020] [Accepted: 10/11/2021] [Indexed: 12/14/2022] Open
Abstract
Promoters are genomic regions where the transcription machinery binds to initiate the transcription of specific genes. Computational tools for identifying bacterial promoters have been around for decades. However, most of these tools were designed to recognize promoters in one or few bacterial species. Here, we present Promotech, a machine-learning-based method for promoter recognition in a wide range of bacterial species. We compare Promotech's performance with the performance of five other promoter prediction methods. Promotech outperforms these other programs in terms of area under the precision-recall curve (AUPRC) or precision at the same level of recall. Promotech is available at https://github.com/BioinformaticsLabAtMUN/PromoTech.
Affiliation(s)
- Ruben Chevez-Guardado
- Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, Newfoundland, A1C 5S7, Canada
| | - Lourdes Peña-Castillo
- Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, Newfoundland, A1C 5S7, Canada; Department of Biology, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, Newfoundland, A1C 5S7, Canada
|
22
|
Wilson EH, Groom JD, Sarfatis MC, Ford SM, Lidstrom ME, Beck DAC. A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets. ACS Synth Biol 2021; 10:1394-1405. [PMID: 33988977 DOI: 10.1021/acssynbio.1c00017] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 01/30/2023]
Abstract
Engineering microorganisms into biological factories that convert renewable feedstocks into valuable materials is a major goal of synthetic biology; however, for many nonmodel organisms, we do not yet have the genetic tools, such as suites of strong promoters, necessary to effectively engineer them. In this work, we developed a computational framework that can leverage standard RNA-seq data sets to identify sets of constitutive, strongly expressed genes and predict strong promoter signals within their upstream regions. The framework was applied to a diverse collection of RNA-seq data measured for the methanotroph Methylotuvimicrobium buryatense 5GB1 and identified 25 genes that were constitutively, strongly expressed across 12 experimental conditions. For each gene, the framework predicted short (27-30 nucleotide) sequences as candidate promoters and derived -35 and -10 consensus promoter motifs (TTGACA and TATAAT, respectively) for strong expression in M. buryatense. This consensus closely matches the canonical E. coli sigma-70 motif and was found to be enriched in promoter regions of the genome. A subset of promoter predictions was experimentally validated in a XylE reporter assay, including the consensus promoter, which showed high expression. The pmoC, pqqA, and ssrA promoter predictions were additionally screened in an experiment that scrambled the -35 and -10 signal sequences, confirming that transcription initiation was disrupted when these specific regions of the predicted sequence were altered. These results indicate that the computational framework can make biologically meaningful promoter predictions and identify key pieces of regulatory systems that can serve as foundational tools for engineering diverse microorganisms for biomolecule production.
Affiliation(s)
- Erin H. Wilson
- The Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Joseph D. Groom
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| | - M. Claire Sarfatis
- Department of Microbiology, University of Washington, Seattle, Washington 98195, United States
| | - Stephanie M. Ford
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Mary E. Lidstrom
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
- Department of Microbiology, University of Washington, Seattle, Washington 98195, United States
| | - David A. C. Beck
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
- eScience Institute, University of Washington, Seattle, Washington 98195, United States
|
23
|
Lei L, Burton ZF. Early Evolution of Transcription Systems and Divergence of Archaea and Bacteria. Front Mol Biosci 2021; 8:651134. [PMID: 34026831 PMCID: PMC8131849 DOI: 10.3389/fmolb.2021.651134] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/08/2021] [Accepted: 04/06/2021] [Indexed: 11/13/2022] Open
Abstract
DNA template-dependent multi-subunit RNA polymerases (RNAPs) found in all three domains of life and some viruses are of the two-double-Ψ-β-barrel (DPBB) type. The 2-DPBB protein format is also found in some RNA template-dependent RNAPs and a major replicative DNA template-dependent DNA polymerase (DNAP) from Archaea (PolD). The 2-DPBB family of RNAPs and DNAPs probably evolved prior to the last universal common cellular ancestor (LUCA). Archaeal Transcription Factor B (TFB) and bacterial σ factors include homologous strings of helix-turn-helix units. The consequences of TFB-σ homology are discussed in terms of the evolution of archaeal and bacterial core promoters. Domain-specific DPBB loop inserts functionally connect general transcription factors to the RNAP active site. Archaea appear to be more similar to LUCA than Bacteria. Evolution of bacterial σ factors from TFB appears to have driven divergence of Bacteria from Archaea, splitting the prokaryotic domains.
Affiliation(s)
- Lei Lei
- Department of Biology, University of New England, Biddeford, ME, United States
| | - Zachary F Burton
- Department of Biochemistry and Molecular Biology, Michigan State University, E. Lansing, MI, United States
|
24
|
Ittensohn J, Hemberger J, Griffiths H, Keller M, Albrecht S, Miethke T. Regulation of Expression of the TIR-Containing Protein C Gene of the Uropathogenic Escherichia coli Strain CFT073. Pathogens 2021; 10:pathogens10050549. [PMID: 34062817 PMCID: PMC8147327 DOI: 10.3390/pathogens10050549] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/27/2020] [Revised: 04/23/2021] [Accepted: 04/27/2021] [Indexed: 11/22/2022] Open
Abstract
The uropathogenic Escherichia coli strain CFT073 causes kidney abscesses in mice in a Toll/interleukin-1 receptor domain-containing protein C (TcpC)-dependent manner, and the corresponding gene is present in around 40% of E. coli isolates from pyelonephritis patients. TcpC impairs the Toll-like receptor (TLR) signaling chain and the NACHT leucine-rich repeat PYD protein 3 (NLRP3) inflammasome by binding to TLR4 and myeloid differentiation factor 88 as well as to NLRP3 and caspase-1, respectively. Overexpression of the tcpC gene stopped replication of CFT073. Overexpression of several tcpC-truncation constructs revealed a transmembrane region, while its TIR domain induced filamentous bacteria. Based on these observations, we hypothesized that tcpC expression is tightly controlled. We tested two putative promoters designated P1 and P2 located at 5′ of the gene c2397 and 5′ of the tcpC gene (c2398), respectively, which may form an operon. High pH and increasing glucose concentrations stimulated a P2 reporter construct that was considerably stronger than a P1 reporter construct, while increasing FeSO4 concentrations suppressed their activity. Human urine activated P2, demonstrating that tcpC might be induced in the urinary tract of infected patients. We conclude that P2, consisting of a 240 bp region 5′ of the tcpC gene, represents the major regulator of tcpC expression.
Affiliation(s)
- Julia Ittensohn
- Medical Faculty of Mannheim, Institute of Medical Microbiology and Hygiene, University of Heidelberg, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; (J.I.); (J.H.); (H.G.); (M.K.); (S.A.)
- Medical Faculty of Mannheim, Mannheim Institute for Innate Immunoscience (MI3), University of Heidelberg, Ludolf-Krehl-Str. 13-17, 68167 Mannheim, Germany
| | - Jacqueline Hemberger
- Medical Faculty of Mannheim, Institute of Medical Microbiology and Hygiene, University of Heidelberg, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; (J.I.); (J.H.); (H.G.); (M.K.); (S.A.)
- Medical Faculty of Mannheim, Mannheim Institute for Innate Immunoscience (MI3), University of Heidelberg, Ludolf-Krehl-Str. 13-17, 68167 Mannheim, Germany
| | - Hannah Griffiths
- Medical Faculty of Mannheim, Institute of Medical Microbiology and Hygiene, University of Heidelberg, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; (J.I.); (J.H.); (H.G.); (M.K.); (S.A.)
- Medical Faculty of Mannheim, Mannheim Institute for Innate Immunoscience (MI3), University of Heidelberg, Ludolf-Krehl-Str. 13-17, 68167 Mannheim, Germany
| | - Maren Keller
- Medical Faculty of Mannheim, Institute of Medical Microbiology and Hygiene, University of Heidelberg, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; (J.I.); (J.H.); (H.G.); (M.K.); (S.A.)
- Medical Faculty of Mannheim, Mannheim Institute for Innate Immunoscience (MI3), University of Heidelberg, Ludolf-Krehl-Str. 13-17, 68167 Mannheim, Germany
| | - Simone Albrecht
- Medical Faculty of Mannheim, Institute of Medical Microbiology and Hygiene, University of Heidelberg, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; (J.I.); (J.H.); (H.G.); (M.K.); (S.A.)
- Medical Faculty of Mannheim, Mannheim Institute for Innate Immunoscience (MI3), University of Heidelberg, Ludolf-Krehl-Str. 13-17, 68167 Mannheim, Germany
| | - Thomas Miethke
- Medical Faculty of Mannheim, Institute of Medical Microbiology and Hygiene, University of Heidelberg, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; (J.I.); (J.H.); (H.G.); (M.K.); (S.A.)
- Medical Faculty of Mannheim, Mannheim Institute for Innate Immunoscience (MI3), University of Heidelberg, Ludolf-Krehl-Str. 13-17, 68167 Mannheim, Germany
|
25
|
Shemyakina AO, Grechishnikova EG, Novikov AD, Asachenko AF, Kalinina TI, Lavrov KV, Yanenko AS. A Set of Active Promoters with Different Activity Profiles for Superexpressing Rhodococcus Strain. ACS Synth Biol 2021; 10:515-530. [PMID: 33605147 DOI: 10.1021/acssynbio.0c00508] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
Abstract
Rhodococcus bacteria are a promising platform for biodegradation, biocatalysis, and biosynthesis, but the use of rhodococci is hampered by the insufficient number of both platform strains for expression and promoters that are functional and thoroughly studied in these strains. To expand the list of such strains and promoters, we studied the expression capability of the Rhodococcus rhodochrous M33 strain and the functioning of a set of recombinant promoters in it. We showed that the strain supports superexpression of the target enzyme (nitrile hydratase) using alternative inexpensive feedings (acetate and urea) without growth factor supplementation, thus being a suitable expression platform. The promoter set included Ptuf (elongation factor Tu) and Psod (superoxide dismutase) from Corynebacterium glutamicum ATCC13032, Pcpi (isocitrate lyase) from Rhodococcus erythropolis PR4, and Pnh (nitrile hydratase) from R. rhodochrous M8. Activity levels, regulation possibilities, and growth-phase-dependent activity profiles of these promoters were studied in derivatives of the M33 strain. The activities of the promoters were significantly different (Pcpi < Psod ≪ Ptuf < Pnh), covering a 10³-fold range, and the most active promoters, Pnh and Ptuf, produced target protein amounting to up to 30-50% of soluble intracellular proteins. On the basis of the mRNA quantification and amount of target protein, the production level of Pnh was positioned close to the theoretical upper limit of expression in a bacterial cell. A selection method for the laboratory evolution of such active promoters directly in Rhodococcus was also proposed. Concerning regulation, Ptuf could not be regulated (2-fold change), while the others were tunable (6-fold for Psod, 79-fold for Pnh, and 44-fold for Pcpi). The promoters possessed four different activity profiles, including three with a peak of activity at different growth phases and one with constant activity throughout the growth phases. Ptuf and Pcpi did not change their activity profile under different growth conditions, whereas the Psod and Pnh profiles changed depending on the growth media. The results allow flexible construction of Rhodococcus strains using the studied promoters, and demonstrate a valuable approach for complex characterization of promoters intended for biotechnological strain construction.
Affiliation(s)
- Anna O. Shemyakina
- NRC Kurchatov Institute-Gosniigenetika, Kurchatov Genomic Center, 1st Dorojny pr. 1, Moscow, 117545, Russia
- NRC Kurchatov Institute, Akademika Kurchatova pl. 1, Moscow, 123182, Russia
| | - Elena G. Grechishnikova
- NRC Kurchatov Institute-Gosniigenetika, Kurchatov Genomic Center, 1st Dorojny pr. 1, Moscow, 117545, Russia
- NRC Kurchatov Institute, Akademika Kurchatova pl. 1, Moscow, 123182, Russia
| | - Andrey D. Novikov
- NRC Kurchatov Institute-Gosniigenetika, Kurchatov Genomic Center, 1st Dorojny pr. 1, Moscow, 117545, Russia
- NRC Kurchatov Institute, Akademika Kurchatova pl. 1, Moscow, 123182, Russia
| | - Andrey F. Asachenko
- A. V. Topchiev Institute of Petrochemical Synthesis of Russian Academy of Sciences, Leninsky prospect 29, Moscow, 119991, Russia
| | - Tatyana I. Kalinina
- NRC Kurchatov Institute-Gosniigenetika, Kurchatov Genomic Center, 1st Dorojny pr. 1, Moscow, 117545, Russia
- NRC Kurchatov Institute, Akademika Kurchatova pl. 1, Moscow, 123182, Russia
| | - Konstantin V. Lavrov
- NRC Kurchatov Institute-Gosniigenetika, Kurchatov Genomic Center, 1st Dorojny pr. 1, Moscow, 117545, Russia
- NRC Kurchatov Institute, Akademika Kurchatova pl. 1, Moscow, 123182, Russia
| | - Alexander S. Yanenko
- NRC Kurchatov Institute-Gosniigenetika, Kurchatov Genomic Center, 1st Dorojny pr. 1, Moscow, 117545, Russia
- NRC Kurchatov Institute, Akademika Kurchatova pl. 1, Moscow, 123182, Russia
|