1
Cronan JE. Unsaturated fatty acid synthesis in bacteria: Mechanisms and regulation of canonical and remarkably noncanonical pathways. Biochimie 2024; 218:137-151. [PMID: 37683993 PMCID: PMC10915108 DOI: 10.1016/j.biochi.2023.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 08/02/2023] [Accepted: 09/04/2023] [Indexed: 09/10/2023]
Abstract
Unsaturated phospholipid acyl chains are required for membrane function in most bacteria. The double bonds of the cis monoenoic chains arise by two distinct pathways depending on whether oxygen is required. The oxygen-independent pathway (traditionally called the anaerobic pathway) introduces the cis double bond by isomerization of the trans double bond intermediate of the fatty acid elongation cycle. Double bond isomerization occurs at an intermediate chain length (e.g., C10) and the isomerization product is elongated to the C16-C18 chains that become phospholipid monoenoic acyl chains. This pathway was first delineated in Escherichia coli and became the paradigm pathway. However, studies of other bacteria show deviations from this paradigm, the most exceptional being reversal of the fatty acid elongation cycle by a reaction paralleling the initial step in the β-oxidative degradation of fatty acids. In the oxygen-dependent pathway, diiron enzymes called desaturases introduce a double bond into a saturated acyl chain by regioselective cis dehydrogenation through activation of molecular oxygen with an active-site diiron cluster. This difficult hydrogen abstraction from a methylene group often occurs at the midpoint of a saturated fatty acyl chain. In bacteria the acyl chain is a phospholipid acyl chain, and the desaturase is membrane bound. Both the oxygen-independent and oxygen-dependent pathways are transcriptionally regulated by repressor and activator proteins that respond to small molecule ligands such as acyl-CoAs. However, in Bacillus subtilis the desaturase is synthesized only at low growth temperatures, a process controlled by a signal transduction regulatory pathway dependent on membrane lipid properties.
Affiliation(s)
- John E Cronan
- Departments of Microbiology and Biochemistry, University of Illinois, Urbana, 61801, USA.
2
Tognon M, Giugno R, Pinello L. A survey on algorithms to characterize transcription factor binding sites. Brief Bioinform 2023; 24:bbad156. [PMID: 37099664 PMCID: PMC10422928 DOI: 10.1093/bib/bbad156] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/13/2023] [Revised: 03/27/2023] [Accepted: 04/01/2023] [Indexed: 04/28/2023] Open
Abstract
Transcription factors (TFs) are key regulatory proteins that control the transcriptional rate of cells by binding short DNA sequences called transcription factor binding sites (TFBS) or motifs. Identifying and characterizing TFBS is fundamental to understanding the regulatory mechanisms governing the transcriptional state of cells. During the last decades, several experimental methods have been developed to recover DNA sequences containing TFBS. In parallel, computational methods have been proposed to discover and identify TFBS motifs based on these DNA sequences. This is one of the most widely investigated problems in bioinformatics and is referred to as the motif discovery problem. In this manuscript, we review classical and novel experimental and computational methods developed to discover and characterize TFBS motifs in DNA sequences, highlighting their advantages and drawbacks. We also discuss open challenges and future perspectives that could fill the remaining gaps in the field.
Affiliation(s)
- Manuel Tognon
- Computer Science Department, University of Verona, Verona, Italy
- Molecular Pathology Unit, Center for Computational and Integrative Biology and Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Rosalba Giugno
- Computer Science Department, University of Verona, Verona, Italy
| | - Luca Pinello
- Molecular Pathology Unit, Center for Computational and Integrative Biology and Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Pathology, Harvard Medical School, Boston, Massachusetts, United States of America
3
Shao S, Zhang Y, Yin K, Zhang Y, Wei L, Wang Q. FabR senses long-chain unsaturated fatty acids to control virulence in pathogen Edwardsiella piscicida. Mol Microbiol 2022; 117:737-753. [PMID: 34932231 DOI: 10.1111/mmi.14869] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/23/2021] [Revised: 12/11/2021] [Accepted: 12/18/2021] [Indexed: 11/28/2022]
Abstract
Long-chain unsaturated fatty acids (UFAs) can serve as nutrient sources or building blocks for bacterial membranes. However, little is known about how UFAs may be incorporated into the virulence programs of pathogens. A previous investigation identified FabR as a positive regulator of virulence gene expression in Edwardsiella piscicida. Here, chromatin immunoprecipitation-sequencing coupled with RNA-seq analyses revealed that 10 genes were under the direct control of FabR, including fabA, fabB, and cfa, which modulate the composition of UFAs. The binding of FabR to its target DNA was facilitated by oleoyl-CoA and inhibited by stearoyl-CoA. In addition, electrophoretic mobility shift assays and DNase I footprinting with wild-type FabR and a null mutant (F131A) demonstrated crucial roles of FabR in binding to the promoters of fabA, fabB, and cfa. Moreover, FabR also binds to the promoter region of the virulence regulator esrB for its activation, facilitating the expression of the type III secretion system (T3SS) in response to UFAs. Furthermore, FabR coordinated with RpoS to modulate the expression of T3SS. Collectively, our results elucidate the molecular machinery of FabR regulating bacterial fatty acid composition and virulence in enteric pathogens, further expanding our knowledge of its crucial role in host-pathogen interactions.
Affiliation(s)
- Shuai Shao
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, China
| | - Yi Zhang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, China
| | - Kaiyu Yin
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, China
| | - Yuanxing Zhang
- Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
- Shanghai Engineering Research Center of Maricultured Animal Vaccines, Shanghai, China
| | - Lifan Wei
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, China
- Department of Endodontics and Operative Dentistry, Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Qiyao Wang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, China
- Shanghai Engineering Research Center of Maricultured Animal Vaccines, Shanghai, China
4
Bacterial Homologs of Progestin and AdipoQ Receptors (PAQRs) Affect Membrane Energetics Homeostasis but Not Fluidity. J Bacteriol 2022; 204:e0058321. [PMID: 35285724 PMCID: PMC9017321 DOI: 10.1128/jb.00583-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Membrane potential homeostasis is essential for cell survival. Defects in membrane potential lead to pleiotropic phenotypes, consistent with the central role of membrane energetics in cell physiology. Homologs of the progestin and AdipoQ receptors (PAQRs) are conserved in multiple phyla of Bacteria and Eukarya. In eukaryotes, PAQRs are proposed to modulate membrane fluidity and fatty acid (FA) metabolism. The role of bacterial homologs has not been elucidated. Here, we use Escherichia coli and Bacillus subtilis to show that bacterial PAQR homologs, which we name "TrhA," have a role in membrane energetics homeostasis. Using transcriptional fusions, we show that E. coli TrhA (encoded by yqfA) is part of the unsaturated fatty acid biosynthesis regulon. Fatty acid analyses and physiological assays show that a lack of TrhA in both E. coli and B. subtilis (encoded by yplQ) provokes subtle but consistent changes in membrane fatty acid profiles that do not translate to control of membrane fluidity. Instead, membrane proteomics in E. coli suggested a disrupted energy metabolism and dysregulated membrane energetics in the mutant, though it grew similarly to its parent. These changes translated into a disturbed membrane potential in the mutant relative to its parent under various growth conditions. Similar dysregulation of membrane energetics was observed in a different E. coli strain and in the distantly related B. subtilis. Together, our findings are consistent with a role for TrhA in membrane energetics homeostasis, through a mechanism that remains to be elucidated. IMPORTANCE Eukaryotic homologs of the progestin and AdipoQ receptor family (PAQR) have been shown to regulate membrane fluidity by affecting, through unknown mechanisms, unsaturated fatty acid (FA) metabolism. The bacterial homologs studied here mediate small and consistent changes in unsaturated FA metabolism that do not seem to impact membrane fluidity but, rather, alter membrane energetics homeostasis. Together, the findings here suggest that bacterial and eukaryotic PAQRs share functions in maintaining membrane homeostasis (fluidity in eukaryotes and energetics for bacteria with TrhA homologs).
5
Adams FG, Pokhrel A, Brazel EB, Semenec L, Li L, Trappetti C, Paton JC, Cain AK, Paulsen IT, Eijkelkamp BA. Acinetobacter baumannii Fatty Acid Desaturases Facilitate Survival in Distinct Environments. ACS Infect Dis 2021; 7:2221-2228. [PMID: 34100578 DOI: 10.1021/acsinfecdis.1c00192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
Maintaining optimal fluidity is essential to ensure adequate membrane structure and function under different environmental conditions. We apply integrated molecular approaches to characterize two desaturases (DesA and DesB) and define their specific roles in unsaturated fatty acid (UFA) production in Acinetobacter baumannii. Using a murine model, we reveal DesA to play a minor role in colonization of the respiratory tract, whereas DesB is important during invasive disease. Furthermore, using transcriptomic and bioinformatic analyses, a global regulator involved in fatty acid homeostasis and members of its regulon are characterized. Collectively, we show that DesA and DesB are primary contributors to UFA production in A. baumannii with infection studies illustrating that these distinct desaturases aid in the bacterium's ability to survive in multiple host niches. Hence, this study provides novel insights into the fundamentals of A. baumannii lipid biology, which contributes to the versatility of this critical bacterial pathogen.
Affiliation(s)
- Felise G. Adams
- Molecular Sciences and Technology, College of Science and Engineering, Flinders University, Bedford Park, South Australia 5042, Australia
| | - Alaska Pokhrel
- Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Erin B. Brazel
- Research Centre for Infectious Diseases, Department of Molecular and Biomedical Science, University of Adelaide, Adelaide, South Australia 5005, Australia
| | - Lucie Semenec
- Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Liping Li
- Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Claudia Trappetti
- Research Centre for Infectious Diseases, Department of Molecular and Biomedical Science, University of Adelaide, Adelaide, South Australia 5005, Australia
| | - James C. Paton
- Research Centre for Infectious Diseases, Department of Molecular and Biomedical Science, University of Adelaide, Adelaide, South Australia 5005, Australia
| | - Amy K. Cain
- Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Ian T. Paulsen
- Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Bart A. Eijkelkamp
- Molecular Sciences and Technology, College of Science and Engineering, Flinders University, Bedford Park, South Australia 5042, Australia
6
Long P, Zhang L, Huang B, Chen Q, Liu H. Integrating genome sequence and structural data for statistical learning to predict transcription factor binding sites. Nucleic Acids Res 2020; 48:12604-12617. [PMID: 33264415 PMCID: PMC7736823 DOI: 10.1093/nar/gkaa1134] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/22/2020] [Revised: 09/18/2020] [Accepted: 11/10/2020] [Indexed: 01/11/2023] Open
Abstract
We report an approach to predict DNA specificity of the tetracycline repressor (TetR) family transcription regulators (TFRs). First, a genome sequence-based method was streamlined with quantitative P-values defined to filter out reliable predictions. Then, a framework was introduced to incorporate structural data and to train a statistical energy function to score the pairing between TFR and TFR binding site (TFBS) based on sequences. When the predictions were benchmarked against experiments, TFBSs for 29 out of 30 TFRs were correctly predicted by either the genome sequence-based or the statistical energy-based method. Using P-values or Z-scores as indicators, we estimate that 59.6% of TFRs are covered with relatively reliable predictions by at least one of the two methods, while only 28.7% are covered by the genome sequence-based method alone. Our approach predicts a large number of new TFBSs which cannot be correctly retrieved from public databases such as FootprintDB. High-throughput experimental assays suggest that the statistical energy can model the TFBSs of a significant number of TFRs reliably. Thus the energy function may be applied to explore for new TFBSs in respective genomes. It is possible to extend our approach to other transcription factor families with sufficient structural information.
Affiliation(s)
- Pengpeng Long
- School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Lu Zhang
- School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Bin Huang
- School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Quan Chen
- School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hefei National Laboratory for Physical Sciences at the Microscale, Hefei, Anhui 230026, China
| | - Haiyan Liu
- School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hefei National Laboratory for Physical Sciences at the Microscale, Hefei, Anhui 230026, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui 230026, China
7
Taboada-Castro H, Castro-Mondragón JA, Aguilar-Vera A, Hernández-Álvarez AJ, van Helden J, Encarnación-Guevara S. RhizoBindingSites, a Database of DNA-Binding Motifs in Nitrogen-Fixing Bacteria Inferred Using a Footprint Discovery Approach. Front Microbiol 2020; 11:567471. [PMID: 33250866 PMCID: PMC7674921 DOI: 10.3389/fmicb.2020.567471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2020] [Accepted: 10/13/2020] [Indexed: 11/30/2022] Open
Abstract
Basic knowledge of transcriptional regulation is needed to understand the mechanisms governing biological processes, e.g., nitrogen fixation by Rhizobiales bacteria in symbiosis with leguminous plants. The RhizoBindingSites database is a computer-assisted framework providing motif-gene-associated conserved sequences potentially implicated in transcriptional regulation in nine symbiotic species. A dyad analysis algorithm was used to deduce motifs in the upstream regulatory region of orthologous genes, and only motifs also located in the gene seed promoter with a p-value of 1e-4 were accepted. A genomic scan analysis of the upstream sequences with these motifs was performed. These predicted binding sites were categorized according to low, medium and high homology between the matrix and the upstream regulatory sequence. On average, 62.7% of the genes had a motif, accounting for 80.44% of the genes per genome, with 19613 matrices (a matrix is a representation of a motif). The RhizoBindingSites database provides motif and gene information, motif conservation in the order Rhizobiales, matrices, motif logos, regulatory networks constructed from theoretical or experimental data, a criterion for selecting motifs and a guide for users. The RhizoBindingSites database is freely available online at rhizobindingsites.ccg.unam.mx.
Affiliation(s)
| | | | - Alejandro Aguilar-Vera
- Center for Genomic Sciences, National Autonomous University of Mexico, Cuernavaca, Mexico
| | | | - Jacques van Helden
- CNRS, IFB-core, UMS 3601, Institut Français de Bioinformatique, Évry, France.,Laboratoire Theory and Approaches of Genome Complexity (TAGC), Inserm, Aix-Marseille Univ, Marseille, France
8
Gao R, Li D, Lin Y, Lin J, Xia X, Wang H, Bi L, Zhu J, Hassan B, Wang S, Feng Y. Structural and Functional Characterization of the FadR Regulatory Protein from Vibrio alginolyticus. Front Cell Infect Microbiol 2017; 7:513. [PMID: 29312893 PMCID: PMC5733061 DOI: 10.3389/fcimb.2017.00513] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 10/11/2017] [Accepted: 11/29/2017] [Indexed: 02/03/2023] Open
Abstract
The structure of Vibrio cholerae FadR (VcFadR) complexed with the ligand oleoyl-CoA suggests an additional ligand-binding site. However, fatty acid metabolism and its regulation are poorly understood in Vibrio alginolyticus, a species closely related to V. cholerae. Here, we report crystal structures of V. alginolyticus FadR (ValFadR) alone and in complex with palmitoyl-CoA, a long-chain fatty acyl ligand different from the oleoyl-CoA bound by VcFadR. Structural comparison indicates that both VcFadR and ValFadR have an additional ligand-binding site (called site 2), which leads to a more dramatic conformational change of the DNA-binding domain than that of E. coli FadR (EcFadR). Isothermal titration calorimetry (ITC) analyses show that the ligand-binding pattern of ValFadR (2:1) is distinct from that of EcFadR (1:1). Together with surface plasmon resonance (SPR), electrophoretic mobility shift assays (EMSA) demonstrate that ValFadR binds fabA, an important gene of unsaturated fatty acid (UFA) synthesis. Removal of fadR from V. cholerae attenuates fabA transcription and results in an imbalance of UFA/SFA incorporated into membrane phospholipids. Genetic complementation with a mutant version of fadR (Δ42, 136-177) lacking site 2 cannot restore the defective phenotypes of ΔfadR, whereas the wild-type fadR gene and addition of exogenous oleate can restore them. Mouse experiments reveal that VcFadR and its site 2 have roles in bacterial colonization. Together, the results might represent an additional example that illustrates Vibrio FadR-mediated lipid regulation and its role in pathogenesis.
Affiliation(s)
- Rongsui Gao
- Department of Medical Microbiology and Parasitology, Zhejiang University School of Medicine, Hangzhou, China
| | - Defeng Li
- Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Yuan Lin
- School of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Jingxia Lin
- Department of Medical Microbiology and Parasitology, Zhejiang University School of Medicine, Hangzhou, China
| | - Xiaoyun Xia
- Department of Microbiology, Nanjing Agricultural University, Nanjing, China
| | - Hui Wang
- Department of Microbiology, Nanjing Agricultural University, Nanjing, China
| | - Lijun Bi
- Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Jun Zhu
- Department of Microbiology, Nanjing Agricultural University, Nanjing, China.,Department of Microbiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Bachar Hassan
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
| | - Shihua Wang
- School of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Youjun Feng
- Department of Medical Microbiology and Parasitology, Zhejiang University School of Medicine, Hangzhou, China
9
Lee M, Um H, Van Dyke MW. Identification and characterization of preferred DNA-binding sites for the Thermus thermophilus transcriptional regulator FadR. PLoS One 2017; 12:e0184796. [PMID: 28902898 PMCID: PMC5597230 DOI: 10.1371/journal.pone.0184796] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/07/2017] [Accepted: 08/31/2017] [Indexed: 11/18/2022] Open
Abstract
One of the primary transcriptional regulators of fatty acid homeostasis in many prokaryotes is the protein FadR. To better understand its biological function in the extreme thermophile Thermus thermophilus HB8, we sought to first determine its preferred DNA-binding sequences in vitro using the combinatorial selection method Restriction Endonuclease Protection, Selection, and Amplification (REPSA) and then use this information to bioinformatically identify potential regulated genes. REPSA determined a consensus FadR-binding sequence 5´-TTRNACYNRGTNYAA-3´, which was further characterized using quantitative electrophoretic mobility shift assays. With this information, a search of the T. thermophilus HB8 genome found multiple operons potentially regulated by FadR. Several of these were identified as encoding proteins involved in fatty acid biosynthesis and degradation; however, others were novel and not previously identified as targets of FadR. The role of FadR in regulating these genes was validated by physical and functional methods, as well as comparative genomic approaches to further characterize regulons in related organisms. Taken together, our study demonstrates that a systematic approach involving REPSA, biophysical characterization of protein-DNA binding, and bioinformatics can be used to postulate biological roles for potential transcriptional regulators.
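As an illustration of the genome-search step described in this abstract, the following Python sketch scans a DNA string for exact matches to the reported consensus 5´-TTRNACYNRGTNYAA-3´ on both strands. It is a minimal sketch and not the authors' pipeline: the function names and the toy sequence are hypothetical, and exact IUPAC matching is assumed here in place of whatever scoring the study actually used. Only the consensus string comes from the abstract.

```python
import re

# Standard IUPAC nucleotide codes appearing in the consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}

CONSENSUS = "TTRNACYNRGTNYAA"  # consensus reported in the abstract


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus into a regex (hypothetical helper).

    The lookahead wrapper lets finditer report overlapping matches.
    """
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, consensus=CONSENSUS):
    """Return sorted (start, strand) pairs; starts are 0-based on the given strand."""
    pattern = consensus_to_regex(consensus)
    seq = seq.upper()
    n, k = len(seq), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Map reverse-strand hits back to forward-strand coordinates.
    rc = reverse_complement(seq)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Toy sequence (hypothetical): one consensus-matching site with short flanks.
print(find_sites("GGGTTGAACCAGGTACAACCC"))  # [(3, '+'), (3, '-')]
```

The toy site is reported on both strands because the consensus is nearly palindromic, so a strict match on one strand often matches on the other as well. A real search would more likely use a position weight matrix with a score threshold than exact matching.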
Affiliation(s)
- Minwoo Lee
- Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, Georgia, United States of America
| | - Hyejin Um
- Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, Georgia, United States of America
| | - Michael W. Van Dyke
- Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, Georgia, United States of America
10
Fu H, Zhang X. Noncoding Variants Functional Prioritization Methods Based on Predicted Regulatory Factor Binding Sites. Curr Genomics 2017; 18:322-331. [PMID: 29081688 PMCID: PMC5635616 DOI: 10.2174/1389202918666170228143619] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/01/2016] [Revised: 10/16/2016] [Accepted: 11/02/2016] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND With the advent of the post-genomic era, research into the genetic mechanisms of disease has come to depend increasingly on studies of genes, gene networks and gene-protein interaction networks. To explore gene expression and regulation, researchers have carried out many studies on transcription factors and their binding sites (TFBSs). Based on the large number of TFBS values predicted by deep learning models, further computation and analysis have been done to reveal the relationship between gene mutation and the occurrence of disease. Deep learning methods have been shown to outperform conventional methods in predicting the functions of noncoding variants. Research on predicting the functions of single nucleotide polymorphisms (SNPs) is expected to uncover the mechanisms by which gene mutations affect human traits and diseases. RESULTS We reviewed the conventional TFBS identification methods from different perspectives. For the deep learning methods used to predict TFBSs, we discussed related problems such as raw data preprocessing, the structural design of the deep convolutional neural network (CNN) and the measurement of model performance. We then summarized the techniques usually used to find functional noncoding variants from de novo sequence. CONCLUSION With the rapid development of high-throughput assays, more sample data and chromatin features should help improve the prediction accuracy of deep convolutional neural networks for TFBS identification. Meanwhile, gaining more insight into the deep CNN framework itself has proved useful both for improving model performance and for developing designs better suited to the sample data. Based on the feature values predicted by the deep CNN model, a prioritization model for functional noncoding variants would help reveal the effect of gene mutation on disease.
Affiliation(s)
- Haoyue Fu
- College of Sciences, Northeastern University, Shenyang, China
| | - Lianping Yang
- College of Sciences, Northeastern University, Shenyang, China
- University of Southern California, Dept. Biol. Sci., Program Mol & Computat Biol, USA
| | - Xiangde Zhang
- College of Sciences, Northeastern University, Shenyang, China
11
Tramonti A, Milano T, Nardella C, di Salvo ML, Pascarella S, Contestabile R. Salmonella typhimurium PtsJ is a novel MocR-like transcriptional repressor involved in regulating the vitamin B6 salvage pathway. FEBS J 2017; 284:466-484. [PMID: 27987384 DOI: 10.1111/febs.13994] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/27/2016] [Revised: 12/09/2016] [Accepted: 12/13/2016] [Indexed: 12/11/2022]
Abstract
The vitamin B6 salvage pathway, involving pyridoxine 5'-phosphate oxidase (PNPOx) and pyridoxal kinase (PLK), recycles B6 vitamers from nutrients and protein turnover to produce pyridoxal 5'-phosphate (PLP), the catalytically active form of the vitamin. Regulation of this pathway, widespread in living organisms including humans and many bacteria, is very important to vitamin B6 homeostasis but poorly understood. Although some information is available on the enzymatic regulation of PNPOx and PLK, little is known on their regulation at the transcriptional level. In the present work, we identified a new MocR-like regulator, PtsJ from Salmonella typhimurium, which controls the expression of the pdxK gene encoding one of the two PLKs expressed in this organism (PLK1). Analysis of pdxK expression in a ptsJ knockout strain demonstrated that PtsJ acts as a transcriptional repressor. This is the first case of a MocR-like regulator acting as repressor of its target gene. Expression and purification of PtsJ allowed a detailed characterisation of its effector and DNA-binding properties. PLP is the only B6 vitamer acting as effector molecule for PtsJ. A DNA-binding region composed of four repeated nucleotide sequences is responsible for binding of PtsJ to its target promoter. Analysis of binding stoichiometry revealed that protein subunits/DNA molar ratio varies from 4 : 1 to 2 : 1, depending on the presence or absence of PLP. Structural characteristics of DNA transcriptional factor-binding sites suggest that PtsJ binds DNA according to a different model with respect to other characterised members of the MocR subgroup.
Affiliation(s)
- Angela Tramonti
- Istituto di Biologia e Patologia Molecolari, Consiglio Nazionale delle Ricerche, Rome, Italy.,Dipartimento di Scienze Biochimiche 'A. Rossi Fanelli', Sapienza Università di Roma, Italy
| | - Teresa Milano
- Dipartimento di Scienze Biochimiche 'A. Rossi Fanelli', Sapienza Università di Roma, Italy
| | - Caterina Nardella
- Dipartimento di Scienze Biochimiche 'A. Rossi Fanelli', Sapienza Università di Roma, Italy
| | - Martino L di Salvo
- Dipartimento di Scienze Biochimiche 'A. Rossi Fanelli', Sapienza Università di Roma, Italy
| | - Stefano Pascarella
- Dipartimento di Scienze Biochimiche 'A. Rossi Fanelli', Sapienza Università di Roma, Italy
| | - Roberto Contestabile
- Dipartimento di Scienze Biochimiche 'A. Rossi Fanelli', Sapienza Università di Roma, Italy
12
Kılıç S, Erill I. Assessment of transfer methods for comparative genomics of regulatory networks in bacteria. BMC Bioinformatics 2016; 17 Suppl 8:277. [PMID: 27586594 PMCID: PMC5009822 DOI: 10.1186/s12859-016-1113-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/22/2022] Open
Abstract
Background Comparative genomics can leverage the vast amount of available genomic sequences to reconstruct and analyze transcriptional regulatory networks in Bacteria, but the efficacy of this approach hinges on the ability to transfer regulatory network information from reference species to the genomes under analysis. Several methods have been proposed to transfer regulatory information between bacterial species, but the paucity and distributed nature of experimental information on bacterial transcriptional networks have prevented their systematic evaluation. Results We report the compilation of a large catalog of transcription factor-binding sites across Bacteria and its use to systematically benchmark proposed transfer methods across pairs of bacterial species. We evaluate motif- and accuracy-based metrics to assess the results of regulatory network transfer and we identify the precision-recall area-under-the-curve as the best metric for this purpose due to the large class-imbalanced nature of the problem. Methods assuming conservation of the transcription factor-binding motif (motif-based) are shown to substantially outperform those assuming conservation of regulon composition (network-based), even though their efficiency can decrease sharply with increasing phylogenetic distance. Variations of the basic motif-based transfer method do not yield significant improvements in transfer accuracy. Our results indicate that detection of a large enough number of regulated orthologs is critical for network-based transfer methods, but that relaxing orthology requirements does not improve results. Using the transcriptional regulators LexA and Fur as case examples, we also show how DNA-binding domain sequence similarity can yield confounding results as an indicator of transfer efficiency for motif-based methods. 
Conclusions Counter to standard practice, our evaluation of metrics to assess the efficiency of methods for regulatory network information transfer reveals that the area under precision-recall (PR) curves is a more precise and informative metric than that of receiver-operating-characteristic (ROC) curves, confirming similar findings in other class-imbalanced settings. Our systematic assessment of transfer methods reveals that simple approaches to both motif- and network-based transfer of regulatory information provide equal or better results than more elaborate methods. We also show that there are no effective predictors of transfer efficacy, substantiating the long-standing practice of manual curation in comparative genomics analyses. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1113-7) contains supplementary material, which is available to authorized users.
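The abstract's point about class imbalance can be illustrated with a minimal sketch. It is not the authors' benchmark: the toy data (990 non-sites, 10 true sites, and their scores) and the function names `roc_auc` and `average_precision` are assumptions made for illustration, and average precision is used here as a simple step-wise estimate of the area under the PR curve.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean precision at the rank of each true positive (step-wise PR AUC estimate)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 990 candidate positions that are not binding sites
# and 10 true sites. Every true site is outscored by the same 50 non-sites.
neg_scores = [i / 1000 for i in range(990)]
pos_scores = [0.9395 + k * 1e-5 for k in range(10)]
labels = [0] * 990 + [1] * 10
scores = neg_scores + pos_scores

print(f"ROC AUC:           {roc_auc(labels, scores):.3f}")
print(f"Average precision: {average_precision(labels, scores):.3f}")
```

In this toy case the ROC AUC stays close to 0.95 while the average precision falls below 0.1, because 50 false positives rank ahead of only 10 true sites. The ROC curve barely registers those false positives against 990 negatives, whereas precision does.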
Affiliation(s)
- Sefa Kılıç
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), Baltimore, MD, 21250, USA
| | - Ivan Erill
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), Baltimore, MD, 21250, USA.
| |
|
13
|
Tsoy OV, Ravcheev DA, Čuklina J, Gelfand MS. Nitrogen Fixation and Molecular Oxygen: Comparative Genomic Reconstruction of Transcription Regulation in Alphaproteobacteria. Front Microbiol 2016; 7:1343. [PMID: 27617010 PMCID: PMC4999443 DOI: 10.3389/fmicb.2016.01343] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 03/09/2016] [Accepted: 08/15/2016] [Indexed: 11/13/2022]
Abstract
Biological nitrogen fixation plays a crucial role in the nitrogen cycle. An ability to fix atmospheric nitrogen, reducing it to ammonium, was described for multiple species of Bacteria and Archaea. The transcriptional regulatory network for nitrogen fixation was extensively studied in several representatives of the class Alphaproteobacteria. This regulatory network includes the activator of nitrogen fixation NifA, working in tandem with the alternative sigma-factor RpoN as well as oxygen-responsive regulatory systems, one-component regulators FnrN/FixK and two-component system FixLJ. Here we used a comparative genomics approach for in silico study of the transcriptional regulatory network in 50 genomes of Alphaproteobacteria. We extended the known regulons and proposed the scenario for the evolution of the nitrogen fixation transcriptional network. The reconstructed network substantially expands the existing knowledge of transcriptional regulation in nitrogen-fixing microorganisms and can be used for genetic experiments, metabolic reconstruction, and evolutionary analysis.
Affiliation(s)
- Olga V Tsoy
- Research and Training Center on Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
| | - Dmitry A Ravcheev
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Jelena Čuklina
- Research and Training Center on Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia
| | - Mikhail S Gelfand
- Research and Training Center on Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Russia; Faculty of Computer Science, Higher School of Economics, Moscow, Russia
| |
|
14
|
Liu B, Zhang H, Zhou C, Li G, Fennell A, Wang G, Kang Y, Liu Q, Ma Q. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes. BMC Genomics 2016; 17:578. [PMID: 27507169 PMCID: PMC4977642 DOI: 10.1186/s12864-016-2982-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 04/05/2016] [Accepted: 07/29/2016] [Indexed: 11/10/2022]
Abstract
Background Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Results Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP3). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP3 consistently outperformed other popular motif finding tools. We have integrated MP3 into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. Conclusion The performance evaluation indicated that MP3 is effective for predicting regulatory motifs in prokaryotic genomes. 
Its application may enhance progress in elucidating transcription regulation mechanisms, thus benefiting the genomic research community and prokaryotic genome researchers in particular. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2982-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Bingqiang Liu
- School of Mathematics, Shandong University, Jinan, 250100, China
| | - Hanyuan Zhang
- Systems Biology and Biomedical Informatics (SBBI) Laboratory, University of Nebraska-Lincoln, Lincoln, NE, 68588-0115, USA
| | - Chuan Zhou
- School of Mathematics, Shandong University, Jinan, 250100, China
| | - Guojun Li
- School of Mathematics, Shandong University, Jinan, 250100, China
| | - Anne Fennell
- Department of Agronomy, Horticulture, and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
| | - Guanghui Wang
- School of Mathematics, Shandong University, Jinan, 250100, China
| | - Yu Kang
- CAS Key Laboratory of Genome Sciences and information, Beijing Institute of Genomics of CAS, Beijing, 100101, People's Republic of China
| | - Qi Liu
- Department of Bioinformatics, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Qin Ma
- Department of Agronomy, Horticulture, and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
| |
|
15
|
A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins. Biochem Res Int 2016; 2016:4360285. [PMID: 27446613 PMCID: PMC4944035 DOI: 10.1155/2016/4360285] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 11/27/2015] [Accepted: 05/26/2016] [Indexed: 01/30/2023]
Abstract
The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5′-phosphate (PLP) dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups.
|
16
|
FabR regulates Salmonella biofilm formation via its direct target FabB. BMC Genomics 2016; 17:253. [PMID: 27004424 PMCID: PMC4804515 DOI: 10.1186/s12864-016-2387-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/21/2015] [Accepted: 01/08/2016] [Indexed: 12/02/2022]
Abstract
Background Biofilm formation is an important survival strategy of Salmonella in all environments. By mutant screening, we showed a knock-out mutant of fabR, encoding a repressor of unsaturated fatty acid biosynthesis (UFA), to have impaired biofilm formation. In order to unravel how this regulator impinges on Salmonella biofilm formation, we aimed at elucidating the S. Typhimurium FabR regulon. Hereto, we applied a combinatorial high-throughput approach, combining ChIP-chip with transcriptomics. Results All the previously identified E. coli FabR transcriptional target genes (fabA, fabB and yqfA) were shown to be direct S. Typhimurium FabR targets as well. As we found a fabB overexpressing strain to partly mimic the biofilm defect of the fabR mutant, the effect of FabR on biofilms can be attributed at least partly to FabB, which plays a key role in UFA biosynthesis. Additionally, ChIP-chip identified a number of novel direct FabR targets (the intergenic regions between hpaR/hpaG and ddg/ydfZ) and yet putative direct targets (i.a. genes involved in tRNA metabolism, ribosome synthesis and translation). Next to UFA biosynthesis, a number of these direct targets and other indirect targets identified by transcriptomics (e.g. ribosomal genes, ompA, ompC, ompX, osmB, osmC, sseI), could possibly contribute to the effect of FabR on biofilm formation. Conclusion Overall, our results point at the importance of FabR and UFA biosynthesis in Salmonella biofilm formation and their role as potential targets for biofilm inhibitory strategies. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2387-x) contains supplementary material, which is available to authorized users.
|
17
|
Lis M, Walther D. The orientation of transcription factor binding site motifs in gene promoter regions: does it matter? BMC Genomics 2016; 17:185. [PMID: 26939991 PMCID: PMC4778318 DOI: 10.1186/s12864-016-2549-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 10/19/2015] [Accepted: 02/27/2016] [Indexed: 12/23/2022]
Abstract
Background Gene expression is to a large degree regulated by the specific binding of protein transcription factors to cis-regulatory transcription factor binding sites in gene promoter regions. Despite the identification of hundreds of binding site sequence motifs, the question as to whether motif orientation matters with regard to the gene expression regulation of the respective downstream genes appears surprisingly underinvestigated. Results We pursued a statistical approach by probing 293 reported non-palindromic transcription factor binding site motifs and ten core promoter motifs in Arabidopsis thaliana for evidence of any relevance of motif orientation based on mapping statistics and effects on the co-regulation of gene expression of the respective downstream genes. Although positional intervals closer to the transcription start site (TSS) were found with increased frequencies of motifs exhibiting orientation preference, a corresponding effect with regard to gene expression regulation, as evidenced by increased co-expression of genes harboring the favored orientation in their upstream sequence, could not be established. Furthermore, we identified an intrinsic orientational asymmetry of sequence regions close to the TSS as the likely source of the identified motif orientation preferences. By contrast, motif presence irrespective of orientation was found to be associated with pronounced effects on gene expression co-regulation, validating the pursued approach. Inspecting motif pairs revealed statistically preferred orientational arrangements, but no consistent effect with regard to arrangement-dependent gene expression regulation was evident. 
Conclusions Our results suggest that for the motifs considered here, either no specific orientation rendering them functional across all their instances exists with orientational requirements instead depending on gene-locus specific additional factors, or that the binding orientation of transcription factors may generally not be relevant, but rather the event of binding itself. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2549-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Monika Lis
- Max Planck Institute for Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany.
| | - Dirk Walther
- Max Planck Institute for Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany.
| |
|
18
|
Wang D, Thakker C, Liu P, Bennett GN, San KY. Efficient production of free fatty acids from soybean meal carbohydrates. Biotechnol Bioeng 2015; 112:2324-33. [PMID: 25943383 DOI: 10.1002/bit.25633] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/16/2015] [Revised: 04/05/2015] [Accepted: 04/29/2015] [Indexed: 11/09/2022]
Abstract
Conversion of biomass feedstock to chemicals and fuels has attracted increasing attention recently. Soybean meal, containing significant quantities of carbohydrates, is an inexpensive renewable feedstock. Glucose, galactose, and fructose can be obtained by enzymatic hydrolysis of soluble carbohydrates of soybean meal. Free fatty acids (FFAs) are valuable molecules that can be used as precursors for the production of fuels and other value-added chemicals. In this study, free fatty acids were produced by mutant Escherichia coli strains with plasmid pXZ18Z (carrying acyl-ACP thioesterase (TE) and (3R)-hydroxyacyl-ACP dehydratase) using individual sugars, sugar mixtures, and enzymatically hydrolyzed soybean meal extract. For individual sugar fermentations, strain ML211 (MG1655 fadD(-) fabR(-))/pXZ18Z showed the best performance, producing 4.22, 3.79, and 3.49 g/L free fatty acids on glucose, fructose, and galactose, respectively. Although strain ML211/pXZ18Z performed best with individual sugars, for sugar mixture fermentation the triple mutant strain XZK211 (MG1655 fadD(-) fabR(-) ptsG(-))/pXZ18Z, with an additional deletion of ptsG encoding the glucose-specific transporter, performed best owing to relieved catabolite repression. This strain produced approximately 3.18 g/L of fatty acids with a yield of 0.22 g fatty acids/g total sugar. Maximum free fatty acid production of 2.78 g/L with a high yield of 0.21 g/g was achieved using soybean meal extract hydrolysate. The results suggested that soybean meal carbohydrates after enzymatic treatment could serve as an inexpensive feedstock for the efficient production of free fatty acids.
Affiliation(s)
- Dan Wang
- Department of Bioengineering, Rice University, 6100 Main Street, MS-362, Houston, Texas, 77005-1892
- College of Chemistry and Chemical Engineering, Chongqing University, Chongqing, P. R. China
| | | | - Ping Liu
- Department of Bioengineering, Rice University, 6100 Main Street, MS-362, Houston, Texas, 77005-1892
| | | | - Ka-Yiu San
- Department of Bioengineering, Rice University, 6100 Main Street, MS-362, Houston, Texas, 77005-1892.
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas.
| |
|
19
|
Abstract
The pathways in Escherichia coli and (largely by analogy) S. enterica remain the paradigm of bacterial lipid synthetic pathways, although recently considerable diversity among bacteria in the specific areas of lipid synthesis has been demonstrated. The structural biology of the fatty acid synthetic proteins is essentially complete. However, the membrane-bound enzymes of phospholipid synthesis remain recalcitrant to structural analyses. Recent advances in genetic technology have allowed the essential genes of lipid synthesis to be tested with rigor, and as expected most genes are essential under standard growth conditions. Conditionally lethal mutants are available in numerous genes, which facilitates physiological analyses. The array of genetic constructs facilitates analysis of the functions of genes from other organisms. Advances in mass spectroscopy have allowed very accurate and detailed analyses of lipid compositions as well as detection of the interactions of lipid biosynthetic proteins with one another and with proteins outside the lipid pathway. The combination of these advances has resulted in use of E. coli and S. enterica for discovery of new antimicrobials targeted to lipid synthesis and in deciphering the molecular actions of known antimicrobials. Finally, roles for bacterial fatty acids other than as membrane lipid structural components have been uncovered. For example, fatty acid synthesis plays major roles in the synthesis of the essential enzyme cofactors, biotin and lipoic acid. Although other roles for bacterial fatty acids, such as synthesis of acyl-homoserine quorum-sensing molecules, are not native to E. coli, upon introduction of the relevant gene(s) synthesis of these foreign molecules readily proceeds, and the sophisticated tools available can be used to decipher the mechanisms of synthesis of these molecules.
|
20
|
Abstract
Motivation: The construction of statistics for summarizing posterior samples returned by a Bayesian phylogenetic study has so far been hindered by the poor geometric insights available into the space of phylogenetic trees, and ad hoc methods such as the derivation of a consensus tree make up for the ill-definition of the usual concepts of posterior mean, while bootstrap methods mitigate the absence of a sound concept of variance. Yielding satisfactory results with sufficiently concentrated posterior distributions, such methods fall short of providing a faithful summary of posterior distributions if the data do not offer compelling evidence for a single topology. Results: Building upon previous work of Billera et al., summary statistics such as sample mean, median and variance are defined as the Fréchet mean, geometric median and variance, respectively. Their computation is enabled by recently published works, and embeds an algorithm for computing shortest paths in the space of trees. Studying the phylogeny of a set of plants, where several tree topologies occur in the posterior sample, the posterior mean balances correctly the contributions from the different topologies, where a consensus tree would be biased. Comparisons of the posterior mean, median and consensus trees with the ground truth using simulated data also reveal the benefits of a sound averaging method when reconstructing phylogenetic trees. Availability and implementation: We provide two independent implementations of the algorithm for computing Fréchet means, geometric medians and variances in the space of phylogenetic trees. TFBayes: https://github.com/pbenner/tfbayes, TrAP: https://github.com/bacak/TrAP. Contact: philipp.benner@mis.mpg.de
Affiliation(s)
- Philipp Benner
- Max-Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany and Isthmus SARL, 75002 Paris, France
| | - Miroslav Bačák
- Max-Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany and Isthmus SARL, 75002 Paris, France
| | - Pierre-Yves Bourguignon
- Max-Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany and Isthmus SARL, 75002 Paris, France
| |
|
21
|
Luo Q, Shi M, Ren Y, Gao H. Transcription factors FabR and FadR regulate both aerobic and anaerobic pathways for unsaturated fatty acid biosynthesis in Shewanella oneidensis. Front Microbiol 2014; 5:736. [PMID: 25566241 PMCID: PMC4273635 DOI: 10.3389/fmicb.2014.00736] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 11/17/2014] [Accepted: 12/05/2014] [Indexed: 01/01/2023]
Abstract
As genes for type II fatty acid synthesis are essential to the growth of Escherichia coli, its sole (anaerobic) pathway has significant potential as a target for novel antibacterial drugs, and has been extensively studied. Despite this, we still know surprisingly little about fatty acid synthesis in bacteria because this anaerobic pathway in fact is not widely distributed. In this study, we show a novel model of unsaturated fatty acid (UFA) synthesis in Shewanella, emerging human pathogens in addition to well-known metal reducers. We identify both anaerobic and aerobic UFA biosynthesis pathways in the representative species, S. oneidensis. Uniquely, the bacterium also contains two regulators, FabR and FadR, whose counterparts in other bacteria control the anaerobic pathway. However, we show that in S. oneidensis these two regulators are involved in regulation of both pathways, in either a direct or an indirect manner. Overall, our results indicate that UFA biosynthesis and its regulation are far more complex than previously expected, and S. oneidensis serves as a good research model for further work.
Affiliation(s)
- Qixia Luo
- Institute of Microbiology and College of Life Sciences, Zhejiang University Hangzhou, China
| | - Miaomiao Shi
- Institute of Microbiology and College of Life Sciences, Zhejiang University Hangzhou, China
| | - Yedan Ren
- Institute of Microbiology and College of Life Sciences, Zhejiang University Hangzhou, China
| | - Haichun Gao
- Institute of Microbiology and College of Life Sciences, Zhejiang University Hangzhou, China
| |
|
22
|
Fernandez L, Mercader JM, Planas-Fèlix M, Torrents D. Adaptation to environmental factors shapes the organization of regulatory regions in microbial communities. BMC Genomics 2014; 15:877. [PMID: 25294412 PMCID: PMC4287501 DOI: 10.1186/1471-2164-15-877] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/04/2014] [Accepted: 09/24/2014] [Indexed: 11/10/2022]
Abstract
Background It has been shown in a number of metagenomic studies that the addition and removal of specific genes have allowed microbiomes to adapt to specific environmental conditions by losing and gaining specific functions. But it is not known whether and how the regulation of gene expression also contributes to adaptation. Results We have here characterized and analyzed the metaregulome of three different environments, as well as their impact in the adaptation to particular variable physico-chemical conditions. For this, we have developed a computational protocol to extract regulatory regions and their corresponding transcription factors binding sites directly from metagenomic reads and applied it to three well known environments: Acid Mine, Whale Fall, and Waseca Farm. Taking the density of regulatory sites in promoters as a measure of the potential and complexity of gene regulation, we found it to be quantitatively the same in all three environments, despite their different physico-chemical conditions and species composition. However, we found that each environment distributes their regulatory potential differently across their functional space. Among the functions with highest regulatory potential in each niche, we found significant enrichment of processes related to sensing and buffering external variable factors specific to each environment, like for example, the availability of co-factors in deep sea, of oligosaccharides in soil and the regulation of pH in the acid mine. Conclusions These results highlight the potential impact of gene regulation in the adaptation of bacteria to the different habitats through the distribution of their regulatory potential among specific functions, and point to critical environmental factors that challenge the growth of any microbial community. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-877) contains supplementary material, which is available to authorized users.
Affiliation(s)
| | | | | | - David Torrents
- Joint IRB-BSC program on Computational Biology, BSC, Jordi Girona, 29, 08034 Barcelona, Spain.
| |
|
23
|
Ravcheev DA, Khoroshkin MS, Laikova ON, Tsoy OV, Sernova NV, Petrova SA, Rakhmaninova AB, Novichkov PS, Gelfand MS, Rodionov DA. Comparative genomics and evolution of regulons of the LacI-family transcription factors. Front Microbiol 2014; 5:294. [PMID: 24966856 PMCID: PMC4052901 DOI: 10.3389/fmicb.2014.00294] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 04/06/2014] [Accepted: 05/28/2014] [Indexed: 12/31/2022]
Abstract
DNA-binding transcription factors (TFs) are essential components of transcriptional regulatory networks in bacteria. LacI-family TFs (LacI-TFs) are broadly distributed among certain lineages of bacteria. The majority of characterized LacI-TFs sense sugar effectors and regulate carbohydrate utilization genes. The comparative genomics approaches enable in silico identification of TF-binding sites and regulon reconstruction. To study the function and evolution of LacI-TFs, we performed genomics-based reconstruction and comparative analysis of their regulons. For over 1300 LacI-TFs from over 270 bacterial genomes, we predicted their cognate DNA-binding motifs and identified target genes. Using the genome context and metabolic subsystem analyses of reconstructed regulons, we tentatively assigned functional roles and predicted candidate effectors for 78 and 67% of the analyzed LacI-TFs, respectively. Nearly 90% of the studied LacI-TFs are local regulators of sugar utilization pathways, whereas the remaining 125 global regulators control large and diverse sets of metabolic genes. The global LacI-TFs include the previously known regulators CcpA in Firmicutes, FruR in Enterobacteria, and PurR in Gammaproteobacteria, as well as the three novel regulators—GluR, GapR, and PckR—that are predicted to control the central carbohydrate metabolism in three lineages of Alphaproteobacteria. Phylogenetic analysis of regulators combined with the reconstructed regulons provides a model of evolutionary diversification of the LacI protein family. The obtained genomic collection of in silico reconstructed LacI-TF regulons in bacteria is available in the RegPrecise database (http://regprecise.lbl.gov). It provides a framework for future structural and functional classification of the LacI protein family and identification of molecular determinants of the DNA and ligand specificity. 
The inferred regulons can be also used for functional gene annotation and reconstruction of sugar catabolic networks in diverse bacterial lineages.
Affiliation(s)
- Dmitry A Ravcheev
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia
| | - Matvei S Khoroshkin
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia
| | - Olga N Laikova
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia
| | - Olga V Tsoy
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia; Faculty of Bioengineering and Bioinformatics, Moscow State University Moscow, Russia
| | - Natalia V Sernova
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia
| | - Svetlana A Petrova
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia; Faculty of Bioengineering and Bioinformatics, Moscow State University Moscow, Russia
| | | | - Pavel S Novichkov
- Lawrence Berkeley National Laboratory, Genomics Division Berkeley, CA, USA
| | - Mikhail S Gelfand
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia
| | - Dmitry A Rodionov
- Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences Moscow, Russia; Department of Bioinformatics, Sanford-Burnham Medical Research Institute La Jolla, CA, USA
| |
|
24
|
An improved systematic approach to predicting transcription factor target genes using support vector machine. PLoS One 2014; 9:e94519. [PMID: 24743548 PMCID: PMC3990533 DOI: 10.1371/journal.pone.0094519] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 10/11/2012] [Accepted: 03/17/2014] [Indexed: 11/21/2022]
Abstract
Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes a great contribution to understanding gene regulatory networks. However, these approaches are based on laborious and time-consuming biological experiments. Numerous computational approaches have shown great potential to circumvent laborious biological methods. However, the majority of these algorithms provide limited performance and fail to consider the structural properties of the datasets. We proposed a refined systematic computational approach for predicting TFTGs. Based on previous work on identifying auxin response factor target genes from Arabidopsis thaliana co-expression data, we adopted a novel reverse-complementary distance-sensitive n-gram profile algorithm. This algorithm converts each upstream sub-sequence into a high-dimensional vector data point and transforms the prediction task into a classification problem using a support vector machine-based classifier. Our approach showed significant improvement compared to other computational methods based on the area under the curve value of the receiver operating characteristic curve using 10-fold cross validation. In addition, in light of the highly skewed structure of the dataset, we also evaluated other metrics and their associated curves, such as precision-recall curves and cost curves, which provided highly satisfactory results.
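The idea of turning an upstream sequence into a strand-independent n-gram vector can be sketched roughly as follows. This is not the algorithm from the paper: it omits the distance-sensitive component entirely and shows only one common way to make n-gram counts reverse-complement aware, by counting each n-gram together with its reverse complement. The function names are hypothetical.

```python
from collections import Counter

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmer_profile(seq, k):
    """Count k-mers, merging each k-mer with its reverse complement.

    The lexicographically smaller of the two is used as the shared key,
    so a sequence and its reverse complement give the same profile.
    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    profile = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            profile[min(kmer, reverse_complement(kmer))] += 1
    return profile


# Hypothetical upstream fragment, for illustration only.
print(canonical_kmer_profile("TTGACA", 3))
```

The resulting counts could then be laid out in a fixed key order to form the feature vector passed to a classifier such as a support vector machine.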
|
25
|
Fukasawa Y, Leung RKK, Tsui SKW, Horton P. Plus ça change - evolutionary sequence divergence predicts protein subcellular localization signals. BMC Genomics 2014; 15:46. [PMID: 24438075 PMCID: PMC3906766 DOI: 10.1186/1471-2164-15-46] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/31/2013] [Accepted: 01/06/2014] [Indexed: 12/29/2022]
Abstract
BACKGROUND Protein subcellular localization is a central problem in understanding cell biology and has been the focus of intense research. In order to predict localization from amino acid sequence a myriad of features have been tried: including amino acid composition, sequence similarity, the presence of certain motifs or domains, and many others. Surprisingly, sequence conservation of sorting motifs has not yet been employed, despite its extensive use for tasks such as the prediction of transcription factor binding sites. RESULTS Here, we flip the problem around, and present a proof of concept for the idea that the lack of sequence conservation can be a novel feature for localization prediction. We show that for yeast, mammal and plant datasets, evolutionary sequence divergence alone has significant power to identify sequences with N-terminal sorting sequences. Moreover sequence divergence is nearly as effective when computed on automatically defined ortholog sets as on hand curated ones. Unfortunately, sequence divergence did not necessarily increase classification performance when combined with some traditional sequence features such as amino acid composition. However a post-hoc analysis of the proteins in which sequence divergence changes the prediction yielded some proteins with atypical (i.e. not MPP-cleaved) matrix targeting signals as well as a few misannotations. CONCLUSION We report the results of the first quantitative study of the effectiveness of evolutionary sequence divergence as a feature for protein subcellular localization prediction. We show that divergence is indeed useful for prediction, but it is not trivial to improve overall accuracy simply by adding this feature to classical sequence features. Nevertheless we argue that sequence divergence is a promising feature and show anecdotal examples in which it succeeds where other features fail.
Affiliation(s)
- Yoshinori Fukasawa
- Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Japan
- Japan Society for the Promotion of Science, Tokyo Chiyoda, Japan
- Ross KK Leung
- Hong Kong Bioinformatics Centre and School of Biomedical Sciences, Chinese University of Hong Kong, Shatin, China
- Stephen KW Tsui
- Hong Kong Bioinformatics Centre and School of Biomedical Sciences, Chinese University of Hong Kong, Shatin, China
- Paul Horton
- Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Japan
- Computational Biology Research Center, Advanced Industrial Science and Technology, Tokyo, Japan
|
26
|
Janßen HJ, Steinbüchel A. Fatty acid synthesis in Escherichia coli and its applications towards the production of fatty acid based biofuels. Biotechnology for Biofuels 2014; 7:7. [PMID: 24405789 PMCID: PMC3896788 DOI: 10.1186/1754-6834-7-7] [Citation(s) in RCA: 181] [Impact Index Per Article: 18.1] [Received: 10/11/2013] [Accepted: 12/24/2013] [Indexed: 05/04/2023]
Abstract
The idea of renewable and regenerative resources has inspired research for more than a hundred years. Ideally, the only spent energy will replenish itself, like plant material, sunlight, thermal energy or wind. Biodiesel or ethanol are examples, since their production relies mainly on plant material. However, it has become apparent that crop derived biofuels will not be sufficient to satisfy future energy demands. Thus, especially in the last decade a lot of research has focused on the production of next generation biofuels. A major subject of these investigations has been the microbial fatty acid biosynthesis with the aim to produce fatty acids or derivatives for substitution of diesel. As an industrially important organism and with the best studied microbial fatty acid biosynthesis, Escherichia coli has been chosen as producer in many of these studies and several reviews have been published in the fields of E. coli fatty acid biosynthesis or biofuels. However, most reviews discuss only one of these topics in detail, despite the fact, that a profound understanding of the involved enzymes and their regulation is necessary for efficient genetic engineering of the entire pathway. The first part of this review aims at summarizing the knowledge about fatty acid biosynthesis of E. coli and its regulation, and it provides the connection towards the production of fatty acids and related biofuels. The second part gives an overview about the achievements by genetic engineering of the fatty acid biosynthesis towards the production of next generation biofuels. Finally, the actual importance and potential of fatty acid-based biofuels will be discussed.
Affiliation(s)
- Helge Jans Janßen
- Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, Corrensstrasse 3, D-48149, Münster, Germany
- Alexander Steinbüchel
- Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, Corrensstrasse 3, D-48149, Münster, Germany
- Environmental Sciences Department, King Abdulaziz University, Jeddah, Saudi Arabia
|
27
|
Chen R, Peng Y, Choi B, Xu J, Hu H. A private DNA motif finding algorithm. J Biomed Inform 2014; 50:122-32. [PMID: 24412700 DOI: 10.1016/j.jbi.2013.12.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/25/2013] [Revised: 11/25/2013] [Accepted: 12/29/2013] [Indexed: 10/25/2022]
Abstract
With the increasing availability of genomic sequence data, numerous methods have been proposed for finding DNA motifs. The discovery of DNA motifs serves a critical step in many biological applications. However, the privacy implication of DNA analysis is normally neglected in the existing methods. In this work, we propose a private DNA motif finding algorithm in which a DNA owner's privacy is protected by a rigorous privacy model, known as ∊-differential privacy. It provides provable privacy guarantees that are independent of adversaries' background knowledge. Our algorithm makes use of the n-gram model and is optimized for processing large-scale DNA sequences. We evaluate the performance of our algorithm over real-life genomic data and demonstrate the promise of integrating privacy into DNA motif finding.
Affiliation(s)
- Rui Chen
- Department of Computer Science, Hong Kong Baptist University, Hong Kong.
- Yun Peng
- School of Computer Engineering, Nanyang Technological University, Singapore.
- Byron Choi
- Department of Computer Science, Hong Kong Baptist University, Hong Kong.
- Jianliang Xu
- Department of Computer Science, Hong Kong Baptist University, Hong Kong.
- Haibo Hu
- Department of Computer Science, Hong Kong Baptist University, Hong Kong.
|
28
|
Ravcheev DA, Godzik A, Osterman AL, Rodionov DA. Polysaccharides utilization in human gut bacterium Bacteroides thetaiotaomicron: comparative genomics reconstruction of metabolic and regulatory networks. BMC Genomics 2013; 14:873. [PMID: 24330590 PMCID: PMC3878776 DOI: 10.1186/1471-2164-14-873] [Citation(s) in RCA: 102] [Impact Index Per Article: 9.3] [Received: 09/06/2013] [Accepted: 12/06/2013] [Indexed: 01/14/2023]
Abstract
Background Bacteroides thetaiotaomicron, a predominant member of the human gut microbiota, is characterized by its ability to utilize a wide variety of polysaccharides using the extensive saccharolytic machinery that is controlled by an expanded repertoire of transcription factors (TFs). The availability of genomic sequences for multiple Bacteroides species opens an opportunity for their comparative analysis to enable characterization of their metabolic and regulatory networks. Results A comparative genomics approach was applied for the reconstruction and functional annotation of the carbohydrate utilization regulatory networks in 11 Bacteroides genomes. Bioinformatics analysis of promoter regions revealed putative DNA-binding motifs and regulons for 31 orthologous TFs in the Bacteroides. Among the analyzed TFs there are 4 SusR-like regulators, 16 AraC-like hybrid two-component systems (HTCSs), and 11 regulators from other families. Novel DNA motifs of HTCSs and SusR-like regulators in the Bacteroides have the common structure of direct repeats with a long spacer between two conserved sites. Conclusions The inferred regulatory network in B. thetaiotaomicron contains 308 genes encoding polysaccharide and sugar catabolic enzymes, carbohydrate-binding and transport systems, and TFs. The analyzed TFs control pathways for utilization of host and dietary glycans to monosaccharides and their further interconversions to intermediates of the central metabolism. The reconstructed regulatory network allowed us to suggest and refine specific functional assignments for sugar catabolic enzymes and transporters, providing a substantial improvement to the existing metabolic models for B. thetaiotaomicron. The obtained collection of reconstructed TF regulons is available in the RegPrecise database (http://regprecise.lbl.gov).
Affiliation(s)
- Dmitry A Rodionov
- Sanford-Burnham Medical Research Institute, La Jolla, California 92037, USA.
|
29
|
Novichkov PS, Kazakov AE, Ravcheev DA, Leyn SA, Kovaleva GY, Sutormin RA, Kazanov MD, Riehl W, Arkin AP, Dubchak I, Rodionov DA. RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria. BMC Genomics 2013; 14:745. [PMID: 24175918 PMCID: PMC3840689 DOI: 10.1186/1471-2164-14-745] [Citation(s) in RCA: 279] [Impact Index Per Article: 25.4] [Received: 06/29/2013] [Accepted: 10/28/2013] [Indexed: 11/27/2022]
Abstract
Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). Description RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes.
Conclusions RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.
|
30
|
Shelokar P, Quirin A, Cordón Ó. A multiobjective evolutionary programming framework for graph-based data mining. Inf Sci (N Y) 2013. [DOI: 10.1016/j.ins.2013.02.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
|
31
|
Parsons JB, Rock CO. Bacterial lipids: metabolism and membrane homeostasis. Prog Lipid Res 2013; 52:249-76. [PMID: 23500459 PMCID: PMC3665635 DOI: 10.1016/j.plipres.2013.02.002] [Citation(s) in RCA: 307] [Impact Index Per Article: 27.9] [Received: 01/18/2013] [Revised: 02/27/2013] [Accepted: 02/28/2013] [Indexed: 11/29/2022]
Abstract
Membrane lipid homeostasis is a vital facet of bacterial cell physiology. For decades, research in bacterial lipid synthesis was largely confined to the Escherichia coli model system. This basic research provided a blueprint for the biochemistry of lipid metabolism that has largely defined the individual steps in bacterial fatty acid and phospholipids synthesis. The advent of genomic sequencing has revealed a surprising amount of diversity in the genes, enzymes and genetic organization of the components responsible for bacterial lipid synthesis. Although the chemical steps in fatty acid synthesis are largely conserved in bacteria, there are surprising differences in the structure and cofactor requirements for the enzymes that perform these reactions in Gram-positive and Gram-negative bacteria. This review summarizes how the explosion of new information on the diversity of biochemical and genetic regulatory mechanisms has impacted our understanding of bacterial lipid homeostasis. The potential and problems of developing therapeutics that block pathogen phospholipid synthesis are explored and evaluated. The study of bacterial lipid metabolism continues to be a rich source for new biochemistry that underlies the variety and adaptability of bacterial life styles.
Affiliation(s)
- Joshua B Parsons
- Department of Infectious Diseases, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105, USA
|
32
|
Jiang B, Liu JS, Bulyk ML. Bayesian hierarchical model of protein-binding microarray k-mer data reduces noise and identifies transcription factor subclasses and preferred k-mers. Bioinformatics 2013; 29:1390-8. [PMID: 23559638 DOI: 10.1093/bioinformatics/btt152] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 01/02/2023]
Abstract
MOTIVATION Sequence-specific transcription factors (TFs) regulate the expression of their target genes through interactions with specific DNA-binding sites in the genome. Data on TF-DNA binding specificities are essential for understanding how regulatory specificity is achieved. RESULTS Numerous studies have used universal protein-binding microarray (PBM) technology to determine the in vitro binding specificities of hundreds of TFs for all possible 8 bp sequences (8mers). We have developed a Bayesian analysis of variance (ANOVA) model that decomposes these 8mer data into background noise, TF familywise effects and effects due to the particular TF. Adjusting for background noise improves PBM data quality and concordance with in vivo TF binding data. Moreover, our model provides simultaneous identification of TF subclasses and their shared sequence preferences, and also of 8mers bound preferentially by individual members of TF subclasses. Such results may aid in deciphering cis-regulatory codes and determinants of protein-DNA binding specificity. AVAILABILITY AND IMPLEMENTATION Source code, compiled code and R and Python scripts are available from http://thebrain.bwh.harvard.edu/hierarchicalANOVA. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bo Jiang
- Department of Statistics, Harvard University, Cambridge, MA 02138, USA.
|
33
|
Faria JP, Overbeek R, Xia F, Rocha M, Rocha I, Henry CS. Genome-scale bacterial transcriptional regulatory networks: reconstruction and integrated analysis with metabolic models. Brief Bioinform 2013; 15:592-611. [DOI: 10.1093/bib/bbs071] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 01/09/2023]
|
34
|
Schweizer HP, Choi KH. Characterization of molecular mechanisms controlling fabAB transcription in Pseudomonas aeruginosa. PLoS One 2012; 7:e45646. [PMID: 23056212 PMCID: PMC3462791 DOI: 10.1371/journal.pone.0045646] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/16/2012] [Accepted: 08/24/2012] [Indexed: 11/18/2022]
Abstract
BACKGROUND The FabAB pathway is one of the unsaturated fatty acid (UFA) synthesis pathways for Pseudomonas aeruginosa. It was previously noted that this operon was upregulated in biofilms and repressed by exogenous UFAs. Deletion of a 30 nt fabA upstream sequence, which is conserved in P. aeruginosa, P. putida, and P. syringae, led to a significant decrease in fabA transcription, suggesting positive regulation by an unknown mechanism. METHODS/PRINCIPAL FINDINGS Here, genetic and biochemical approaches were employed to identify a potential fabAB activator. Deletion of candidate genes such as PA1611 or PA1627 was performed to determine if any of these gene products act as a fabAB activator. However, none of these genes were involved in the regulation of fabAB transcription. Use of mariner-based random mutagenesis to screen for fabA activator(s) showed that several genes encoding unknown functions, rpoN and DesA may be involved in fabA regulation, but probably via indirect mechanisms. Biochemical attempts failed to isolate an activator of the fabAB operon. CONCLUSION/SIGNIFICANCE The data suggest that fabA expression might not be regulated by protein-binding, but by a distinct mechanism such as a regulatory RNA-based mechanism.
MESH Headings
- 3-Oxoacyl-(Acyl-Carrier-Protein) Synthase/genetics
- 3-Oxoacyl-(Acyl-Carrier-Protein) Synthase/metabolism
- 5' Untranslated Regions/genetics
- Amino Acid Sequence
- Bacterial Proteins/genetics
- Bacterial Proteins/metabolism
- Base Sequence
- DNA Transposable Elements/genetics
- Fatty Acid Synthase, Type II/genetics
- Fatty Acid Synthase, Type II/metabolism
- Fatty Acids, Unsaturated/metabolism
- Gene Expression Regulation, Bacterial
- Hydro-Lyases/genetics
- Hydro-Lyases/metabolism
- Molecular Sequence Data
- Mutagenesis, Insertional
- Nucleic Acid Conformation
- Operon
- Promoter Regions, Genetic/genetics
- Pseudomonas aeruginosa/genetics
- Pseudomonas aeruginosa/metabolism
- Pseudomonas putida/genetics
- Pseudomonas putida/metabolism
- Pseudomonas syringae/genetics
- Pseudomonas syringae/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Bacterial/metabolism
- Regulatory Sequences, Nucleic Acid/genetics
- Reverse Transcriptase Polymerase Chain Reaction
- Trans-Activators/genetics
- Trans-Activators/metabolism
- Transcription, Genetic/genetics
Affiliation(s)
- Herbert P. Schweizer
- Department of Microbiology, Immunology, and Pathology, IDRC at Foothills Campus, Colorado State University, Fort Collins, Colorado, United States of America
- Kyoung-Hee Choi
- Department of Oral Microbiology, College of Dentistry, Wonkwang University, Iksan, Chonbuk, South Korea
|
35
|
Katara P, Grover A, Sharma V. Phylogenetic footprinting: a boost for microbial regulatory genomics. Protoplasma 2012; 249:901-907. [PMID: 22113593 DOI: 10.1007/s00709-011-0351-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 07/27/2011] [Accepted: 11/09/2011] [Indexed: 05/31/2023]
Abstract
Phylogenetic footprinting is a method for the discovery of regulatory elements in a set of homologous regulatory regions, usually collected from multiple species. It does so by identifying the best conserved motifs in those homologous regions. Two popular sets of methods, alignment-based and motif-based, are generally employed for phylogenetic footprinting. However, there has been no serious effort to develop a tool exclusively for phylogenetic footprinting based on either of these methods. Nevertheless, a number of software tools exist that can be applied to phylogenetic footprinting with a variable degree of success. The output from these tools may be affected by a number of factors associated with the current state of knowledge, techniques and other resources available. We here present a critical appraisal of various phylogenetic approaches with reference to prokaryotes, outlining the available resources and discussing various factors affecting footprinting, in order to give a clear idea of the proper use of this approach on prokaryotes.
Affiliation(s)
- Pramod Katara
- Department of Bioscience and Biotechnology, Banasthali University, Banasthali, 304022, India.
|
36
|
Xu F, Park MR, Kitazumi A, Herath V, Mohanty B, Yun SJ, de los Reyes BG. Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints. BMC Genomics 2012; 13:497. [PMID: 22992304 PMCID: PMC3522565 DOI: 10.1186/1471-2164-13-497] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 07/02/2012] [Accepted: 09/14/2012] [Indexed: 01/10/2023]
Abstract
Background The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. 
Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage-specific or species-specific regulatory fine-tuners. This synergy may be critical for finer-scale spatio-temporal regulation, hence unique expression profiles of homologous transcription factors from different species with distinct zones of ecological adaptation such as rice, sorghum and Arabidopsis. The patterns revealed from these comparisons set the stage for further empirical validation by functional genomics.
Affiliation(s)
- Fuyu Xu
- School of Biology and Ecology, University of Maine, 5735 Hitchner Hall, Orono, ME 04469, USA
|
37
|
Ihuegbu NE, Stormo GD, Buhler J. Fast, sensitive discovery of conserved genome-wide motifs. J Comput Biol 2012; 19:139-47. [PMID: 22300316 DOI: 10.1089/cmb.2011.0249] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023]
Abstract
Regulatory sites that control gene expression are essential to the proper functioning of cells, and identifying them is critical for modeling regulatory networks. We have developed Magma (Multiple Aligner of Genomic Multiple Alignments), a software tool for multiple species, multiple gene motif discovery. Magma identifies putative regulatory sites that are conserved across multiple species and occur near multiple genes throughout a reference genome. Magma takes as input multiple alignments that can include gaps. It uses efficient clustering methods that make it about 70 times faster than PhyloNet, a previous program for this task, with slightly greater sensitivity. We ran Magma on all non-coding DNA conserved between Caenorhabditis elegans and five additional species, about 70 Mbp in total, in <4 h. We obtained 2,309 motifs with lengths of 6-20 bp, each occurring at least 10 times throughout the genome, which collectively covered about 566 kbp of the genomes, approximately 0.8% of the input. Predicted sites occurred in all types of non-coding sequence but were especially enriched in the promoter regions. Comparisons to several experimental datasets show that Magma motifs correspond to a variety of known regulatory motifs.
Affiliation(s)
- Nnamdi E Ihuegbu
- Department of Genetics, Washington University School of Medicine, Saint Louis, Missouri 63108, USA
|
38
|
Ishihama A. Prokaryotic genome regulation: a revolutionary paradigm. Proceedings of the Japan Academy. Series B, Physical and Biological Sciences 2012; 88:485-508. [PMID: 23138451 PMCID: PMC3511978 DOI: 10.2183/pjab.88.485] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Received: 05/31/2012] [Accepted: 08/31/2012] [Indexed: 06/01/2023]
Abstract
After determination of the whole genome sequence, the research frontier of bacterial molecular genetics has shifted to reveal the genome regulation under stressful conditions in nature. The gene selectivity of RNA polymerase is modulated after interaction with two groups of regulatory proteins, 7 sigma factors and 300 transcription factors. For identification of regulation targets of transcription factors in Escherichia coli, we have developed Genomic SELEX system and subjected to screening the binding sites of these factors on the genome. The number of regulation targets by a single transcription factor was more than those hitherto recognized, ranging up to hundreds of promoters. The number of transcription factors involved in regulation of a single promoter also increased to as many as 30 regulators. The multi-target transcription factors and the multi-factor promoters were assembled into complex networks of transcription regulation. The most complex network was identified in the regulation cascades of transcription of two master regulators for planktonic growth and biofilm formation.
Affiliation(s)
- Akira Ishihama
- Department of Frontier Bioscience and Micro-Nano Technology Research Center, Hosei University, Koganei, Tokyo 184-8584, Japan.
|
39
|
Finding Transcription Factor Binding Motifs for Coregulated Genes by Combining Sequence Overrepresentation with Cross-Species Conservation. Journal of Probability and Statistics 2012. [DOI: 10.1155/2012/830575] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
Abstract
Novel computational methods for finding transcription factor binding motifs have long been sought due to tedious work of experimentally identifying them. However, the current prevailing methods yield a large number of false positive predictions due to the short, variable nature of transcriptional factor binding sites (TFBSs). We proposed here a method that combines sequence overrepresentation and cross-species sequence conservation to detect TFBSs in upstream regions of a given set of coregulated genes. We applied the method to 35 S. cerevisiae transcriptional factors with known DNA binding motifs (with the support of orthologous sequences from genomes of S. mikatae, S. bayanus, and S. paradoxus), and the proposed method outperformed the single-genome-based motif finding methods MEME and AlignACE as well as the multiple-genome-based methods PHYME and Footprinter for the majority of these transcriptional factors. Compared with the prevailing motif finding software, our method has some advantages in finding transcriptional factor binding motifs for potential coregulated genes if the gene upstream sequences of multiple closely related species are available. Although we used yeast genomes to assess our method in this study, it might also be applied to other organisms if suitable related species are available and the upstream sequences of coregulated genes can be obtained for the multiple closely related species.
|
40
|
Göhler AK, Kökpinar Ö, Schmidt-Heck W, Geffers R, Guthke R, Rinas U, Schuster S, Jahreis K, Kaleta C. More than just a metabolic regulator--elucidation and validation of new targets of PdhR in Escherichia coli. BMC Systems Biology 2011; 5:197. [PMID: 22168595 PMCID: PMC3265435 DOI: 10.1186/1752-0509-5-197] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 08/11/2011] [Accepted: 12/14/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND The pyruvate dehydrogenase regulator protein (PdhR) of Escherichia coli acts as a transcriptional regulator in a pyruvate dependent manner to control central metabolic fluxes. However, the complete PdhR regulon has not yet been uncovered. To achieve an extended understanding of its gene regulatory network, we combined large-scale network inference and experimental verification of results obtained by a systems biology approach. RESULTS 22 new genes contained in two operons controlled by PdhR (previously only 20 regulatory targets in eight operons were known) were identified by analysing a large-scale dataset of E. coli from the Many Microbes Microarray Database and novel expression data from a pdhR knockout strain, as well as a PdhR overproducing strain. We identified a regulation of the glycolate utilization operon glcDEFGBA using chromatin immunoprecipitation and gel shift assays. We show that this regulation could be part of a cross-induction between genes necessary for acetate and pyruvate utilisation controlled through PdhR. Moreover, a link of PdhR regulation to the replication machinery of the cell via control of the transcription of the dcw-cluster was verified in experiments. This augments our knowledge of the functions of the PdhR-regulon and demonstrates its central importance for further cellular processes in E. coli. CONCLUSIONS We extended the PdhR regulon by 22 new genes contained in two operons and validated the regulation of the glcDEFGBA operon for glycolate utilisation and the dcw-cluster for cell division proteins experimentally. Our results provide, for the first time, a plausible regulatory link between the nutritional status of the cell and cell replication mediated by PdhR.
|
41
|
Zhang S, Li S, Niu M, Pham PT, Su Z. MotifClick: prediction of cis-regulatory binding sites via merging cliques. BMC Bioinformatics 2011; 12:238. [PMID: 21679436 PMCID: PMC3225181 DOI: 10.1186/1471-2105-12-238] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 12/23/2010] [Accepted: 06/16/2011] [Indexed: 11/21/2022] Open
Abstract
Background Although dozens of algorithms and tools have been developed to find a set of cis-regulatory binding sites called a motif in a set of intergenic sequences using various approaches, most of these tools focus on identifying binding sites that are significantly different from their background sequences. However, some motifs may have a similar nucleotide distribution to that of their background sequences. Therefore, such binding sites can be missed by these tools. Results Here, we present a graph-based polynomial-time algorithm, MotifClick, for the prediction of cis-regulatory binding sites, in particular, those that have a similar nucleotide distribution to that of their background sequences. To find binding sites with length k, we construct a graph using some 2(k-1)-mers in the input sequences as the vertices, and connect two vertices by an edge if the maximum number of matches of the local gapless alignments between the two 2(k-1)-mers is greater than a cutoff value. We identify a motif as a set of similar k-mers from a merged group of maximum cliques associated with some vertices. Conclusions When evaluated on both synthetic and real datasets of prokaryotes and eukaryotes, MotifClick outperforms existing leading motif-finding tools for prediction accuracy and balancing the prediction sensitivity and specificity in general. In particular, when the distribution of nucleotides of binding sites is similar to that of their background sequences, MotifClick is more likely to identify the binding sites than the other tools.
Affiliation(s)
- Shaoqiang Zhang
- Department of Bioinformatics and Genomics, Center for Bioinformatics Research, the University of North Carolina at Charlotte, 28223, USA
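The graph construction described in the MotifClick abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes that every 2(k-1)-mer in the input becomes a vertex (the abstract says only "some"), that a "local gapless alignment" means sliding one string along the other with an overlap of at least k positions, and the names `max_gapless_matches` and `build_graph` are hypothetical. The clique-finding and merging steps are not shown.

```python
from itertools import combinations


def max_gapless_matches(a: str, b: str, min_overlap: int) -> int:
    """Highest match count over all ungapped offsets of b against a.

    Only offsets whose overlap spans at least min_overlap positions
    are considered (an assumption made for this sketch).
    """
    best = 0
    for shift in range(min_overlap - len(b), len(a) - min_overlap + 1):
        start = max(0, shift)                # first overlapping index in a
        end = min(len(a), shift + len(b))    # one past the last overlapping index
        matches = sum(a[i] == b[i - shift] for i in range(start, end))
        best = max(best, matches)
    return best


def build_graph(seqs, k, cutoff):
    """Adjacency sets over all 2(k-1)-mers; edge if best match count > cutoff."""
    w = 2 * (k - 1)
    vertices = sorted({s[i:i + w] for s in seqs for i in range(len(s) - w + 1)})
    graph = {v: set() for v in vertices}
    for u, v in combinations(vertices, 2):
        if max_gapless_matches(u, v, k) > cutoff:
            graph[u].add(v)
            graph[v].add(u)
    return graph


# Toy input: k = 3, so vertices are 4-mers.
graph = build_graph(["ACGTA", "TTACG"], k=3, cutoff=2)
```

In this toy input, `ACGT` and `TACG` share the 3-mer `ACG` at an offset of one position, so they are connected, whereas `CGTA` and `TTAC` reach at most two matches and are not.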
|
42
|
Brohée S, Janky R, Abdel-Sater F, Vanderstocken G, André B, van Helden J. Unraveling networks of co-regulated genes on the sole basis of genome sequences. Nucleic Acids Res 2011; 39:6340-58. [PMID: 21572103 PMCID: PMC3159452 DOI: 10.1093/nar/gkr264] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 02/06/2023] Open
Abstract
With the growing number of available microbial genome sequences, regulatory signals can now be revealed as conserved motifs in promoters of orthologous genes (phylogenetic footprints). A next challenge is to unravel genome-scale regulatory networks. Using as sole input genome sequences, we predicted cis-regulatory elements for each gene of the yeast Saccharomyces cerevisiae by discovering over-represented motifs in the promoters of their orthologs in 19 Saccharomycetes species. We then linked all genes displaying similar motifs in their promoter regions and inferred a co-regulation network including 56,919 links between 3171 genes. Comparison with annotated regulons highlights the high predictive value of the method: a majority of the top-scoring predictions correspond to already known co-regulations. We also show that this inferred network is as accurate as a co-expression network built from hundreds of transcriptome microarray experiments. Furthermore, we experimentally validated 14 among 16 new functional links between orphan genes and known regulons. This approach can be readily applied to unravel gene regulatory networks from hundreds of microbial genomes for which no other information is available except the sequence. Long-term benefits can easily be perceived when considering the exponential increase of new genome sequences.
Affiliation(s)
- Sylvain Brohée
- Lab. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles (ULB), CP 263, Campus Plaine, Bld du Triomphe, 1050 Brussels, Belgium
|
43
|
Feng Y, Cronan JE. Complex binding of the FabR repressor of bacterial unsaturated fatty acid biosynthesis to its cognate promoters. Mol Microbiol 2011; 80:195-218. [PMID: 21276098 DOI: 10.1111/j.1365-2958.2011.07564.x] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.2] [Indexed: 11/27/2022]
Abstract
Two transcriptional regulators, the FadR activator and the FabR repressor, control biosynthesis of unsaturated fatty acids in Escherichia coli. FabR represses expression of the two genes, fabA and fabB, required for unsaturated fatty acid synthesis and has been reported to require the presence of an unsaturated thioester (of either acyl carrier protein or CoA) in order to bind the fabA and fabB promoters in vitro. We report in vivo experiments in which unsaturated fatty acid synthesis was blocked in the absence of exogenous unsaturated fatty acids in a ΔfadR strain and found that the rates of transcription of fabA and fabB were unaffected by the lack of unsaturated thioesters. To examine the discrepancy between our in vivo results and the prior in vitro results we obtained active, natively folded forms of the E. coli and Vibrio cholerae FabRs by use of an in vitro transcription-translation system. We report that FabR bound the intact promoter regions of both fabA and fabB in the absence of unsaturated acyl thioesters, but bound the two promoters differently. Native FabR bound the fabA promoter region provided that the canonical FabR binding site is extended by inclusion of flanking sequences that overlap the neighbouring FadR binding site. In contrast, although binding to the fabB operator also required a flanking sequence, a non-specific sequence could suffice. However, unsaturated thioesters did allow FabR binding to the minimal FabR operator sites of both promoters which otherwise were not bound. Thus unsaturated thioester ligands were not essential for FabR/target DNA interaction, but acted to enhance binding. The gel mobility shift data plus in vivo expression data indicate that despite the remarkably similar arrangements of promoter elements, FadR predominately regulates fabA expression whereas FabR is the dominant regulator of fabB expression. We also report that E. coli fabR expression is not autoregulated. 
Complementation, qRT-PCR and fatty acid composition analyses demonstrated that V. cholerae FabR was a functional repressor of unsaturated fatty acid synthesis. However, in contrast to E. coli, gel mobility shift assays indicated that neither E. coli nor V. cholerae FabRs bound the V. cholerae fabB promoter, although both proteins efficiently bound the V. cholerae fabA promoter. This asymmetry was shown to be due to the lack of a FabR binding site within the V. cholerae fabB promoter region.
Affiliation(s)
- Youjun Feng
- Department of Microbiology, University of Illinois, Urbana, IL 61801, USA
|
44
|
Li G, Liu B, Ma Q, Xu Y. A new framework for identifying cis-regulatory motifs in prokaryotes. Nucleic Acids Res 2010; 39:e42. [PMID: 21149261 PMCID: PMC3074163 DOI: 10.1093/nar/gkq948] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022] Open
Abstract
We present a new algorithm, BOBRO, for prediction of cis-regulatory motifs in a given set of promoter sequences. The algorithm substantially improves the prediction accuracy and extends the scope of applicability of the existing programs based on two key new ideas: (i) we developed a highly effective method for reliably assessing the possibility for each position in a given promoter to be the (approximate) start of a conserved sequence motif; and (ii) we developed a highly reliable way of distinguishing actual motifs from accidental ones based on the concept of ‘motif closure’. These two key ideas are embedded in a classical framework for motif finding through finding cliques in a graph but have made this framework substantially more sensitive as well as more selective in motif finding in a very noisy background. A comparative analysis shows that the performance coefficient was improved from 29% to 41% by our program compared to the best of six other state-of-the-art prediction tools on large-scale data sets of promoters from one genome, and was also consistently improved by substantial margins on another kind of large-scale data set, of orthologous promoters across multiple genomes. The power of BOBRO in dealing with noisy data was further demonstrated through identification of the motifs of the global transcriptional regulators by running it over 2390 promoter sequences of Escherichia coli K12.
Affiliation(s)
- Guojun Li
- Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
|
45
|
Ishihama A. Prokaryotic genome regulation: multifactor promoters, multitarget regulators and hierarchic networks. FEMS Microbiol Rev 2010; 34:628-45. [DOI: 10.1111/j.1574-6976.2010.00227.x] [Citation(s) in RCA: 170] [Impact Index Per Article: 12.1] [Indexed: 11/29/2022] Open
|
46
|
Kaleta C, Göhler A, Schuster S, Jahreis K, Guthke R, Nikolajewa S. Integrative inference of gene-regulatory networks in Escherichia coli using information theoretic concepts and sequence analysis. BMC SYSTEMS BIOLOGY 2010; 4:116. [PMID: 20718955 PMCID: PMC2936295 DOI: 10.1186/1752-0509-4-116] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 01/15/2010] [Accepted: 08/18/2010] [Indexed: 11/24/2022]
Abstract
Background Although Escherichia coli is one of the best studied model organisms, a comprehensive understanding of its gene regulation has not yet been achieved. There exist many approaches to reconstruct regulatory interaction networks from gene expression experiments. Mutual information based approaches are most useful for large-scale network inference. Results We used a three-step approach in which we combined gene regulatory network inference based on directed information (DTI) and sequence analysis. DTI values were calculated on a set of gene expression profiles from 19 time course experiments extracted from the Many Microbes Microarray Database. Focusing on influences between pairs of genes in which one partner encodes a transcription factor (TF) we derived a network which contains 878 TF - gene interactions of which 166 are known according to RegulonDB. Afterward, we selected a subset of 109 interactions that could be confirmed by the presence of a phylogenetically conserved binding site of the respective regulator. By this second step, the fraction of known interactions increased from 19% to 60%. In the last step, we checked the 44 of the 109 interactions not yet included in RegulonDB for functional relationships between the regulator and the target and, thus, obtained ten TF - target gene interactions. Five of them concern the regulator LexA and have already been reported in the literature. The remaining five influences describe regulations by Fis (with two novel targets), PdhR, PhoP, and KdgR. For the validation of our approach, one of them, the regulation of lipoate synthase (LipA) by the pyruvate-sensing pyruvate dehydrogenase repressor (PdhR), was experimentally checked and confirmed. Conclusions We predicted a set of five novel TF - target gene interactions in E. coli. One of them, the regulation of lipA by the transcriptional regulator PdhR, was validated experimentally. 
Furthermore, we developed DTInfer, a new R-package for the inference of gene-regulatory networks from microarrays using directed information.
Affiliation(s)
- Christoph Kaleta
- Systems Biology/Bioinformatics Group, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Beutenbergstr, 11a, D-07745 Jena, Germany.
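The "19% to 60%" enrichment reported in the abstract above can be reproduced from the counts it gives. The short check below uses only those four numbers; the variable names are illustrative, and the step that derives 65 known interactions (109 minus the 44 not in RegulonDB) is an inference from the abstract's wording rather than a figure it states directly.

```python
# Counts taken from the abstract.
total_inferred = 878          # TF - gene interactions in the DTI network
known_in_inferred = 166       # of those, known according to RegulonDB
conserved_site_subset = 109   # interactions with a conserved binding site
not_in_regulondb = 44         # of the 109, not yet in RegulonDB

# Fraction of known interactions before the binding-site filter.
frac_before = known_in_inferred / total_inferred

# Known interactions remaining after the filter (inferred: 109 - 44).
known_in_subset = conserved_site_subset - not_in_regulondb
frac_after = known_in_subset / conserved_site_subset

print(f"before filter: {frac_before:.1%}")
print(f"after filter:  {frac_after:.1%}")
```

Rounded to whole percentages, the two fractions agree with the values quoted in the abstract.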
|
47
|
Zhang S, Li S, Pham PT, Su Z. Simultaneous prediction of transcription factor binding sites in a group of prokaryotic genomes. BMC Bioinformatics 2010; 11:397. [PMID: 20653963 PMCID: PMC2920276 DOI: 10.1186/1471-2105-11-397] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/12/2010] [Accepted: 07/23/2010] [Indexed: 11/24/2022] Open
Abstract
Background Our current understanding of transcription factor binding sites (TFBSs) in sequenced prokaryotic genomes is very limited due to the lack of an accurate and efficient computational method for the prediction of TFBSs at a genome scale. In an attempt to change this situation, we have recently developed a comparative genomics based algorithm called GLECLUBS for de novo genome-wide prediction of TFBSs in a target genome. Although GLECLUBS has achieved rather high prediction accuracy of TFBSs in a target genome, it is still not efficient enough to be applied to all the sequenced prokaryotic genomes. Results Here, we designed a new algorithm based on GLECLUBS called extended GLECLUBS (eGLECLUBS) for simultaneous prediction of TFBSs in a group of related prokaryotic genomes. When tested on a group of γ-proteobacterial genomes including E. coli K12, a group of firmicutes genomes including B. subtilis and a group of cyanobacterial genomes using the same parameter settings, eGLECLUBS predicts more than 82% of known TFBSs in extracted inter-operonic sequences in both E. coli K12 and B. subtilis. Because each genome in a group is equally treated, it is highly likely that similar prediction accuracy has been achieved for each genome in the group. Conclusions We have developed a new algorithm for genome-wide de novo prediction of TFBSs in a group of related prokaryotic genomes. The algorithm has achieved the same level of accuracy and robustness as its predecessor GLECLUBS, but can work on dozens of genomes at the same time.
Affiliation(s)
- Shaoqiang Zhang
- Department of Bioinformatics and Genomics, Center for Bioinformatics Research, the University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
48
|
Harari O, Park SY, Huang H, Groisman EA, Zwir I. Defining the plasticity of transcription factor binding sites by Deconstructing DNA consensus sequences: the PhoP-binding sites among gamma/enterobacteria. PLoS Comput Biol 2010; 6:e1000862. [PMID: 20661307 PMCID: PMC2908699 DOI: 10.1371/journal.pcbi.1000862] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 01/04/2010] [Accepted: 06/15/2010] [Indexed: 01/12/2023] Open
Abstract
Transcriptional regulators recognize specific DNA sequences. Because these sequences are embedded in the background of genomic DNA, it is hard to identify the key cis-regulatory elements that determine disparate patterns of gene expression. The detection of the intra- and inter-species differences among these sequences is crucial for understanding the molecular basis of both differential gene expression and evolution. Here, we address this problem by investigating the target promoters controlled by the DNA-binding PhoP protein, which governs virulence and Mg(2+) homeostasis in several bacterial species. PhoP is particularly interesting; it is highly conserved in different gamma/enterobacteria, regulating not only ancestral genes but also governing the expression of dozens of horizontally acquired genes that differ from species to species. Our approach consists of decomposing the DNA binding site sequences for a given regulator into families of motifs (i.e., termed submotifs) using a machine learning method inspired by the "Divide & Conquer" strategy. By partitioning a motif into sub-patterns, computational advantages for classification were produced, resulting in the discovery of new members of a regulon, and alleviating the problem of distinguishing functional sites in chromatin immunoprecipitation and DNA microarray genome-wide analysis. Moreover, we found that certain partitions were useful in revealing biological properties of binding site sequences, including modular gains and losses of PhoP binding sites through evolutionary turnover events, as well as conservation in distant species. The high conservation of PhoP submotifs within gamma/enterobacteria, as well as the regulatory protein that recognizes them, suggests that the major cause of divergence between related species is not due to the binding sites, as was previously suggested for other regulators. 
Instead, the divergence may be attributed to the fast evolution of orthologous target genes and/or the promoter architectures resulting from the interaction of those binding sites with the RNA polymerase.
Affiliation(s)
- Oscar Harari
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Sun-Yang Park
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Henry Huang
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Eduardo A. Groisman
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Howard Hughes Medical Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Igor Zwir
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Howard Hughes Medical Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
|
49
|
Novichkov PS, Rodionov DA, Stavrovskaya ED, Novichkova ES, Kazakov AE, Gelfand MS, Arkin AP, Mironov AA, Dubchak I. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach. Nucleic Acids Res 2010; 38:W299-307. [PMID: 20542910 PMCID: PMC2896116 DOI: 10.1093/nar/gkq531] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.1] [Indexed: 12/28/2022] Open
Abstract
The RegPredict web server is designed to provide tools for the reconstruction and analysis of microbial regulons using a comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
|
50
|
Hu M, Yu J, Taylor JMG, Chinnaiyan AM, Qin ZS. On the detection and refinement of transcription factor binding sites using ChIP-Seq data. Nucleic Acids Res 2010; 38:2154-67. [PMID: 20056654 PMCID: PMC2853110 DOI: 10.1093/nar/gkp1180] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.6] [Indexed: 01/03/2023] Open
Abstract
Coupling chromatin immunoprecipitation (ChIP) with recently developed massively parallel sequencing technologies has enabled genome-wide detection of protein–DNA interactions with unprecedented sensitivity and specificity. This new technology, ChIP-Seq, presents opportunities for in-depth analysis of transcription regulation. In this study, we explore the value of using ChIP-Seq data to better detect and refine transcription factor binding sites (TFBS). We introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for TFBS motif discovery in ChIP-Seq data. We propose a Bayesian model that incorporates sequencing depth information to aid motif identification. Our model also allows intra-motif dependency to describe more accurately the underlying motif pattern. Our algorithm combines stochastic sampling and deterministic ‘greedy’ search steps into a novel hybrid iterative scheme. This combination accelerates the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that (i) the accuracy of existing TFBS motif patterns can be significantly improved; and (ii) there is significant intra-motif dependency inside all the TFBS motifs we tested; modeling these dependencies further improves the accuracy of these TFBS motif patterns. These findings may offer new biological insights into the mechanisms of transcription factor regulation.
Affiliation(s)
- Ming Hu
- Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
|