1
Abstract
Since the large-scale experimental characterization of protein–protein interactions (PPIs) is not possible for all species, several computational PPI prediction methods have been developed that harness existing data from other species. While PPI network prediction has been extensively used in eukaryotes, microbial network inference has lagged behind. However, bacterial interactomes can be built using the same principles and techniques; in fact, several methods are better suited to bacterial genomes. These predicted networks allow systems-level analyses in species that lack experimental interaction data. This review describes the current network inference and analysis techniques and summarizes the use of computationally-predicted microbial interactomes to date.
2
Parise MTD, Parise D, Kato RB, Pauling JK, Tauch A, Azevedo VADC, Baumbach J. CoryneRegNet 7, the reference database and analysis platform for corynebacterial gene regulatory networks. Sci Data 2020; 7:142. [PMID: 32393779 PMCID: PMC7214426 DOI: 10.1038/s41597-020-0484-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/06/2019] [Accepted: 04/15/2020] [Indexed: 12/21/2022]
Abstract
We present the newest version of CoryneRegNet, the reference database for corynebacterial regulatory interactions, available at www.exbio.wzw.tum.de/coryneregnet/. The exponential growth of next-generation sequencing data in recent years has allowed a better understanding of bacterial molecular mechanisms. Transcriptional regulation is one of the most important mechanisms for bacterial adaptation and survival. These mechanisms may be understood via an organism's network of regulatory interactions. Although the Corynebacterium genus is important in medical, veterinary and biotechnological research, little is known concerning the transcriptional regulation of these bacteria. Here, we unravel transcriptional regulatory networks (TRNs) for 224 corynebacterial strains by utilizing genome-scale transfer of TRNs from four model organisms and assigning statistical significance values to all predicted regulations. As a result, the number of corynebacterial strains with TRNs increased twenty times and the back-end and front-end were reimplemented to support new features as well as future database growth. CoryneRegNet 7 is the largest TRN database for the Corynebacterium genus and aids in elucidating transcriptional mechanisms enabling adaptation, survival and infection.
Affiliation(s)
- Mariana Teixeira Dornelles Parise
  - Institute of Biological Sciences, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Doglas Parise
  - Institute of Biological Sciences, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Rodrigo Bentes Kato
  - Institute of Biological Sciences, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Josch Konstantin Pauling
  - LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Andreas Tauch
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Jan Baumbach
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
3
Lund JB, List M, Baumbach J. Interactive microbial distribution analysis using BioAtlas. Nucleic Acids Res 2019; 45:W509-W513. [PMID: 28460071 PMCID: PMC5570126 DOI: 10.1093/nar/gkx304] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/12/2017] [Accepted: 04/12/2017] [Indexed: 02/01/2023]
Abstract
Massive amounts of 16S rRNA sequencing data have been stored in publicly accessible databases, such as GOLD, SILVA, GreenGenes (GG), and the Ribosomal Database Project (RDP). Many of these sequences are tagged with geo-locations. Nevertheless, researchers currently lack a user-friendly tool to analyze microbial distribution in a location-specific context. BioAtlas is an interactive web application that closes this gap between sequence databases, taxonomy profiling and geo/body-location information. It enables users to browse taxonomically annotated sequences across (i) the world map, (ii) human body maps and (iii) user-defined maps. It further allows for (iv) uploading of own sample data, which can be placed on existing maps to (v) browse the distribution of the associated taxonomies. Finally, BioAtlas enables users to (vi) contribute custom maps (e.g. for plants or animals) and to map taxonomies to pre-defined map locations. In summary, BioAtlas facilitates map-supported browsing of public 16S rRNA sequence data and analyses of user-provided sequences without requiring manual mapping to taxonomies and existing databases. Availability: http://bioatlas.compbio.sdu.dk/
Affiliation(s)
- Jesper Beltoft Lund
  - Department of Mathematics and Computer Science (IMADA), University of Southern Denmark, 5000 Odense, Denmark
- Markus List
  - Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany
- Jan Baumbach
  - Department of Mathematics and Computer Science (IMADA), University of Southern Denmark, 5000 Odense, Denmark
  - Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany
4
Hassan SS, Jamal SB, Radusky LG, Tiwari S, Ullah A, Ali J, Behramand, de Carvalho PVSD, Shams R, Khan S, Figueiredo HCP, Barh D, Ghosh P, Silva A, Baumbach J, Röttger R, Turjanski AG, Azevedo VAC. The Druggable Pocketome of Corynebacterium diphtheriae: A New Approach for in silico Putative Druggable Targets. Front Genet 2018; 9:44. [PMID: 29487617 PMCID: PMC5816920 DOI: 10.3389/fgene.2018.00044] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/23/2017] [Accepted: 01/30/2018] [Indexed: 01/20/2023]
Abstract
Diphtheria, an acute and highly infectious disease previously regarded as endemic in nature but vaccine-preventable, is caused by Corynebacterium diphtheriae (Cd). In this work, we applied an in silico approach to the 13 complete genome sequences of C. diphtheriae, followed by a computational assessment of structural information on the binding sites, to characterize the “pocketome druggability.” To this end, we first computed the “modelome” (3D structures of a complete genome) of a randomly selected reference strain, Cd NCTC13129, which had 13,763 open reading frames (ORFs) and resulted in 1,253 (∼9%) structure models. The amino acid sequences of these modeled structures were compared with the remaining 12 genomes and, consequently, 438 conserved protein sequences were obtained. The RCSB-PDB database was consulted to check the template structures for these conserved proteins and, as a result, 401 adequate 3D models were obtained. We subsequently predicted the protein pockets for the obtained set of models and kept only the conserved pockets that had highly druggable (HD) values (137 across all strains). Next, an off-target host homology analysis was performed against the human proteome using the NCBI database. Furthermore, a gene essentiality analysis was carried out, which gave a final set of 10 conserved targets possessing highly druggable protein pockets. To check the robustness of the target identification pipeline used in this work, we cross-checked the final target list against another in-house target identification approach for C. diphtheriae, thereby obtaining three common targets: hisE (phosphoribosyl-ATP pyrophosphatase), glpX (fructose 1,6-bisphosphatase II), and rpsH (30S ribosomal protein S8). Our results suggest that the in silico approach used could potentially aid experimental polypharmacological target determination in C. diphtheriae and other pathogens, and thereby might complement existing and new drug-discovery pipelines.
Affiliation(s)
- Syed S Hassan
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Syed B Jamal
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Leandro G Radusky
  - Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Sandeep Tiwari
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Asad Ullah
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Javed Ali
  - Department of Chemistry, Kohat University of Science and Technology, Kohat, Pakistan
- Behramand
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Paulo V S D de Carvalho
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Rida Shams
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Sabir Khan
  - Department of Analytical Chemistry, Institute of Chemistry, São Paulo State University, São Paulo, Brazil
- Henrique C P Figueiredo
  - AQUACEN, National Reference Laboratory for Aquatic Animal Diseases, Ministry of Fisheries and Aquaculture, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Debmalya Barh
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology, Purba Medinipur, India
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Artur Silva
  - Institute of Biological Sciences, Federal University of Pará, Belém, Brazil
- Jan Baumbach
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Adrián G Turjanski
  - Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
  - INQUIMAE/UBA-CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Vasco A C Azevedo
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
5
Improving Gene Regulatory Network Inference by Incorporating Rates of Transcriptional Changes. Sci Rep 2017; 7:17244. [PMID: 29222512 PMCID: PMC5722905 DOI: 10.1038/s41598-017-17143-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/31/2017] [Accepted: 11/22/2017] [Indexed: 11/18/2022]
Abstract
Organisms respond to changes in their environment through transcriptional regulatory networks (TRNs). The regulatory hierarchy of these networks can be inferred from expression data. Computational approaches to identify TRNs can be applied in any species where quality RNA can be acquired. However, ChIP-Seq and similar validation methods are challenging to employ in non-model species. Improving the accuracy of computational inference methods can significantly reduce the cost and time of subsequent validation experiments. We have developed ExRANGES, an approach that improves the ability to computationally infer TRNs from time series expression data. ExRANGES utilizes both the rate of change in expression and the absolute expression level to identify TRN connections. We evaluated ExRANGES in five data sets from different model systems. ExRANGES improved the identification of experimentally validated transcription factor targets for all species tested, even in unevenly spaced and sparse data sets. This improved ability to predict known regulator-target relationships enhances the utility of network inference approaches in non-model species where experimental validation is challenging. We integrated ExRANGES with two different network construction approaches, and it has been implemented as an R package available at http://github.com/DohertyLab/ExRANGES. To install the package, type: devtools::install_github("DohertyLab/ExRANGES").
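As a rough illustration of using the rate of change alongside the expression level, the sketch below computes slopes between consecutive samples of an unevenly spaced time series and pairs each slope with the expression level at the end of the interval. It is not the ExRANGES implementation, since the abstract does not describe the algorithm in detail; the function name `rates_of_change` and all example values are hypothetical.

```python
def rates_of_change(times, values):
    """Slope between consecutive samples; handles unevenly spaced time points."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    return [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]

# Hypothetical expression values for one gene, sampled at uneven intervals (hours).
times = [0, 2, 4, 8, 16]
expression = [10.0, 14.0, 14.0, 6.0, 6.0]

slopes = rates_of_change(times, expression)

# Pair each interval's slope with the expression level at the end of that
# interval, so that a downstream inference step can use both signals.
features = list(zip(expression[1:], slopes))
print(features)
```

Dividing by the actual time difference, rather than assuming a fixed sampling step, is what keeps the slopes comparable when samples are unevenly spaced.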
6
Liu B, Yang J, Li Y, McDermaid A, Ma Q. An algorithmic perspective of de novo cis-regulatory motif finding based on ChIP-seq data. Brief Bioinform 2017; 19:1069-1081. [DOI: 10.1093/bib/bbx026] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 12/20/2016] [Indexed: 01/06/2023]
Affiliation(s)
- Bingqiang Liu
  - School of Mathematics, Shandong University, Jinan Shandong, P. R. China
- Jinyu Yang
  - Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, USA
- Yang Li
  - School of Mathematics, Shandong University, Jinan Shandong, P. R. China
- Adam McDermaid
  - Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, USA
- Qin Ma
  - Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA
7
Freyre-González JA, Tauch A. Functional architecture and global properties of the Corynebacterium glutamicum regulatory network: Novel insights from a dataset with a high genomic coverage. J Biotechnol 2016; 257:199-210. [PMID: 27829123 DOI: 10.1016/j.jbiotec.2016.10.025] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/30/2016] [Revised: 10/25/2016] [Accepted: 10/26/2016] [Indexed: 10/20/2022]
Abstract
Corynebacterium glutamicum is a Gram-positive, anaerobic, rod-shaped soil bacterium able to grow on a diversity of carbon sources such as sugars and organic acids. It is a biotechnologically relevant organism because of its highly efficient ability to biosynthesize amino acids, such as L-glutamic acid and L-lysine. Here, we reconstructed the most complete C. glutamicum regulatory network to date and comprehensively analyzed its global organizational properties, systems-level features and functional architecture. Our analyses show the power of Abasy Atlas for studying the functional organization of regulatory networks. We created two models of the C. glutamicum regulatory network: all-evidences (containing both weakly and strongly supported interactions, genomic coverage=73%) and strongly-supported (accounting only for strongly supported evidence, genomic coverage=71%). Using state-of-the-art methodologies, we prove that power-law behaviors truly govern the connectivity and clustering coefficient distributions. We found a previously unreported circuit motif that we named the complex feed-forward motif. We highlight the importance of feedback loops for the functional architecture, regardless of whether they are statistically over-represented in the network. We show that the previously reported top-down approach is inadequate to infer the hierarchy governing a regulatory network because feedback bridges different hierarchical layers, and the top-down approach disregards the presence of intermodular genes shaping the integration layer. Altogether, our findings further support a diamond-shaped, three-layered hierarchy exhibiting some feedback between the processing and coordination layers, which is shaped by four classes of systems-level elements: global regulators, locally autonomous modules, basal machinery and intermodular genes.
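To make the power-law claim concrete, the following minimal sketch counts regulator out-degrees in a toy network and estimates a power-law exponent with the widely used approximate maximum-likelihood formula for discrete data, alpha ≈ 1 + n / Σ ln(x_i / (x_min − 0.5)). This is an assumption about one common way to do such an analysis, not the authors' method; the approximation is known to be rough for small `xmin`, and the gene names and edges are hypothetical.

```python
import math
from collections import Counter

def out_degrees(edges):
    """Count how many targets each regulator has in (regulator, target) pairs."""
    return Counter(regulator for regulator, _ in edges)

def powerlaw_alpha(values, xmin=1):
    """Approximate maximum-likelihood exponent for discrete values >= xmin."""
    tail = [x for x in values if x >= xmin]
    if not tail:
        raise ValueError("no values at or above xmin")
    return 1 + len(tail) / sum(math.log(x / (xmin - 0.5)) for x in tail)

# Hypothetical toy network: one hub regulator and several small regulons.
edges = [("tfA", "g1"), ("tfA", "g2"), ("tfA", "g3"), ("tfA", "g4"),
         ("tfB", "g1"), ("tfB", "g5"), ("tfC", "g2"), ("tfD", "g6")]

deg = out_degrees(edges)
alpha = powerlaw_alpha(deg.values())
print(dict(deg), alpha)
```

An exponent estimate alone does not establish power-law behavior; a real analysis would also choose `xmin` from the data and test goodness of fit against alternative distributions.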
Affiliation(s)
- Julio A Freyre-González
  - Regulatory Systems Biology Research Group, Evolutionary Genomics Program, Center for Genomics Sciences, Universidad Nacional Autónoma de México, Av. Universidad s/n, Col. Chamilpa, 62210, Cuernavaca, Morelos, Mexico
- Andreas Tauch
  - Centrum für Biotechnologie (CeBiTec), Universität Bielefeld, Universitätsstraße 27, Bielefeld, 33615, Germany
8
Folador EL, de Carvalho PVSD, Silva WM, Ferreira RS, Silva A, Gromiha M, Ghosh P, Barh D, Azevedo V, Röttger R. In silico identification of essential proteins in Corynebacterium pseudotuberculosis based on protein-protein interaction networks. BMC Syst Biol 2016; 10:103. [PMID: 27814699 PMCID: PMC5097352 DOI: 10.1186/s12918-016-0346-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/07/2016] [Accepted: 10/18/2016] [Indexed: 12/27/2022]
Abstract
Background: Corynebacterium pseudotuberculosis (Cp) is a gram-positive bacterium that is classified into equi and ovis serovars. The serovar ovis is the etiological agent of caseous lymphadenitis, a chronic infection affecting sheep and goats, causing economic losses due to carcass condemnation and decreased production of meat, wool, and milk. Current diagnosis and treatment protocols are not fully effective; thus, further research into Cp pathogenesis is required.
Results: Here, we mapped known protein-protein interactions (PPI) from various species to nine Cp strains to reconstruct parts of the potential Cp interactome and to identify potentially essential proteins serving as putative drug targets. On average, we predict 16,669 interactions for each of the nine strains (with 15,495 interactions shared among all strains). An in silico sanity check suggests that the potential networks were not formed by spurious interactions but have a strong biological bias. With the inferred Cp networks we identify 181 essential proteins, among which 41 are non-host homologous.
Conclusions: The list of candidate interactions of the Cp strains lays the basis for developing novel hypotheses and designing corresponding wet-lab studies. The non-host homologous essential proteins are attractive targets for therapeutic and diagnostic purposes. They allow for searching for small-molecule inhibitors of binding interactions, enabling modern drug discovery. Overall, the predicted Cp PPI networks form a valuable and versatile tool for researchers interested in Corynebacterium pseudotuberculosis.
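The mapping of known interactions onto new strains can be sketched as an interolog transfer: an interaction is predicted in the target strain when both partners of a known interaction have orthologs there. The code below is a minimal sketch under that assumption and is not the pipeline used in the paper; the protein identifiers are hypothetical, and it assumes orthologous proteins carry the same label in every strain so that networks can be intersected directly.

```python
from itertools import product

def map_interologs(known_ppis, orthologs):
    """Transfer interactions from a source species to a target strain.

    known_ppis: iterable of (protein_a, protein_b) pairs in the source species.
    orthologs:  dict mapping a source protein to a set of target proteins.
    Returns a set of frozensets, one per predicted (undirected) interaction.
    """
    predicted = set()
    for a, b in known_ppis:
        # Transfer only when both partners have at least one ortholog.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            predicted.add(frozenset((ta, tb)))
    return predicted

def shared_interactions(networks):
    """Interactions predicted in every strain."""
    networks = list(networks)
    return set.intersection(*networks) if networks else set()

# Hypothetical source interactions and per-strain ortholog tables.
known = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
strain_a = {"P1": {"cpX1"}, "P2": {"cpX2"}, "P3": {"cpX3"}}
strain_b = {"P1": {"cpX1"}, "P2": {"cpX2"}}

net_a = map_interologs(known, strain_a)
net_b = map_interologs(known, strain_b)
core = shared_interactions([net_a, net_b])
print(len(net_a), len(net_b), len(core))
```

Storing each interaction as a `frozenset` makes the pair order irrelevant, which suits undirected PPI edges and lets the per-strain sets be intersected to obtain the shared core.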
Affiliation(s)
- Edson Luiz Folador
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
  - Institute of Biological Sciences, Federal University of Para, Belém, PA, Brazil
  - Biotechnology Center (CBiotec), Federal University of Paraiba (UFPB), João Pessoa, Brazil
- Paulo Vinícius Sanches Daltro de Carvalho
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Wanderson Marques Silva
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Rafaela Salgado Ferreira
  - Department of Biochemistry and Immunology, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Artur Silva
  - Institute of Biological Sciences, Federal University of Para, Belém, PA, Brazil
- Michael Gromiha
  - Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Tamilnadu, India
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Debmalya Barh
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal, India
- Vasco Azevedo
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
9
Kılıç S, Erill I. Assessment of transfer methods for comparative genomics of regulatory networks in bacteria. BMC Bioinformatics 2016; 17 Suppl 8:277. [PMID: 27586594 PMCID: PMC5009822 DOI: 10.1186/s12859-016-1113-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/22/2022]
Abstract
Background: Comparative genomics can leverage the vast amount of available genomic sequences to reconstruct and analyze transcriptional regulatory networks in Bacteria, but the efficacy of this approach hinges on the ability to transfer regulatory network information from reference species to the genomes under analysis. Several methods have been proposed to transfer regulatory information between bacterial species, but the paucity and distributed nature of experimental information on bacterial transcriptional networks have prevented their systematic evaluation.
Results: We report the compilation of a large catalog of transcription factor-binding sites across Bacteria and its use to systematically benchmark proposed transfer methods across pairs of bacterial species. We evaluate motif- and accuracy-based metrics to assess the results of regulatory network transfer and we identify the precision-recall area-under-the-curve as the best metric for this purpose due to the large class-imbalanced nature of the problem. Methods assuming conservation of the transcription factor-binding motif (motif-based) are shown to substantially outperform those assuming conservation of regulon composition (network-based), even though their efficiency can decrease sharply with increasing phylogenetic distance. Variations of the basic motif-based transfer method do not yield significant improvements in transfer accuracy. Our results indicate that detection of a large enough number of regulated orthologs is critical for network-based transfer methods, but that relaxing orthology requirements does not improve results. Using the transcriptional regulators LexA and Fur as case examples, we also show how DNA-binding domain sequence similarity can yield confounding results as an indicator of transfer efficiency for motif-based methods.
Conclusions: Counter to standard practice, our evaluation of metrics to assess the efficiency of methods for regulatory network information transfer reveals that the area under precision-recall (PR) curves is a more precise and informative metric than that of receiver-operating-characteristic (ROC) curves, confirming similar findings in other class-imbalanced settings. Our systematic assessment of transfer methods reveals that simple approaches to both motif- and network-based transfer of regulatory information provide equal or better results than more elaborate methods. We also show that there are no effective predictors of transfer efficacy, substantiating the long-standing practice of manual curation in comparative genomics analyses.
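The difference between the two metrics under class imbalance can be seen in a small worked example. The sketch below computes the ROC area through its rank interpretation and uses average precision as a simple estimate of the area under the PR curve; the ranking of 100 candidate sites with only 2 true sites is hypothetical, and the functions are illustrative rather than the evaluation code used in the study.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical ranking: 100 candidate sites, 2 true sites at ranks 1 and 11.
labels = [1] + [0] * 9 + [1] + [0] * 89
scores = [100 - i for i in range(100)]  # strictly decreasing, no ties

print(round(roc_auc(labels, scores), 3))            # 0.954
print(round(average_precision(labels, scores), 3))  # 0.591
```

The ROC area looks excellent because the many low-scoring negatives dominate the comparison, while average precision exposes that one of the two true sites is buried under nine false positives.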
Affiliation(s)
- Sefa Kılıç
  - Department of Biological Sciences, University of Maryland Baltimore County (UMBC), Baltimore, MD, 21250, USA
- Ivan Erill
  - Department of Biological Sciences, University of Maryland Baltimore County (UMBC), Baltimore, MD, 21250, USA
10
Abreu VAC, Almeida S, Tiwari S, Hassan SS, Mariano D, Silva A, Baumbach J, Azevedo V, Röttger R. CMRegNet-An interspecies reference database for corynebacterial and mycobacterial regulatory networks. BMC Genomics 2015; 16:452. [PMID: 26062809 PMCID: PMC4464113 DOI: 10.1186/s12864-015-1631-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/15/2014] [Accepted: 05/14/2015] [Indexed: 11/10/2022]
Abstract
Background: Organisms utilize a multitude of mechanisms for responding to changing environmental conditions, maintaining their functional homeostasis and overcoming stress situations. One of the most important mechanisms is transcriptional gene regulation. In-depth study of the transcriptional gene regulatory network can lead to various practical applications, creating a greater understanding of how organisms control their cellular behavior.
Description: In this work, we present a new database, CMRegNet, for the gene regulatory networks of Corynebacterium glutamicum ATCC 13032 and Mycobacterium tuberculosis H37Rv. We furthermore transferred the known networks of these model organisms to 18 other non-model but phylogenetically close species (target organisms) of the CMNR group. In contrast to other network transfers, we utilized two model organisms for the first time, resulting in a more diverse and complete network for the target organisms.
Conclusion: CMRegNet provides easy access to a total of 3,103 known regulations in C. glutamicum ATCC 13032 and M. tuberculosis H37Rv and to 38,940 evolutionarily conserved interactions for 18 non-model species of the CMNR group. This makes CMRegNet, to date, the most comprehensive database of regulatory interactions of CMNR bacteria. The content of CMRegNet is publicly available online via a web interface found at http://lgcm.icb.ufmg.br/cmregnet.
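The benefit of transferring from two model organisms can be sketched as follows: each regulation is carried over when both the regulator and the regulated gene have an ortholog in the target organism, and the resulting edges are pooled while recording which model supports each one. This is a minimal sketch assuming one-to-one ortholog tables; it is not the CMRegNet pipeline, and the organism labels and gene identifiers are hypothetical.

```python
from collections import defaultdict

def transfer_regulations(sources):
    """Transfer (regulator, target) edges from several model organisms.

    sources: dict mapping a model-organism name to a tuple (edges, orthologs),
             where edges is a set of (regulator, target) gene pairs and
             orthologs maps model genes to target-organism genes.
    Returns a dict: transferred edge -> set of model organisms supporting it.
    """
    support = defaultdict(set)
    for model, (edges, orthologs) in sources.items():
        for regulator, target in edges:
            # Both the regulator and the regulated gene must be conserved.
            if regulator in orthologs and target in orthologs:
                edge = (orthologs[regulator], orthologs[target])
                support[edge].add(model)
    return dict(support)

# Hypothetical input: two model organisms and their ortholog tables.
sources = {
    "model_A": (
        {("a_tf1", "a_g1"), ("a_tf1", "a_g2")},
        {"a_tf1": "t_tf1", "a_g1": "t_g1"},  # a_g2 has no ortholog
    ),
    "model_B": (
        {("b_tf1", "b_g1"), ("b_tf2", "b_g3")},
        {"b_tf1": "t_tf1", "b_g1": "t_g1", "b_tf2": "t_tf2", "b_g3": "t_g3"},
    ),
}

network = transfer_regulations(sources)
for edge, models in sorted(network.items()):
    print(edge, sorted(models))
```

Keeping the supporting models per edge shows both effects of a second source: edges found only through one model extend the network, and edges found through both gain independent support.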
Affiliation(s)
- Vinicius A C Abreu
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil
- Sintia Almeida
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil
- Sandeep Tiwari
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil
- Syed Shah Hassan
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil
- Diego Mariano
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil
- Artur Silva
  - Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Jan Baumbach
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Vasco Azevedo
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
  - Computational Systems Biology, Max Planck Institute for Informatics, Campus E 2.1, 66123, Saarbrucken, Germany
11
Ferreira AJS, Siam R, Setubal JC, Moustafa A, Sayed A, Chambergo FS, Dawe AS, Ghazy MA, Sharaf H, Ouf A, Alam I, Abdel-Haleem AM, Lehvaslaiho H, Ramadan E, Antunes A, Stingl U, Archer JAC, Jankovic BR, Sogin M, Bajic VB, El-Dorry H. Core microbial functional activities in ocean environments revealed by global metagenomic profiling analyses. PLoS One 2014; 9:e97338. [PMID: 24921648 PMCID: PMC4055538 DOI: 10.1371/journal.pone.0097338] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 01/23/2014] [Accepted: 04/17/2014] [Indexed: 11/19/2022]
Abstract
Metagenomics-based functional profiling analysis is an effective means of gaining deeper insight into the composition of marine microbial populations and developing a better understanding of the interplay between the functional genome content of microbial communities and abiotic factors. Here we present a comprehensive analysis of 24 datasets covering surface and depth-related environments at 11 sites around the world's oceans. The complete dataset comprises approximately 12 million sequences, totaling 5,358 Mb. Based on profiling patterns of Clusters of Orthologous Groups (COGs) of proteins, a core set of reference photic and aphotic depth-related COGs, and a collection of COGs that are associated with extreme oxygen limitation, were defined. Their inferred functions were utilized as indicators to characterize the distribution of light- and oxygen-related biological activities in marine environments. The results reveal that, while light level in the water column is a major determinant of phenotypic adaptation in marine microorganisms, oxygen concentration in the aphotic zone has a significant impact only in extremely hypoxic waters. Phylogenetic profiling of the reference photic/aphotic gene sets revealed a greater variety of source organisms in the aphotic zone, although the majority of individual photic and aphotic depth-related COGs are assigned to the same taxa across the different sites. This increase in phylogenetic and functional diversity of the core aphotic-related COGs most probably reflects selection for the utilization of a broad range of alternate energy sources in the absence of light.
Collapse
Affiliation(s)
- Ari J. S. Ferreira
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Rania Siam
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- João C. Setubal
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, Brazil
- Ahmed Moustafa
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Ahmed Sayed
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Felipe S. Chambergo
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, São Paulo, Brazil
- Adam S. Dawe
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Mohamed A. Ghazy
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Hazem Sharaf
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Amged Ouf
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Intikhab Alam
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Alyaa M. Abdel-Haleem
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- Heikki Lehvaslaiho
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Eman Ramadan
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
- André Antunes
- Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, University of Minho, Portugal
- Ulrich Stingl
- Red Sea Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- John A. C. Archer
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Boris R. Jankovic
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Mitchell Sogin
- Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, Massachusetts, United States of America
- Vladimir B. Bajic
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Hamza El-Dorry
- Department of Biology and the Science and Technology Research Center, School of Sciences and Engineering, The American University in Cairo, Cairo, New Cairo, Egypt
|
12
|
Saha S, Lindeberg M. Bound to Succeed: transcription factor binding-site prediction and its contribution to understanding virulence and environmental adaptation in bacterial plant pathogens. Molecular Plant-Microbe Interactions 2013; 26:1123-1130. [PMID: 23802990 DOI: 10.1094/mpmi-04-13-0090-cr] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/02/2023]
Abstract
Bacterial plant pathogens rely on a battalion of transcription factors to fine-tune their response to changing environmental conditions and to marshal the genetic resources required for successful pathogenesis. Prediction of transcription factor binding sites (TFBS) represents an important tool for elucidating regulatory networks and has been conducted in multiple genera of plant-pathogenic bacteria for the purpose of better understanding mechanisms of survival and pathogenesis. The major categories of TFBS that have been characterized are reviewed here, with emphasis on in silico methods used for site identification and challenges therein, their applicability to different types of sequence datasets, and insights into mechanisms of virulence and survival that have been gained through binding-site mapping. An improved strategy for establishing E-value cutoffs when using existing models to screen uncharacterized genomes is also discussed.
|
13
|
Ma Q, Liu B, Zhou C, Yin Y, Li G, Xu Y. An integrated toolkit for accurate prediction and analysis of cis-regulatory motifs at a genome scale. Bioinformatics 2013; 29:2261-8. [PMID: 23846744 DOI: 10.1093/bioinformatics/btt397] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022]
Abstract
MOTIVATION: We present an integrated toolkit, BoBro2.0, for prediction and analysis of cis-regulatory motifs. This toolkit can (i) reliably identify statistically significant cis-regulatory motifs at a genome scale; (ii) accurately scan for all motif instances of a query motif in specified genomic regions using a novel method for P-value estimation; (iii) provide highly reliable comparisons and clustering of identified motifs, taking into consideration the weak signals from the flanking regions of the motifs; and (iv) analyze co-occurring motifs in the regulatory regions. RESULTS: We have carried out systematic comparisons between motif predictions using BoBro2.0 and the MEME package. The comparison results on the Escherichia coli K12 genome and the human genome show that BoBro2.0 can identify statistically significant motifs at a genome scale more efficiently, identify motif instances more accurately and obtain more reliable motif clusters than MEME. In addition, BoBro2.0 provides correlational analyses among the identified motifs to facilitate the inference of joint regulation relationships of transcription factors. AVAILABILITY: The source code of the program is freely available for noncommercial uses at http://code.google.com/p/bobro/. CONTACT: xyn@bmb.uga.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
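To make the idea of scanning a genomic region for instances of a query motif concrete, the following is a minimal generic sketch of position weight matrix (PWM) log-odds scanning. It is not BoBro2.0's algorithm and does not reproduce its P-value estimation; the function names, the toy binding sites, the pseudocount, and the uniform background are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix from aligned, equal-length binding sites.

    Assumes a uniform background frequency (hypothetical simplification).
    """
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for i in range(length):
        column = [site[i] for site in sites]
        matrix.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def scan(sequence, matrix):
    """Return (start, score) for every window of the motif width."""
    width = len(matrix)
    return [
        (start, sum(matrix[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]

# Toy example: three made-up sites and a short made-up sequence.
sites = ["TTGACA", "TTGACA", "TTGATA"]
matrix = build_log_odds(sites)
hits = scan("GGCTTGACAGG", matrix)
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

In practice a score threshold or significance estimate decides which windows count as motif instances; the sketch only ranks windows by raw score.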
Affiliation(s)
- Qin Ma
- Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
|
14
|
Signal correlations in ecological niches can shape the organization and evolution of bacterial gene regulatory networks. Adv Microb Physiol 2013; 61:1-36. [PMID: 23046950 DOI: 10.1016/b978-0-12-394423-8.00001-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Transcriptional regulation plays a significant role in the biological response of bacteria to changing environmental conditions. Therefore, mapping transcriptional regulatory networks is an important step not only in understanding how bacteria sense and interpret their environment but also in identifying the functions involved in biological responses to specific conditions. Recent experimental and computational developments have facilitated the characterization of regulatory networks on a genome-wide scale in model organisms. In addition, the multiplication of complete genome sequences has encouraged comparative analyses to detect conserved regulatory elements and infer regulatory networks in other, less well-studied organisms. However, transcriptional regulation appears to evolve rapidly, thus creating challenges for the transfer of knowledge to nonmodel organisms. Nevertheless, the mechanisms and constraints driving the evolution of regulatory networks have been the subjects of numerous analyses, and several models have been proposed. Overall, the contributions of mutations, recombination, and horizontal gene transfer are complex. Finally, the rapid evolution of regulatory networks plays a significant role in the remarkable capacity of bacteria to adapt to new or changing environments. Conversely, the characteristics of environmental niches determine the selective pressures and can shape the structure of regulatory networks accordingly.
|
15
|
Sun Y, Zeng F, Zhang W, Qiao J. Structure-based phylogeny of polyene macrolide antibiotic glycosyltransferases. Gene 2012; 499:288-96. [DOI: 10.1016/j.gene.2012.02.050] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/04/2011] [Revised: 02/23/2012] [Accepted: 02/27/2012] [Indexed: 11/28/2022]
|
16
|
Alcaraz N, Friedrich T, Kötzing T, Krohmer A, Müller J, Pauling J, Baumbach J. Efficient key pathway mining: combining networks and OMICS data. Integr Biol (Camb) 2012; 4:756-64. [PMID: 22353882 DOI: 10.1039/c2ib00133k] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Indexed: 02/02/2023]
Abstract
Systems biology has emerged over the last decade. Driven by advances in sophisticated measurement technology, the research community has generated huge molecular biology data sets. These comprise rather static data on the interplay of biological entities, for instance protein-protein interaction network data, as well as quite dynamic data collected for studying the behavior of individual cells or tissues in response to changing environmental conditions, such as DNA microarrays or RNA sequencing. Here we bring the two different data types together in order to gain higher-level knowledge. We introduce a significantly improved version of the KeyPathwayMiner software framework. Given a biological network modelled as a graph and a set of expression studies, KeyPathwayMiner efficiently finds and visualizes connected sub-networks where most components are expressed in most cases. It finds all maximal connected sub-networks in which all but at most k nodes are expressed in all but at most l experimental studies. We demonstrate the power of the new approach by comparing it to similar approaches with gene expression data previously used to study Huntington's disease. In addition, we demonstrate KeyPathwayMiner's flexibility and applicability to non-array data by analyzing genome-scale DNA methylation profiles from colorectal cancer patients. KeyPathwayMiner release 2 is available as a Cytoscape plugin and online at http://keypathwayminer.mpi-inf.mpg.de.
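The k and l exception parameters can be illustrated with a small sketch. This is a simple greedy heuristic written for illustration only, not KeyPathwayMiner's actual algorithm, and it does not enumerate all maximal sub-networks. The function names, the toy graph, and the binary expression calls are assumptions: a node counts as an exception when it is not expressed in more than l studies, and the grown sub-network may contain at most k such nodes.

```python
def exception_nodes(expression, l):
    """Nodes that are not expressed (0) in more than l studies."""
    return {node for node, calls in expression.items() if calls.count(0) > l}

def greedy_subnetwork(graph, expression, seed, k, l):
    """Grow a connected sub-network from seed with at most k exception nodes.

    Greedy heuristic (assumption): add all non-exception neighbours first,
    and spend the exception budget only when no free neighbour remains.
    """
    exceptions = exception_nodes(expression, l)
    budget = k - (1 if seed in exceptions else 0)
    if budget < 0:
        return set()
    selected = {seed}
    changed = True
    while changed:
        changed = False
        frontier = sorted({nb for n in selected for nb in graph[n]} - selected)
        free = [n for n in frontier if n not in exceptions]
        if free:
            selected.update(free)
            changed = True
        elif frontier and budget > 0:
            selected.add(frontier[0])  # spend one exception
            budget -= 1
            changed = True
    return selected

# Toy undirected path graph A-B-C-D-E with three hypothetical studies.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
expression = {
    "A": [1, 1, 1],
    "B": [1, 1, 0],
    "C": [0, 0, 1],
    "D": [1, 1, 1],
    "E": [0, 0, 0],
}
with_one_exception = greedy_subnetwork(graph, expression, "A", k=1, l=1)
without_exceptions = greedy_subnetwork(graph, expression, "A", k=0, l=1)
```

With l = 1, nodes C and E are exceptions; allowing k = 1 lets the sub-network bridge through C to reach D, while k = 0 stops at B.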
Affiliation(s)
- Nicolas Alcaraz
- Max Planck Institute for Informatics-Computational Systems Biology, Saarbrucken 66123, Germany
|
17
|
Pauling J, Röttger R, Neuner A, Salgado H, Collado-Vides J, Kalaghatgi P, Azevedo V, Tauch A, Pühler A, Baumbach J. On the trail of EHEC/EAEC--unraveling the gene regulatory networks of human pathogenic Escherichia coli bacteria. Integr Biol (Camb) 2012; 4:728-33. [PMID: 22318347 DOI: 10.1039/c2ib00132b] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
Pathogenic Escherichia coli, such as Enterohemorrhagic E. coli (EHEC) and Enteroaggregative E. coli (EAEC), are globally widespread bacteria. Some may cause hemolytic uremic syndrome (HUS). Varying strains cause epidemics all over the world. Recently, we observed an epidemic outbreak of a multi-resistant EHEC strain in Western Europe, mainly in Germany. The Robert Koch Institute reports >4300 infections and >50 deaths (July, 2011). Farmers lost several million EUR because the origin of infection was unclear. Here, we contribute to the currently ongoing research with a computer-aided study of EHEC transcriptional regulatory interactions, a network of genetic switches that control, for instance, pathogenicity, survival and reproduction of bacterial cells. Our strategy is to utilize knowledge of gene regulatory networks from the evolutionary relative E. coli K-12, a harmless strain mainly used for wet lab studies. In order to provide high-potential candidates for human pathogenic E. coli bacteria, such as EHEC, we developed the integrated online database and analysis platform EhecRegNet. We utilize 3489 known regulations from E. coli K-12 for predictions of yet unknown gene regulatory interactions in 16 human pathogens. For these strains we predict 40,913 regulatory interactions. EhecRegNet is based on the identification of evolutionarily conserved regulatory sites within the DNA of the harmless E. coli K-12 and the pathogens. Identifying and characterizing EHEC's genetic control mechanism network on a large scale will allow for a better understanding of its survival and infection strategies. This will support the development of urgently needed new treatments. EhecRegNet is online via http://www.ehecregnet.de.
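The general idea of transferring known regulations from a model strain to a related genome can be sketched as follows. This is an illustrative simplification, not the EhecRegNet pipeline: the function name, the ortholog map, the set of conserved binding-site pairs, and all gene identifiers are hypothetical. A regulation is transferred only when both the regulator and the target gene have an ortholog in the target genome and a conserved binding site has been detected for the orthologous pair.

```python
def transfer_regulations(regulations, orthologs, conserved_sites):
    """Predict regulations in a target genome from a model organism.

    regulations:     list of (regulator, target) pairs known in the model
    orthologs:       dict mapping model gene -> target-genome gene
    conserved_sites: set of (regulator, target) pairs in the target genome
                     for which a conserved binding site was found
    """
    predicted = []
    for regulator, target in regulations:
        regulator_ortholog = orthologs.get(regulator)
        target_ortholog = orthologs.get(target)
        if regulator_ortholog is None or target_ortholog is None:
            continue  # no ortholog, so nothing to transfer
        if (regulator_ortholog, target_ortholog) in conserved_sites:
            predicted.append((regulator_ortholog, target_ortholog))
    return predicted

# Toy identifiers, invented for this example.
known = [("m_tf1", "m_g1"), ("m_tf1", "m_g2"), ("m_tf2", "m_g3")]
ortholog_map = {"m_tf1": "p_tf1", "m_g1": "p_g1", "m_g2": "p_g2", "m_g3": "p_g3"}
sites = {("p_tf1", "p_g1")}
predictions = transfer_regulations(known, ortholog_map, sites)
```

Here only the first regulation is transferred: the second lacks a conserved site, and the third lacks an ortholog for its regulator.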
Affiliation(s)
- Josch Pauling
- Computational Systems Biology, Max Planck Institute for Informatics, Germany
|
18
|
Pauling J, Röttger R, Tauch A, Azevedo V, Baumbach J. CoryneRegNet 6.0--Updated database content, new analysis methods and novel features focusing on community demands. Nucleic Acids Res 2011; 40:D610-4. [PMID: 22080556 PMCID: PMC3245100 DOI: 10.1093/nar/gkr883] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 12/03/2022] Open
Abstract
Post-genomic analysis techniques such as next-generation sequencing have produced vast amounts of data about microorganisms, including genetic sequences, their functional annotations and gene regulatory interactions. The latter are genetic mechanisms that control a cell's characteristics, for instance, pathogenicity as well as survival and reproduction strategies. CoryneRegNet is the reference database and analysis platform for corynebacterial gene regulatory networks. In this article we introduce the updated version 6.0 of CoryneRegNet and describe the updated database content, which includes 6352 corynebacterial regulatory interactions compared with 4928 interactions in release 5.0 and 3235 regulations in release 4.0. We also demonstrate how we support the community by integrating analysis and visualization features for transiently imported custom data, such as gene regulatory interactions. Furthermore, with release 6.0, we provide easy-to-use functions that allow the user to submit data for persistent storage in the CoryneRegNet database. Thus, it offers important options to its users in terms of community demands. CoryneRegNet is publicly available at http://www.coryneregnet.de.
Affiliation(s)
- Josch Pauling
- Computational Systems Biology, Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbrücken, Germany
|