1
Liu W, Cen H, Wu Z, Zhou H, Chen S, Yang X, Zhao G, Zhang G. Mycobacteriaceae Phenome Atlas (MPA): A Standardized Atlas for the Mycobacteriaceae Phenome Based on Heterogeneous Sources. Phenomics (Cham, Switzerland) 2023; 3:439-456. [PMID: 37881319 PMCID: PMC10593683 DOI: 10.1007/s43657-023-00101-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 02/23/2023] [Accepted: 03/03/2023] [Indexed: 10/27/2023]
Abstract
The bacterial family Mycobacteriaceae includes pathogenic and nonpathogenic bacteria, and systematic research on their genomes and phenomes can provide a comprehensive perspective for exploring their disease mechanisms. In this study, the phenotypes of Mycobacteriaceae were inferred from available phenomic data, and 82 microbial phenotypic traits were recruited as data elements of the microbial phenome. This Mycobacteriaceae phenome contains five categories and 20 subcategories of polyphasic phenotypes, and three categories and eight subcategories of functional phenotypes, all of which are complementary to the existing data standards of microbial phenotypes. The phenomic data of Mycobacteriaceae strains were compiled by literature mining, third-party database integration, and bioinformatics annotation. The phenotypes are searchable and comparable on the website of the Mycobacteriaceae Phenome Atlas (MPA, https://www.biosino.org/mpa/). A topological data analysis of MPA revealed the co-evolution between Mycobacterium tuberculosis and virulence factors, and uncovered potential pathogenicity-associated phenotypes. Two hundred and sixty potential pathogen-enriched pathways were found by Fisher's exact test. The application of MPA may provide novel insights into the pathogenicity mechanism and antimicrobial targets of Mycobacteriaceae. Supplementary Information: The online version contains supplementary material available at 10.1007/s43657-023-00101-5.
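The pathway-enrichment step named in the abstract above can be illustrated with a minimal sketch of a one-sided Fisher's exact test on a 2x2 table. The function name, the table layout (pathogenic vs. nonpathogenic strains, with vs. without a given pathway), and the example counts are assumptions made for illustration; the abstract does not state how MPA built its contingency tables or which alternative hypothesis it tested.

```python
from math import comb


def fisher_enrichment_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for enrichment in the first row.

    Hypothetical 2x2 layout (an assumption, not taken from the paper):
                        has pathway   lacks pathway
        pathogenic           a              b
        nonpathogenic        c              d

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # number of pathogenic strains
    row2 = c + d          # number of nonpathogenic strains
    col1 = a + c          # number of strains carrying the pathway
    total = comb(row1 + row2, col1)
    k_max = min(row1, col1)
    # Sum the probabilities of every table at least as extreme as the observed one.
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1))
    return tail / total


if __name__ == "__main__":
    # Made-up counts: 3 of 3 pathogens carry the pathway, 0 of 3 non-pathogens do.
    print(fisher_enrichment_p(3, 0, 0, 3))
```

In practice an established implementation such as `scipy.stats.fisher_exact` would usually be preferred, together with a multiple-testing correction when many pathways are screened.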
Affiliation(s)
- Wan Liu
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031 China
- Hui Cen
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031 China
- Zhile Wu
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031 China
- Shanghai Southgene Technology Co., Ltd., Shanghai, 201210 China
- Haokui Zhou
- Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- Shuo Chen
- Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- Xilan Yang
- Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- Guoping Zhao
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031 China
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024 China
- Guoqing Zhang
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031 China
2
Thakur Z, Saini V, Arya P, Kumar A, Mehta PK. Computational insights into promoter architecture of toxin-antitoxin systems of Mycobacterium tuberculosis. Gene 2017; 641:161-171. [PMID: 29066303 DOI: 10.1016/j.gene.2017.10.054] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/11/2017] [Revised: 09/27/2017] [Accepted: 10/16/2017] [Indexed: 12/16/2022]
Abstract
Toxin-antitoxin (TA) systems are two-component genetic modules widespread in many bacterial genomes, including Mycobacterium tuberculosis (Mtb). The TA systems play a significant role in biofilm formation, antibiotic tolerance and persistence of the pathogen inside host cells. Deciphering the regulatory motifs of Mtb TA systems is the first essential step towards understanding their transcriptional regulation. In this study, two in silico approaches, knowledge-based motif discovery and de novo motif discovery, were used to identify the regulatory motifs of 79 Mtb TA systems. The knowledge-based motif discovery approach was used to design a Perl-based bio-tool, Mtb-sig-miner, available at https://github.com/zoozeal/Mtb-sig-miner, which could successfully detect sigma (σ) factor specific regulatory motifs in the promoter region of Mtb TA modules. Manual curation of the Mtb-sig-miner output hits revealed that the majority of them possessed a σB regulatory motif in their promoter region. On the other hand, the de novo approach resulted in the identification of a novel conserved motif [(T/A)(G/T)NTA(G/C)(C/A)AT(C/A)] within the promoter region of 14 Mtb TA systems. The identified conserved motif was also validated, by molecular docking studies, for its activity as the conserved core region of the operator sequence of the corresponding TA system. The strong binding of the respective antitoxin/toxin to the identified novel conserved motif supported its role as the core region of the operator sequence of the respective TA systems. These findings provide computational insight into the transcriptional regulation of Mtb TA systems.
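The degenerate motif quoted in the abstract above, [(T/A)(G/T)NTA(G/C)(C/A)AT(C/A)], translates directly into a regular expression. The sketch below is only an illustration of that translation: the function name and the test sequence are made up, it scans the given strand only, and it is not the Mtb-sig-miner tool or the de novo method used by the authors.

```python
import re

# (T/A)(G/T)N T A (G/C)(C/A) A T (C/A), with N meaning any base.
# The lookahead wrapper lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([TA][GT][ACGT]TA[GC][CA]AT[CA]))")


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 10-mer) for every motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical promoter fragment, not a real Mtb sequence.
    for start, site in find_motif("ggTGATACCATCgg"):
        print(start, site)
```

To cover both strands, the same function can be applied to the reverse complement of the promoter sequence.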
Affiliation(s)
- Zoozeal Thakur
- Centre for Biotechnology, Maharshi Dayanand University, Rohtak, 124001, Haryana, India
- Vandana Saini
- Toxicology & Computational Biology Group, Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, 124001, Haryana, India
- Preeti Arya
- National Agri-Food Biotechnology Institute, Sector 81, S.A.S Nagar, Mohali, Punjab 140306, India
- Ajit Kumar
- Toxicology & Computational Biology Group, Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, 124001, Haryana, India.
- Promod K Mehta
- Centre for Biotechnology, Maharshi Dayanand University, Rohtak, 124001, Haryana, India.
3
REMap: Operon map of M. tuberculosis based on RNA sequence data. Tuberculosis (Edinb) 2016; 99:70-80. [PMID: 27450008 DOI: 10.1016/j.tube.2016.04.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 02/17/2016] [Revised: 04/19/2016] [Accepted: 04/24/2016] [Indexed: 12/18/2022]
Abstract
A map of the transcriptional organization of genes of an organism is a basic tool that is necessary to understand and facilitate a more accurate genetic manipulation of the organism. Operon maps are largely generated by computational prediction programs that rely on gene conservation and genome architecture and may not be physiologically relevant. With the widespread use of RNA sequencing (RNAseq), the prediction of operons based on actual transcriptome sequencing rather than computational genomics alone is much needed. Here, we report a validated operon map of Mycobacterium tuberculosis, developed using RNAseq data from both the exponential and stationary phases of growth. At least 58.4% of M. tuberculosis genes are organized into 749 operons. Our prediction algorithm, REMap (RNA Expression Mapping of operons), considers the many cases of transcription coverage of intergenic regions, and avoids dependencies on functional annotation and arbitrary assumptions about gene structure. As a result, we demonstrate that REMap is able to more accurately predict operons, especially those that contain long intergenic regions or functionally unrelated genes, than previous operon prediction programs. The REMap algorithm is publicly available as a user-friendly tool that can be readily modified to predict operons in other bacteria.
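The idea of calling operons from transcription coverage of intergenic regions, as described in the abstract above, can be sketched roughly as follows. This is not the REMap algorithm: the function name, the input layout, the same-strand rule, and the `min_cov` threshold are all assumptions chosen to show the general principle that adjacent genes joined by a transcribed intergenic gap are grouped together.

```python
from typing import List, Sequence, Tuple


def group_operons(
    genes: Sequence[Tuple[str, str]],
    gap_coverage: Sequence[int],
    min_cov: int = 1,
) -> List[List[str]]:
    """Group consecutive genes into candidate transcription units.

    genes        -- (name, strand) pairs in genome order (hypothetical input)
    gap_coverage -- gap_coverage[i] is the minimum read depth across the
                    intergenic region between genes[i] and genes[i + 1]
    min_cov      -- assumed threshold for calling the gap "transcribed"
    """
    if len(gap_coverage) != max(len(genes) - 1, 0):
        raise ValueError("need exactly one coverage value per intergenic gap")

    groups: List[List[str]] = []
    for i, (name, strand) in enumerate(genes):
        linked = (
            i > 0
            and strand == genes[i - 1][1]          # same strand as previous gene
            and gap_coverage[i - 1] >= min_cov     # gap is covered by reads
        )
        if linked:
            groups[-1].append(name)
        else:
            groups.append([name])
    return groups


if __name__ == "__main__":
    # Made-up genes and depths; groups with two or more genes are candidate operons.
    demo_genes = [("geneA", "+"), ("geneB", "+"), ("geneC", "+"), ("geneD", "-")]
    print(group_operons(demo_genes, [5, 0, 7]))
```

A coverage-only rule like this one makes no use of functional annotation, which mirrors the motivation given in the abstract, but a real tool would also need to handle noise, expression differences between growth phases, and internal promoters.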
4
Midha M, Prasad NK, Vindal V. MycoRRdb: a database of computationally identified regulatory regions within intergenic sequences in mycobacterial genomes. PLoS One 2012; 7:e36094. [PMID: 22563442 PMCID: PMC3338573 DOI: 10.1371/journal.pone.0036094] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/01/2011] [Accepted: 03/29/2012] [Indexed: 11/18/2022] Open
Abstract
The identification of regulatory regions for a gene is an important step towards deciphering its regulation. Regulatory regions tend to be conserved during evolution, which facilitates the application of comparative genomics to identify such regions. The present study makes use of this attribute to identify regulatory regions in Mycobacterium species, followed by the development of a database, MycoRRdb. It consists of the regulatory regions identified within the intergenic sequences of 25 mycobacterial species. MycoRRdb allows users to retrieve the identified intergenic regulatory elements in the mycobacterial genomes. In addition to the predicted motifs, it also allows users to retrieve the Reciprocal Best BLAST Hits across the mycobacterial genomes. It is a useful resource for understanding the transcriptional regulatory mechanisms of mycobacterial species. This database is the first of its kind to specifically address cis-regulatory regions, and it is comprehensive across the mycobacterial species. Database URL: http://mycorrdb.uohbif.in.
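The Reciprocal Best BLAST Hits mentioned in the abstract above follow a simple rule: gene a in genome A and gene b in genome B are paired when b is the top-scoring hit of a and a is the top-scoring hit of b. The sketch below shows that rule on pre-parsed hit lists; the function names, the (query, subject, bit score) tuple layout, and the tie-breaking choice are assumptions for illustration and do not describe how MycoRRdb computed its hits.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Made-up identifiers and scores.
    forward = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
    reverse = [("b1", "a1", 198.0), ("b2", "a1", 140.0)]
    print(reciprocal_best_hits(forward, reverse))
```

The hit tuples could be filled from tabular BLAST output, where the bit score is one of the standard columns.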
Affiliation(s)
- Mohit Midha
- Department of Biotechnology, School of Life Sciences, University of Hyderabad, Hyderabad, India
- Nirmal K. Prasad
- Department of Biotechnology, School of Life Sciences, University of Hyderabad, Hyderabad, India
- Vaibhav Vindal
- Department of Biotechnology, School of Life Sciences, University of Hyderabad, Hyderabad, India
5
Sundaramurthi JC, Brindha S, Reddy T, Hanna LE. Informatics resources for tuberculosis – Towards drug discovery. Tuberculosis (Edinb) 2012; 92:133-8. [DOI: 10.1016/j.tube.2011.08.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 04/25/2011] [Revised: 08/03/2011] [Accepted: 08/22/2011] [Indexed: 11/15/2022]
6
Bharti R, Das R, Sharma P, Katoch K, Bhattacharya A. MTCID: a database of genetic polymorphisms in clinical isolates of Mycobacterium tuberculosis. Tuberculosis (Edinb) 2011; 92:166-72. [PMID: 22209237 DOI: 10.1016/j.tube.2011.12.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 09/24/2011] [Revised: 11/28/2011] [Accepted: 12/02/2011] [Indexed: 01/15/2023]
Abstract
Tuberculosis (TB) is a major cause of morbidity and mortality throughout the world, particularly in developing countries. The response of patients and the treatment outcome depend, in addition to diagnosis, appropriate and timely treatment and host factors, on the virulence of Mycobacterium tuberculosis and the genetic polymorphisms prevalent in clinical isolates of the bacterium. A number of studies have been carried out to characterize clinical isolates of M. tuberculosis obtained from TB patients. However, the data are scattered across a large number of publications. Though attempts have been made to catalog the observed variations, no database has been developed for cataloging, storing and disseminating genetic polymorphism information. MTCID (M. tuberculosis clinical isolate genetic polymorphism database) is an attempt to provide a comprehensive repository to store, access and disseminate single nucleotide polymorphisms (SNPs) and spoligotyping profiles of M. tuberculosis. It can be used to automatically upload information held by a user, which is added to the existing database at the backend. It may also aid in maintaining clinical profiles of TB and in the treatment of patients. The database has 'search' features and is available at http://ccbb.jnu.ac.in/Tb.
Affiliation(s)
- Richa Bharti
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
7
Sharma D, Surolia A. Computational tools to study and understand the intricate biology of mycobacteria. Tuberculosis (Edinb) 2011; 91:273-6. [PMID: 21398182 DOI: 10.1016/j.tube.2011.02.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/15/2010] [Revised: 02/07/2011] [Accepted: 02/08/2011] [Indexed: 11/19/2022]
Abstract
The field of mycobacteriology is currently an area of intense research. To deal with the copious amount of data being generated, numerous web servers and databases have been developed. However, these are available at disparate sites and there exists no single source/platform which provides information about their utility and access. Therefore, a comprehensive compilation of various bioinformatics tools/resources dedicated to mycobacteria is presented in this article.
Affiliation(s)
- Deepak Sharma
- National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi 110067, India.
8
Zhu X, Chang S, Fang K, Cui S, Liu J, Wu Z, Yu X, Gao GF, Yang H, Zhu B, Wang J. MyBASE: a database for genome polymorphism and gene function studies of Mycobacterium. BMC Microbiol 2009; 9:40. [PMID: 19228437 PMCID: PMC2656513 DOI: 10.1186/1471-2180-9-40] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 12/08/2008] [Accepted: 02/20/2009] [Indexed: 01/09/2023] Open
Abstract
Background: Mycobacterial pathogens are a major threat to humans. With the increasing availability of functional genomic data, research on mycobacterial pathogenesis and subsequent control strategies will be greatly accelerated. It has been suggested that genome polymorphisms, namely large sequence polymorphisms, can influence the pathogenicity of different mycobacterial strains. However, there is currently no database dedicated to mycobacterial genome polymorphisms with functional interpretations. Description: We have developed a mycobacterial database (MyBASE) housing genome polymorphism data and gene functions to provide the mycobacterial research community with a useful information resource and analysis platform. Whole genome comparison data produced by our lab and the novel genome polymorphisms identified were deposited into MyBASE. Extensive literature review of genome polymorphism data, mainly large sequence polymorphisms (LSPs), operon predictions and curated annotations of virulence and essentiality of mycobacterial genes are unique features of MyBASE. Large-scale genomic data integration from public resources makes MyBASE a comprehensive data warehouse useful for current research. All data is cross-linked and can be graphically viewed via a toolbox in MyBASE. Conclusion: As an integrated platform focused on the collection of experimental data from our own lab and published literature, MyBASE will facilitate analysis of genome structure and polymorphisms, which will provide insight into genome evolution. Importantly, the database will also facilitate the comparison of virulence factors among various mycobacterial strains. MyBASE is freely accessible via http://mybase.psych.ac.cn.
Affiliation(s)
- Xinxing Zhu
- Behavioral Genetics Center, Institute of Psychology, Chinese Academy of Sciences, Beijing, PR China.
9
Vishnoi A, Srivastava A, Roy R, Bhattacharya A. MGDD: Mycobacterium tuberculosis genome divergence database. BMC Genomics 2008; 9:373. [PMID: 18681951 PMCID: PMC2518163 DOI: 10.1186/1471-2164-9-373] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 05/08/2008] [Accepted: 08/05/2008] [Indexed: 11/25/2022] Open
Abstract
Background: Variation in genomes among different closely related organisms can be linked to phenotypic differences. A number of mechanisms, such as replication error, repeat expansion and contraction, recombination and transposition, can contribute to genomic differences. These processes lead to the generation of SNPs, different types of repeat-based and transposon- or IS-element-based polymorphisms, inversions and duplications, and changes in synteny. A database of all the variations in a group of organisms is useful not only for understanding genotype-phenotype relationships but also in clinical applications. There is no database available at present that provides information about detailed genomic variations among different strains and species of the Mycobacterium tuberculosis complex, organisms responsible for human diseases. Description: MGDD is a free web-based database that allows quick, user-friendly searches to find different types of genomic variations among a group of fully sequenced organisms belonging to the M. tuberculosis complex. The searches are based on data generated by pairwise comparison using a tool that has already been described. The types of variation that can be searched are SNPs, indels, tandem repeats and divergent regions. The searches can be designed to find specific variations either in a given gene or at any given location of the query genome with respect to any other genome currently available. Conclusion: The web-based database MGDD can help to find all the possible differences that exist between two strains or species of the M. tuberculosis complex. The search tool is very user-friendly, can be used by anyone not familiar with computational methods, and will be useful to both clinicians and researchers working on tuberculosis and other mycobacterial diseases.
Affiliation(s)
- Anchal Vishnoi
- Center for Computational Biology and Bioinformatics, School of Information Technology, Jawaharlal Nehru University, New Delhi 110067, India.
10
Vindal V, Ashwantha Kumar E, Ranjan A. Identification of operator sites within the upstream region of the putative mce2R gene from mycobacteria. FEBS Lett 2008; 582:1117-22. [DOI: 10.1016/j.febslet.2008.02.074] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 01/29/2008] [Revised: 02/26/2008] [Accepted: 02/29/2008] [Indexed: 10/22/2022]
11
Vindal V, Suma K, Ranjan A. GntR family of regulators in Mycobacterium smegmatis: a sequence and structure based characterization. BMC Genomics 2007; 8:289. [PMID: 17714599 PMCID: PMC2018728 DOI: 10.1186/1471-2164-8-289] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Received: 11/21/2006] [Accepted: 08/23/2007] [Indexed: 11/24/2022] Open
Abstract
Background: Mycobacterium smegmatis is a fast-growing, non-pathogenic mycobacterium. This organism has been widely used as a model organism to study the biology of other virulent and extremely slow-growing species like Mycobacterium tuberculosis. Based on the homology of the N-terminal DNA binding domain, the recently sequenced genome of M. smegmatis has been shown to possess several putative GntR regulators. A striking characteristic feature of this family of regulators is that they possess a conserved N-terminal DNA binding domain and a diverse C-terminal domain involved in effector binding and/or oligomerization. Since the physiological role of these regulators is critically dependent upon effector binding and operator sites, we have analysed and classified these regulators into their specific subfamilies and identified their potential binding sites. Results: The sequence analysis of M. smegmatis putative GntRs has revealed that FadR, HutC, MocR and the YtrA-like regulators are encoded by 45, 8, 8 and 1 genes respectively. Further, of the 45 FadR-like regulators, 19 were classified into the FadR group and 26 into the VanR group. All these proteins showed similar secondary structural elements specific to their respective subfamilies except MSMEG_3959, which showed additional secondary structural elements. Using reciprocal BLAST searches, we further identified the orthologs of these regulators in Bacillus subtilis and other mycobacteria. Since the expression of many regulators is auto-regulatory, we have identified potential operator sites for a number of these GntR regulators by analyzing the upstream sequences. Conclusion: This study helps in extending the annotation of M. smegmatis GntR proteins. It identifies the GntR regulators of M. smegmatis that could serve as a model for studying orthologous regulators from virulent as well as other saprophytic mycobacteria. This study also sheds some light on the nucleotide preferences in the target motifs of GntRs, thus providing important leads for initiating the experimental characterization of these proteins, the construction of the gene regulatory network for these regulators, and an understanding of the influence of these proteins on the physiology of the mycobacteria.
Affiliation(s)
- Vaibhav Vindal
- Computational and Functional Genomics Group, Sun Centre of Excellence in Medical Bioinformatics, Centre for DNA Fingerprinting and Diagnostics, EMBnet India Node, Hyderabad 500076, India
- Katta Suma
- Computational and Functional Genomics Group, Sun Centre of Excellence in Medical Bioinformatics, Centre for DNA Fingerprinting and Diagnostics, EMBnet India Node, Hyderabad 500076, India
- Akash Ranjan
- Computational and Functional Genomics Group, Sun Centre of Excellence in Medical Bioinformatics, Centre for DNA Fingerprinting and Diagnostics, EMBnet India Node, Hyderabad 500076, India
12
Ranganathan S, Tammi M, Gribskov M, Tan TW. Establishing bioinformatics research in the Asia Pacific. BMC Bioinformatics 2006. [PMCID: PMC1764485 DOI: 10.1186/1471-2105-7-s5-s1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022] Open
Abstract
In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet had gained sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It offers a typical snapshot of the growing research excellence in bioinformatics in the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.