1
Ibraim IC, Parise MTD, Parise D, Sfeir MZT, de Paula Castro TL, Wattam AR, Ghosh P, Barh D, Souza EM, Góes-Neto A, Gomide ACP, Azevedo V. Transcriptome profile of Corynebacterium pseudotuberculosis in response to iron limitation. BMC Genomics 2019; 20:663. [PMID: 31429699 PMCID: PMC6701010 DOI: 10.1186/s12864-019-6018-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/25/2019] [Accepted: 08/06/2019] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Iron is an essential micronutrient for the growth and development of virtually all living organisms, playing a pivotal role in the proliferative capability of many bacterial pathogens. The impact that the bioavailability of iron has on the transcriptional response of bacterial species in the CMNR group has been widely reported for some members of the group, but it has not yet been explored as deeply in Corynebacterium pseudotuberculosis. Here we describe for the first time a comprehensive RNA-seq whole transcriptome analysis of the T1 wild-type and the Cp13 mutant strains of C. pseudotuberculosis under iron restriction. The Cp13 mutant strain was generated by transposition mutagenesis of the ciuA gene, which encodes a surface siderophore-binding protein involved in the acquisition of iron. Iron-regulated acquisition systems are crucial for the pathogenesis of bacteria and are relevant targets for the design of new effective therapeutic approaches.
RESULTS Transcriptome analyses showed differential expression of 77 genes in the wild-type parental T1 strain and 59 genes in the Cp13 mutant under iron restriction. Twenty-five of these genes had similar expression patterns in both strains, including up-regulated genes homologous to the hemin uptake hmu locus and two distinct operons encoding proteins structurally similar to the hemin- and Hb-binding surface proteins of C. diphtheriae, which, remarkably, were expressed at higher levels in the Cp13 mutant than in the T1 wild-type strain. These hemin transport protein genes were found to be located within genomic islands associated with known virulence factors. Down-regulated genes encoding iron- and heme-containing components of the respiratory chain (including the ctaCEF and qcrCAB genes) and up-regulated known iron/DtxR-regulated transcription factors, namely ripA and hrrA, were also identified as differentially expressed in both strains under iron restriction.
CONCLUSION Based on our results, it can be deduced that the transcriptional response of C. pseudotuberculosis under iron restriction involves the control of intracellular utilization of iron and the up-regulation of hemin acquisition systems. These findings provide a comprehensive analysis of the transcriptional response of C. pseudotuberculosis, adding important understanding of the gene regulatory adaptation of this pathogen and revealing target genes that can aid the development of effective therapeutic strategies against this important pathogen.
Affiliation(s)
- Izabela Coimbra Ibraim
  - Laboratório de Genética Molecular e Celular, Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Mariana Teixeira Dornelles Parise
  - Laboratório de Genética Molecular e Celular, Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Doglas Parise
  - Laboratório de Genética Molecular e Celular, Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Michelle Zibetti Tadra Sfeir
  - Departamento de Bioquímica e Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal do Paraná, Curitiba, PR, Brazil
- Thiago Luiz de Paula Castro
  - Departamento de Biointeração, Instituto de Ciências da Saude, Universidade Federal da Bahia, Salvador, BA, Brazil
- Alice Rebecca Wattam
  - Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA, USA
- Preetam Ghosh
  - Department of Computer Science, Biological Networks Lab, Virginia Commonwealth University, Richmond, VA, USA
- Debmalya Barh
  - Laboratório de Genética Molecular e Celular, Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Emannuel Maltempi Souza
  - Departamento de Bioquímica e Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal do Paraná, Curitiba, PR, Brazil
- Aristóteles Góes-Neto
  - Department of Microbiology, Institute of Biological Sciences, Federal University of Minas Gerais (UFMG), Belo Horizonte, MG, 31270-901, Brazil
- Anne Cybelle Pinto Gomide
  - Laboratório de Genética Molecular e Celular, Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Vasco Azevedo
  - Laboratório de Genética Molecular e Celular, Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
2
Abstract
Clustering is a popular technique for discovering groups of similar objects in large datasets. It is nowadays applied in all areas of life sciences, from biomedicine to physics. However, designing high-quality cluster analyses is a tedious and complicated task with manifold choices along the way. As a cluster analysis is often the first step of a succeeding downstream analysis, the clustering must be reliable, reproducible, and of the highest quality. To address these challenges, we recently developed ClustEval, an integrated and extensible platform for the automated and standardized design and execution of complex cluster analyses. It allows researchers to design and carry out cluster analyses involving a large number of clustering methods applied to many, large datasets. ClustEval helps to shed light on all major aspects of cluster analysis, from choosing the right similarity function to using validity indices and data preprocessing protocols. Only this high degree of automation allows the researcher to easily run a clustering task with many different tools, parameters, and settings in order to gain the best possible outcome. In this paper, we guide the user step by step through three fundamentally important and widely applicable use cases: (i) identification of the best clustering method for a new, user-given protein sequence similarity dataset; (ii) evaluation of the performance of a new, user-given clustering method (densityCut) against the state of the art; and (iii) prediction of the best method for a new protein sequence similarity dataset. This protocol guides the user through the most important features of ClustEval and takes ∼4 h to complete.
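The idea of scoring several candidate clusterings with a validity index and keeping the best one can be sketched in a few lines. This is a minimal illustration only and does not use or reproduce ClustEval's own interface: the function names `mean_silhouette` and `best_clustering` are hypothetical, the silhouette value is chosen here as one common validity index, and the input is assumed to be a symmetric pairwise distance matrix given as nested lists.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette value of a clustering, given a pairwise distance matrix.

    dist   -- square, symmetric list of lists of distances (assumed input format)
    labels -- one cluster label per object
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette value needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the object's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)


def best_clustering(dist, candidates):
    """Score every candidate clustering and return (best_name, all_scores)."""
    scores = {name: mean_silhouette(dist, labels) for name, labels in candidates.items()}
    return max(scores, key=scores.get), scores


# Toy usage: four points on a line, two candidate clusterings (made-up data).
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
candidates = {"good": [0, 0, 1, 1], "bad": [0, 1, 0, 1]}
best_name, scores = best_clustering(dist, candidates)
```

In a real analysis the candidates would come from different clustering tools and parameter settings, and several validity indices would normally be compared rather than relying on a single one.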
3
Hassan SS, Jamal SB, Radusky LG, Tiwari S, Ullah A, Ali J, Behramand, de Carvalho PVSD, Shams R, Khan S, Figueiredo HCP, Barh D, Ghosh P, Silva A, Baumbach J, Röttger R, Turjanski AG, Azevedo VAC. The Druggable Pocketome of Corynebacterium diphtheriae: A New Approach for in silico Putative Druggable Targets. Front Genet 2018; 9:44. [PMID: 29487617 PMCID: PMC5816920 DOI: 10.3389/fgene.2018.00044] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/23/2017] [Accepted: 01/30/2018] [Indexed: 01/20/2023] Open
Abstract
Diphtheria, an acute and highly infectious disease previously regarded as endemic in nature but vaccine-preventable, is caused by Corynebacterium diphtheriae (Cd). In this work, we applied an in silico approach to the 13 complete genome sequences of C. diphtheriae, followed by a computational assessment of structural information of the binding sites, to characterize the “pocketome druggability.” To this end, we first computed the “modelome” (3D structures of a complete genome) of a randomly selected reference strain, Cd NCTC13129, which had 13,763 open reading frames (ORFs) and yielded 1,253 (∼9%) structure models. The amino acid sequences of these modeled structures were compared with the remaining 12 genomes and, consequently, 438 conserved protein sequences were obtained. The RCSB-PDB database was consulted to check the template structures for these conserved proteins and, as a result, 401 adequate 3D models were obtained. We subsequently predicted the protein pockets for the obtained set of models and kept only the conserved pockets that had highly druggable (HD) values (137 across all strains). Later, an off-target host homology analysis was performed against the human proteome using the NCBI database. Furthermore, a gene essentiality analysis was carried out, giving a final set of 10 conserved targets possessing highly druggable protein pockets. To check the target identification robustness of the pipeline used in this work, we crosschecked the final target list with another in-house target identification approach for C. diphtheriae, thereby obtaining three common targets: hisE (phosphoribosyl-ATP pyrophosphatase), glpX (fructose 1,6-bisphosphatase II), and rpsH (30S ribosomal protein S8). Our predicted results suggest that the in silico approach used could potentially aid in experimental polypharmacological target determination in C. diphtheriae and other pathogens, and thereby might complement existing and new drug-discovery pipelines.
Affiliation(s)
- Syed S Hassan
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Syed B Jamal
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Leandro G Radusky
  - Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Sandeep Tiwari
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Asad Ullah
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Javed Ali
  - Department of Chemistry, Kohat University of Science and Technology, Kohat, Pakistan
- Behramand
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Paulo V S D de Carvalho
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Rida Shams
  - Department of Chemistry, Islamia College University Peshawar, Peshawar, Pakistan
- Sabir Khan
  - Department of Analytical Chemistry, Institute of Chemistry, São Paulo State University, São Paulo, Brazil
- Henrique C P Figueiredo
  - AQUACEN, National Reference Laboratory for Aquatic Animal Diseases, Ministry of Fisheries and Aquaculture, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Debmalya Barh
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology, Purba Medinipur, India
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Artur Silva
  - Institute of Biological Sciences, Federal University of Pará, Belém, Brazil
- Jan Baumbach
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Adrián G Turjanski
  - Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
  - INQUIMAE/UBA-CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Vasco A C Azevedo
  - PG Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
4
Barbosa E, Röttger R, Hauschild AC, de Castro Soares S, Böcker S, Azevedo V, Baumbach J. LifeStyle-Specific-Islands (LiSSI): Integrated Bioinformatics Platform for Genomic Island Analysis. J Integr Bioinform 2017; 14:/j/jib.2017.14.issue-2/jib-2017-0010/jib-2017-0010.xml. [PMID: 28678736 PMCID: PMC6042826 DOI: 10.1515/jib-2017-0010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2017] [Revised: 04/10/2017] [Accepted: 04/19/2017] [Indexed: 11/20/2022] Open
Abstract
Distinct bacteria are able to cope with highly diverse lifestyles; for instance, they can be free living or host-associated. Thus, these organisms must possess a large and varied genomic arsenal to withstand different environmental conditions. To facilitate the identification of genomic features that might influence bacterial adaptation to a specific niche, we introduce LifeStyle-Specific-Islands (LiSSI). LiSSI combines evolutionary sequence analysis with statistical learning (Random Forest with feature selection, model tuning and robustness analysis). In summary, our strategy aims to identify conserved consecutive homology sequences (islands) in genomes and to identify the most discriminant islands for each lifestyle.
Affiliation(s)
- Eudes Barbosa
  - University of Southern Denmark, Department of Mathematics and Computer Science, Odense, Denmark
  - Federal University of Minas Gerais, Institute of Biological Sciences, Belo Horizonte, Brazil
- Richard Röttger
  - University of Southern Denmark, Department of Mathematics and Computer Science, Odense, Denmark
- Anne-Christin Hauschild
  - University of Southern Denmark, Department of Mathematics and Computer Science, Odense, Denmark
- Siomar de Castro Soares
  - Federal University of Minas Gerais, Institute of Biological Sciences, Belo Horizonte, Brazil
  - Federal University of Triângulo Mineiro, Department of Immunology, Microbiology and Parasitology, Uberaba, Brazil
- Sebastian Böcker
  - Friedrich-Schiller-Universität Jena, Faculty of Mathematics and Computer Science, Jena, Germany
- Vasco Azevedo
  - Federal University of Minas Gerais, Institute of Biological Sciences, Belo Horizonte, Brazil
- Jan Baumbach
  - University of Southern Denmark, Department of Mathematics and Computer Science, Odense, Denmark
5
PaPrBaG: A machine learning approach for the detection of novel pathogens from NGS data. Sci Rep 2017; 7:39194. [PMID: 28051068 PMCID: PMC5209729 DOI: 10.1038/srep39194] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 07/08/2016] [Accepted: 11/18/2016] [Indexed: 12/20/2022] Open
Abstract
The reliable detection of novel bacterial pathogens from next-generation sequencing data is a key challenge for microbial diagnostics. Current computational tools usually rely on sequence similarity and often fail to detect novel species when closely related genomes are unavailable or missing from the reference database. Here we present the machine learning-based approach PaPrBaG (Pathogenicity Prediction for Bacterial Genomes). PaPrBaG overcomes genetic divergence by training on a wide range of species with known pathogenicity phenotype. To that end, we compiled a comprehensive list of pathogenic and non-pathogenic bacteria with human host, using various genome metadata in conjunction with a rule-based protocol. A detailed comparative study reveals that PaPrBaG has several advantages over sequence similarity approaches. Most importantly, it always provides a prediction, whereas other approaches discard a large number of sequencing reads with low similarity to currently known reference genomes. Furthermore, PaPrBaG remains reliable even at very low genomic coverages. Combining PaPrBaG with existing approaches further improves prediction results.
6
Martínez-García PM, López-Solanilla E, Ramos C, Rodríguez-Palenzuela P. Prediction of bacterial associations with plants using a supervised machine-learning approach. Environ Microbiol 2016; 18:4847-4861. [PMID: 27234490 DOI: 10.1111/1462-2920.13389] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 11/26/2015] [Revised: 05/20/2016] [Accepted: 05/20/2016] [Indexed: 12/11/2022]
Abstract
Recent scenarios of fresh produce contamination by human enteric pathogens have resulted in severe food-borne outbreaks, and a new paradigm has emerged stating that some human-associated bacteria can use plants as secondary hosts. As a consequence, there has been growing concern in the scientific community about these interactions that have not yet been elucidated. Since this is a relatively new area, there is a lack of strategies to address the problem of food-borne illnesses due to the ingestion of fruits and vegetables. In the present study, we performed specific genome annotations to train a supervised machine-learning model that allows for the identification of plant-associated bacteria with a precision of ∼93%. The application of our method to approximately 9500 genomes predicted several unknown interactions between well-known human pathogens and plants, and it also confirmed several cases for which evidence has been reported. We observed that factors involved in adhesion, the deconstruction of the plant cell wall and detoxifying activities were highlighted as the most predictive features. The application of our strategy to sequenced strains that are involved in food poisoning can be used as a primary screening tool to determine the possible causes of contaminations.
Affiliation(s)
- Pedro Manuel Martínez-García
  - Área de Genética, Facultad de Ciencias, Instituto de Hortofruticultura Subtropical y Mediterránea 'La Mayora', Universidad de Málaga, Consejo Superior de Investigaciones Científicas (IHSM-UMA-CSIC), Málaga, E-29071, Spain
  - Centro de Biotecnología y Genómica de Plantas (CBGP), Universidad Politécnica de Madrid-Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Parque Científico y Tecnológico de la Universidad Politécnica de Madrid. Campus de Montegancedo, Pozuelo de Alarcón, Madrid, 28223, Spain
- Emilia López-Solanilla
  - Centro de Biotecnología y Genómica de Plantas (CBGP), Universidad Politécnica de Madrid-Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Parque Científico y Tecnológico de la Universidad Politécnica de Madrid. Campus de Montegancedo, Pozuelo de Alarcón, Madrid, 28223, Spain
  - Departamento de Biología Vegetal. Escuela Técnica Superior de Ingenieros Agrónomos, Universidad Politécnica de Madrid, Avenida Complutense, 3, Madrid, 28040, Spain
- Cayo Ramos
  - Área de Genética, Facultad de Ciencias, Instituto de Hortofruticultura Subtropical y Mediterránea 'La Mayora', Universidad de Málaga, Consejo Superior de Investigaciones Científicas (IHSM-UMA-CSIC), Málaga, E-29071, Spain
- Pablo Rodríguez-Palenzuela
  - Centro de Biotecnología y Genómica de Plantas (CBGP), Universidad Politécnica de Madrid-Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Parque Científico y Tecnológico de la Universidad Politécnica de Madrid. Campus de Montegancedo, Pozuelo de Alarcón, Madrid, 28223, Spain
  - Departamento de Biología Vegetal. Escuela Técnica Superior de Ingenieros Agrónomos, Universidad Politécnica de Madrid, Avenida Complutense, 3, Madrid, 28040, Spain
7
Comparing the performance of biomedical clustering methods. Nat Methods 2015; 12:1033-8. [DOI: 10.1038/nmeth.3583] [Citation(s) in RCA: 155] [Impact Index Per Article: 17.2] [Received: 03/12/2015] [Accepted: 07/24/2015] [Indexed: 11/08/2022]
8
Abstract
Essential genes are thought to encode proteins that carry out the basic functions needed to sustain cellular life, and genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are not likely to be located in GIs, but a systematic analysis of essential genes in GIs has not been performed before. Here, we have analyzed the essential genes in 28 prokaryotes using a statistical method and conclude that essential genes in GIs are significantly fewer than those outside GIs. The function of the 362 essential genes found in GIs has been explored further by BLAST against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of the PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes are found to share sequence similarity with virulence factors and phage/prophage-related genes, respectively. Meanwhile, we find several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of prediction algorithms for essential genes, but also give a clue to the functionality of essential genes in genomic islands.
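The abstract does not name the statistical method used. As one possible way to test whether essential genes are under-represented in GIs, a one-sided hypergeometric test can be sketched as below. Both the choice of test and the function name `hypergeom_lower_tail` are assumptions made for illustration, and the counts in the usage lines are made-up numbers, not data from the study.

```python
from math import comb


def hypergeom_lower_tail(k, population, successes, draws):
    """P(X <= k) for X ~ Hypergeometric(population, successes, draws).

    Here: population = all genes in a genome, successes = essential genes,
    draws = genes located in GIs, k = essential genes observed inside GIs.
    A small value suggests essential genes are depleted in GIs.
    """
    failures = population - successes
    lowest = max(0, draws - failures)  # smallest possible number of successes
    numerator = sum(
        comb(successes, x) * comb(failures, draws - x)
        for x in range(lowest, k + 1)
    )
    return numerator / comb(population, draws)


# Hypothetical genome: 4000 genes, 300 essential, 400 genes in GIs,
# of which 10 are essential (illustrative values only).
p_value = hypergeom_lower_tail(k=10, population=4000, successes=300, draws=400)
depleted = p_value < 0.05
```

When the same test is applied to many genomes, as with the 28 prokaryotes mentioned above, a multiple-testing correction would normally be applied to the resulting p-values.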
Affiliation(s)
- Xi Zhang
  - Department of Physics, Tianjin University, Tianjin 300072, China
- Chong Peng
  - Department of Physics, Tianjin University, Tianjin 300072, China
- Ge Zhang
  - Department of Physics, Tianjin University, Tianjin 300072, China
- Feng Gao
  - Department of Physics, Tianjin University, Tianjin 300072, China
  - Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China
9
Abreu VAC, Almeida S, Tiwari S, Hassan SS, Mariano D, Silva A, Baumbach J, Azevedo V, Röttger R. CMRegNet-An interspecies reference database for corynebacterial and mycobacterial regulatory networks. BMC Genomics 2015; 16:452. [PMID: 26062809 PMCID: PMC4464113 DOI: 10.1186/s12864-015-1631-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/15/2014] [Accepted: 05/14/2015] [Indexed: 11/10/2022] Open
Abstract
Background Organisms utilize a multitude of mechanisms for responding to changing environmental conditions, maintaining their functional homeostasis, and overcoming stress situations. One of the most important mechanisms is transcriptional gene regulation. In-depth study of the transcriptional gene regulatory network can lead to various practical applications, creating a greater understanding of how organisms control their cellular behavior. Description In this work, we present a new database, CMRegNet, for the gene regulatory networks of Corynebacterium glutamicum ATCC 13032 and Mycobacterium tuberculosis H37Rv. We furthermore transferred the known networks of these model organisms to 18 other non-model but phylogenetically close species (target organisms) of the CMNR group. In comparison to other network transfers, for the first time we utilized two model organisms, resulting in a more diverse and complete network of the target organisms. Conclusion CMRegNet provides easy access to a total of 3,103 known regulations in C. glutamicum ATCC 13032 and M. tuberculosis H37Rv and to 38,940 evolutionarily conserved interactions for 18 non-model species of the CMNR group. This makes CMRegNet to date the most comprehensive database of regulatory interactions of CMNR bacteria. The content of CMRegNet is publicly available online via a web interface found at http://lgcm.icb.ufmg.br/cmregnet.
Affiliation(s)
- Vinicius A C Abreu
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil.
- Sintia Almeida
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil.
- Sandeep Tiwari
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil.
- Syed Shah Hassan
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil.
- Diego Mariano
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil.
- Artur Silva
  - Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil.
- Jan Baumbach
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
- Vasco Azevedo
  - Graduate Program in Bioinformatics, Institute of Biological Sciences, Federal University of Minas Gerais (Universidade Federal de Minas Gerais), Belo Horizonte, Minas Gerais, Brazil.
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
  - Computational Systems Biology, Max Planck Institute for Informatics, Campus E 2.1, 66123, Saarbrucken, Germany.
10
Arooj M, Sakkiah S, Cao GP, Kim S, Arulalapperumal V, Lee KW. Finding off-targets, biological pathways, and target diseases for chymase inhibitors via structure-based systems biology approach. Proteins 2015; 83:1209-24. [DOI: 10.1002/prot.24677] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/25/2014] [Revised: 08/08/2014] [Accepted: 08/14/2014] [Indexed: 02/03/2023]
Affiliation(s)
- Mahreen Arooj
  - School of Biomedical Sciences, Faculty of Health Sciences, Curtin Health Innovation Research Institute (CHIRI); Curtin University Australia
- Sugunadevi Sakkiah
  - Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
- Guang Ping Cao
  - Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
- Songmi Kim
  - Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
- Venkatesh Arulalapperumal
  - Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
- Keun Woo Lee
  - Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
11
Kumar R, Kumari B, Srivastava A, Kumar M. NRfamPred: a proteome-scale two level method for prediction of nuclear receptor proteins and their sub-families. Sci Rep 2014; 4:6810. [PMID: 25351274 PMCID: PMC5381360 DOI: 10.1038/srep06810] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 06/26/2014] [Accepted: 10/09/2014] [Indexed: 11/09/2022] Open
Abstract
Nuclear receptor proteins (NRPs) are transcription factors that regulate many vital cellular processes in animal cells. NRPs form a super-family of phylogenetically related proteins and are divided into different sub-families on the basis of ligand characteristics and their functions. In the post-genomic era, when new proteins are being added to the database in a high-throughput mode, it becomes imperative to identify new NRPs using information from the amino acid sequence alone. In this study we report an SVM-based two-level prediction system, NRfamPred, using the dipeptide composition of proteins as input. At the 1st level, NRfamPred screens whether the query protein is an NRP or a non-NRP; if the query protein belongs to the NRP class, prediction moves to the 2nd level and predicts the sub-family. Using leave-one-out cross-validation, we were able to achieve an overall accuracy of 97.88% at the 1st level and an overall accuracy of 98.11% at the 2nd level with dipeptide composition. Benchmarking on independent datasets showed that NRfamPred had comparable accuracy to other existing methods developed on the same dataset. Our method predicted the existence of 76 NRPs in the human proteome, out of which 14 are novel NRPs. NRfamPred also predicted the sub-families of these 14 NRPs.
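The dipeptide-composition input and the two-level decision described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `dipeptide_composition` and `two_level_predict` are hypothetical, the callables `is_nrp` and `subfamily_of` stand in for trained SVM models that are not shown, and dipeptides containing non-standard residues are simply skipped.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element feature vector of dipeptide fractions.

    Each entry is the count of one dipeptide divided by the total number of
    counted dipeptides in the sequence (overlapping windows of length 2).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def two_level_predict(sequence, is_nrp, subfamily_of):
    """Level 1 decides NRP vs non-NRP; level 2 runs only for predicted NRPs.

    is_nrp and subfamily_of are hypothetical callables taking a feature vector.
    """
    features = dipeptide_composition(sequence)
    if not is_nrp(features):
        return ("non-NRP", None)
    return ("NRP", subfamily_of(features))


# Toy usage with stand-in classifiers (placeholders, not trained models).
vector = dipeptide_composition("ACAC")
result = two_level_predict("ACAC", lambda f: True, lambda f: "example-subfamily")
```

Dividing by the number of counted dipeptides makes the features independent of sequence length, so proteins of different sizes can be compared by the same classifier.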
Affiliation(s)
- Ravindra Kumar
  - Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi, India-110021
- Bandana Kumari
  - Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi, India-110021
- Abhishikha Srivastava
  - Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi, India-110021
- Manish Kumar
  - Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi, India-110021