1
Sánchez-Osuna M, Cortés P, Llagostera M, Barbé J, Erill I. Exploration into the origins and mobilization of di-hydrofolate reductase genes and the emergence of clinical resistance to trimethoprim. Microb Genom 2020; 6:mgen000440. [PMID: 32969787; PMCID: PMC7725336; DOI: 10.1099/mgen.0.000440] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/27/2020] [Accepted: 09/08/2020] [Indexed: 01/23/2023] Open
Abstract
Trimethoprim is a synthetic antibacterial agent that targets folate biosynthesis by competitively binding to the di-hydrofolate reductase enzyme (DHFR). Trimethoprim is often administered synergistically with sulfonamide, another chemotherapeutic agent targeting the di-hydropteroate synthase (DHPS) enzyme in the same pathway. Clinical resistance to both drugs is widespread and mediated by enzyme variants capable of performing their biological function without binding to these drugs. These mutant enzymes were assumed to have arisen after the discovery of these synthetic drugs, but recent work has shown that genes conferring resistance to sulfonamide were present in the bacterial pangenome millions of years ago. Here, we apply phylogenetics and comparative genomics methods to study the largest family of mobile trimethoprim-resistance genes (dfrA). We show that most of the dfrA genes identified to date map to two large clades that likely arose from independent mobilization events. In contrast to sulfonamide resistance (sul) genes, we find evidence of recurrent mobilization in dfrA genes. Phylogenetic evidence allows us to identify novel dfrA genes in the emerging pathogen Acinetobacter baumannii, and we confirm their resistance phenotype in vitro. We also identify a cluster of dfrA homologues in cryptic plasmid and phage genomes, but we show that these enzymes do not confer resistance to trimethoprim. Our methods also allow us to pinpoint the chromosomal origin of previously reported dfrA genes, and we show that many of these ancient chromosomal genes also confer resistance to trimethoprim. Our work reveals that trimethoprim resistance predated the clinical use of this chemotherapeutic agent, but that novel mutations have likely also arisen and become mobilized following its widespread use within and outside the clinic. 
Hence, this work confirms that resistance to novel drugs may already be present in the bacterial pangenome, and stresses the importance of rapid mobilization as a fundamental element in the emergence and global spread of resistance determinants.
Affiliation(s)
- Miquel Sánchez-Osuna
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Pilar Cortés
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Montserrat Llagostera
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Jordi Barbé
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Ivan Erill
  - Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD, USA
2
Feltes BC, Grisci BI, Poloni JDF, Dorn M. Perspectives and applications of machine learning for evolutionary developmental biology. Mol Omics 2018; 14:289-306. [PMID: 30168572; DOI: 10.1039/c8mo00111a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
Evolutionary Developmental Biology (Evo-Devo) is an ever-expanding field that aims to understand how development was modulated by the evolutionary process. In this sense, "omic" studies emerged as a powerful ally to unravel the molecular mechanisms underlying development. In this scenario, bioinformatics tools become necessary to analyze the growing amount of information. Among computational approaches, machine learning stands out as a promising field to generate knowledge and trace new research perspectives for bioinformatics. In this review, we aim to expose the current advances of machine learning applied to evolution and development. We draw clear perspectives and argue how evolution impacted machine learning techniques.
Affiliation(s)
- Bruno César Feltes
  - Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
3
Identification of novel mazEF/pemIK family toxin-antitoxin loci and their distribution in the Staphylococcus genus. Sci Rep 2017; 7:13462. [PMID: 29044211; PMCID: PMC5647390; DOI: 10.1038/s41598-017-13857-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/15/2017] [Accepted: 10/02/2017] [Indexed: 11/15/2022] Open
Abstract
The versatile roles of toxin-antitoxin (TA) systems in bacterial physiology and pathogenesis have been investigated for more than three decades. Diverse TA loci in Bacteria and Archaea have been identified in genome-wide studies. The advent of massive parallel sequencing has substantially expanded the number of known bacterial genomic sequences over the last 5 years. In staphylococci, this has translated into an impressive increase from a few tens to several thousand available genomes, which has allowed the re-evaluation of prior conclusions. In this study, we analysed the distribution of mazEF/pemIK family TA system operons in available staphylococcal genomes and their prevalence in mobile genetic elements. Ten novel mazEF/pemIK homologues were identified, each with a corresponding toxin that plays a potentially different and undetermined physiological role. A detailed characterisation of these TA systems would be exceptionally useful. Of particular interest are those associated with an SCCmec mobile genetic element (responsible for multidrug resistance transmission) or representing the joint horizontal transfer of TA systems and determinants of vancomycin resistance from enterococci. The involvement of TA systems in maintaining mobile genetic elements and the associations between novel mazEF/pemIK loci and those which carry drug resistance genes highlight their potential medical importance.
4
Ghosh P, Sowdhamini R. Bioinformatics comparisons of RNA-binding proteins of pathogenic and non-pathogenic Escherichia coli strains reveal novel virulence factors. BMC Genomics 2017; 18:658. [PMID: 28836963; PMCID: PMC5571608; DOI: 10.1186/s12864-017-4045-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/08/2017] [Accepted: 08/09/2017] [Indexed: 12/03/2022] Open
Abstract
Background: Pathogenic bacteria have evolved various strategies to counteract host defences. They are also exposed to environments that are undergoing constant changes. Hence, in order to survive, bacteria must adapt themselves to the changing environmental conditions by performing regulations at the transcriptional and/or post-transcriptional levels. Roles of RNA-binding proteins (RBPs) as virulence factors have been very well studied. Here, we have used a sequence search-based method to compare and contrast the proteomes of 16 pathogenic and three non-pathogenic E. coli strains as well as to obtain a global picture of the RBP landscape (RBPome) in E. coli.
Results: Our results show that there are no significant differences in the percentage of RBPs encoded by the pathogenic and the non-pathogenic E. coli strains. The differences in the types of Pfam domains as well as Pfam RNA-binding domains, encoded by these two classes of E. coli strains, are also insignificant. The complete and distinct RBPome of E. coli has been established by studying all E. coli strains known to date. We have also identified RBPs that are exclusive to pathogenic strains, and most of them can be exploited as drug targets since they appear to be non-homologous to their human host proteins. Many of these pathogen-specific proteins were uncharacterised and their identities could be resolved on the basis of sequence homology searches with known proteins. Detailed structural modelling, molecular dynamics simulations and sequence comparisons have been pursued for selected examples to understand differences in stability and RNA-binding.
Conclusions: The approach used in this paper to cross-compare proteomes of pathogenic and non-pathogenic strains may also be extended to other bacterial or even eukaryotic proteomes to understand interesting differences in their RBPomes. The pathogen-specific RBPs reported in this study may also be taken up further for clinical trials and/or experimental validations.
Affiliation(s)
- Pritha Ghosh
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bellary Road, Bangalore, Karnataka, 560 065, India
- Ramanathan Sowdhamini
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bellary Road, Bangalore, Karnataka, 560 065, India
5
Kaushik S, Nair AG, Mutt E, Subramanian HP, Sowdhamini R. Rapid and enhanced remote homology detection by cascading hidden Markov model searches in sequence space. Bioinformatics 2015; 32:338-44. [DOI: 10.1093/bioinformatics/btv538] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/19/2015] [Accepted: 09/06/2015] [Indexed: 11/14/2022] Open
6
Abstract
Immunoinformatics focuses on modeling immune responses for better understanding of the immune system and in many cases for proposing agents able to modify the immune system. The most classical of these agents are vaccines derived from living organisms such as smallpox or polio. More modern vaccines comprise recombinant proteins, protein domains, and in some cases peptides. Generating a vaccine from peptides however requires technologies and concepts very different from classical vaccinology. Immunoinformatics therefore provides the computational tools to propose peptides suitable for formulation into vaccines. This chapter introduces the essential biological concepts affecting design and efficacy of peptide vaccines and discusses current methods and workflows applied to design successful peptide vaccines using computers.
Affiliation(s)
- Johannes Söllner
  - Emergentec Biodevelopment GmbH, Gersthofer Straße 29-31, 1180, Vienna, Austria
7
Mudgal R, Sandhya S, Kumar G, Sowdhamini R, Chandra NR, Srinivasan N. NrichD database: sequence databases enriched with computationally designed protein-like sequences aid in remote homology detection. Nucleic Acids Res 2014; 43:D300-5. [PMID: 25262355; PMCID: PMC4384005; DOI: 10.1093/nar/gku888] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022] Open
Abstract
NrichD (http://proline.biochem.iisc.ernet.in/NRICHD/) is a database of computationally designed protein-like sequences, augmented into natural sequence databases that can perform hops in protein sequence space to assist in the detection of remote relationships. Establishing protein relationships in the absence of structural evidence or natural ‘intermediately related sequences’ is a challenging task. Recently, we have demonstrated that the computational design of artificial intermediary sequences/linkers is an effective approach to fill naturally occurring voids in protein sequence space. Through a large-scale assessment we have demonstrated that such sequences can be plugged into commonly employed search databases to improve the performance of routinely used sequence search methods in detecting remote relationships. Since it is anticipated that such data sets will be employed to establish protein relationships, two databases that have already captured these relationships at the structural and functional domain level, namely, the SCOP database and the Pfam database, have been ‘enriched’ with these artificial intermediary sequences. NrichD database currently contains 3 611 010 artificial sequences that have been generated between 27 882 pairs of families from 374 SCOP folds. The data sets are freely available for download. Additional features include the design of artificial sequences between any two protein families of interest to the user.
Affiliation(s)
- Richa Mudgal
  - IISc Mathematics Initiative, Indian Institute of Science, Bangalore 560 012, Karnataka, India
- Sankaran Sandhya
  - Department of Biochemistry, Indian Institute of Science, Bangalore 560 012, Karnataka, India
- Gayatri Kumar
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, Karnataka, India
- Ramanathan Sowdhamini
  - National Centre for Biological Sciences, Gandhi Krishi Vignan Kendra Campus, Bellary Road, Bangalore 560 065, Karnataka, India
- Nagasuma R Chandra
  - Department of Biochemistry, Indian Institute of Science, Bangalore 560 012, Karnataka, India
8
Lo MK, Søgaard TM, Karlin DG. Evolution and structural organization of the C proteins of paramyxovirinae. PLoS One 2014; 9:e90003. [PMID: 24587180; PMCID: PMC3934983; DOI: 10.1371/journal.pone.0090003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/19/2013] [Accepted: 01/24/2014] [Indexed: 12/21/2022] Open
Abstract
The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations.
Affiliation(s)
- Michael K. Lo
  - Centers for Disease Control and Prevention, Viral Special Pathogens Branch, Atlanta, Georgia, United States of America
- Teit Max Søgaard
  - Division of Structural Biology, Oxford University, Oxford, United Kingdom
- David G. Karlin
  - Division of Structural Biology, Oxford University, Oxford, United Kingdom
  - Department of Zoology, University of Oxford, Oxford, United Kingdom
9
Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently "orphan" viral proteins. J Virol 2013; 88:10-20. [PMID: 24155369; PMCID: PMC3911697; DOI: 10.1128/jvi.02595-13] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Indexed: 12/20/2022] Open
Abstract
The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.