1
Eggerichs D, Weindorf N, Weddeling HG, Van der Linden IM, Tischler D. Substrate scope expansion of 4-phenol oxidases by rational enzyme selection and sequence-function relations. Commun Chem 2024; 7:123. [PMID: 38831005 PMCID: PMC11148156 DOI: 10.1038/s42004-024-01207-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 05/15/2024] [Indexed: 06/05/2024] [Open access]
Abstract
Enzymes are nature's catalysts and will have a lasting impact on (organic) synthesis as they possess unchallenged regio- and stereoselectivity. On the downside, this high selectivity limits enzymes' substrate range and hampers their universal application. Therefore, substrate scope expansion of enzyme families by either modification of known biocatalysts or identification of new members is a key challenge in enzyme-driven catalysis. Here, we present a streamlined approach to rationally select enzymes with proposed functionalities from the ever-increasing amount of available sequence data. In a case study on 4-phenol oxidoreductases, eight enzymes of the oxidase branch were selected from 292 sequences on the basis of the properties of first-shell residues of the catalytic pocket, guided by the computational tool A2CA. Correlations between these residues and enzyme activity yielded robust sequence-function relations, which were exploited by site-saturation mutagenesis. Application of a peroxidase-independent oxidase screening resulted in 16 active enzyme variants which were up to 90 times more active than the respective wildtype enzymes and up to 6 times more active than the best-performing natural variants. The results were supported by kinetic experiments and structural models. The newly introduced amino acids confirmed the correlation studies, which overall highlights the successful logic of the presented approach.
Affiliation(s)
- Daniel Eggerichs
  - Microbial Biotechnology, Ruhr University Bochum, Universitätsstr. 150, 44780, Bochum, Germany
- Nils Weindorf
  - Microbial Biotechnology, Ruhr University Bochum, Universitätsstr. 150, 44780, Bochum, Germany
- Heiner G Weddeling
  - Microbial Biotechnology, Ruhr University Bochum, Universitätsstr. 150, 44780, Bochum, Germany
- Inja M Van der Linden
  - Microbial Biotechnology, Ruhr University Bochum, Universitätsstr. 150, 44780, Bochum, Germany
- Dirk Tischler
  - Microbial Biotechnology, Ruhr University Bochum, Universitätsstr. 150, 44780, Bochum, Germany

2
Dhabalia Ashok A, Freitag JN, Irisarri I, de Vries S, de Vries J. Sequence similarity networks bear out hierarchical relationships of green cytochrome P450. Physiologia Plantarum 2024; 176:e14244. [PMID: 38480467 DOI: 10.1111/ppl.14244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 02/14/2024] [Accepted: 02/16/2024] [Indexed: 03/24/2024]
Abstract
Land plants have diversified enzyme families. One of the most prominent is the cytochrome P450 (CYP or CYP450) family. With over 443,000 CYP proteins sequenced across the tree of life, CYPs are ubiquitous in archaea, bacteria, and eukaryotes. Here, we focused on land plants and algae to study the role of CYP diversification. CYPs, acting as monooxygenases, catalyze hydroxylation reactions crucial for specialized plant metabolic pathways, including detoxification and phytohormone production; the CYPome consists of one enormous superfamily that is divided into clans and families. Their evolutionary history speaks of high substrate promiscuity; radiation and functional diversification have yielded numerous CYP families. To understand the evolutionary relationships within the CYPs, we employed sequence similarity network analyses. We recovered distinct clusters representing different CYP families, reflecting their diversified sequences that we link to the prediction of functionalities. Hierarchical clustering and phylogenetic analysis further elucidated relationships between CYP clans, uncovering their shared deep evolutionary history. We explored the distribution and diversification of CYP subfamilies across plant and algal lineages, uncovering novel candidates and providing insights into the evolution of these enzyme families. This identified unexpected relationships between CYP families, such as the link between CYP82 and CYP74, shedding light on their roles in plant defense signaling pathways. Our approach provides a methodology that brings insights into the emergence of new functions within the CYP450 family, contributing to the evolutionary history of plants and algae. These insights can be further validated and implemented via experimental setups under various external conditions.
Affiliation(s)
- Amra Dhabalia Ashok
  - Institute of Microbiology and Genetics, Department of Applied Bioinformatics, University of Goettingen, Goettingen, Germany
- Jella N Freitag
  - Institute of Microbiology and Genetics, Department of Applied Bioinformatics, University of Goettingen, Goettingen, Germany
- Iker Irisarri
  - Institute of Microbiology and Genetics, Department of Applied Bioinformatics, University of Goettingen, Goettingen, Germany
  - Section Phylogenomics, Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change (LIB), Museum of Nature, Hamburg, Germany
- Sophie de Vries
  - Institute of Microbiology and Genetics, Department of Applied Bioinformatics, University of Goettingen, Goettingen, Germany
- Jan de Vries
  - Institute of Microbiology and Genetics, Department of Applied Bioinformatics, University of Goettingen, Goettingen, Germany
  - Campus Institute Data Science (CIDAS), University of Goettingen, Goettingen, Germany
  - Goettingen Center for Molecular Biosciences (GZMB), Department of Applied Bioinformatics, University of Goettingen, Goettingen, Germany

3
López MB, Oterino MB, González JM. The Structural Biology of Catalase Evolution. Subcell Biochem 2024; 104:33-47. [PMID: 38963482 DOI: 10.1007/978-3-031-58843-3_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/05/2024]
Abstract
Catalases are essential enzymes for the removal of hydrogen peroxide, enabling aerobic and anaerobic metabolism in an oxygenated atmosphere. Monofunctional heme catalases, catalase-peroxidases, and manganese catalases evolved independently more than two billion years ago, constituting a classic example of convergent evolution. Herein, the diversity of catalase sequences is analyzed through sequence similarity networks, providing the context for the sequence distribution of major catalase families and showing that many divergent catalase families remain to be experimentally studied.
Affiliation(s)
- María Belén López
  - Instituto de Bionanotecnología del NOA (INBIONATEC-CONICET), Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina
- María Belén Oterino
  - Instituto de Bionanotecnología del NOA (INBIONATEC-CONICET), Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina
- Javier M González
  - Instituto de Bionanotecnología del NOA (INBIONATEC-CONICET), Universidad Nacional de Santiago del Estero (UNSE), Santiago del Estero, Argentina

4
Oberg N, Zallot R, Gerlt JA. EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools. J Mol Biol 2023; 435:168018. [PMID: 37356897 PMCID: PMC10291204 DOI: 10.1016/j.jmb.2023.168018] [Citation(s) in RCA: 63] [Impact Index Per Article: 63.0] [Received: 11/23/2022] [Revised: 02/04/2023] [Accepted: 02/13/2023] [Indexed: 02/19/2023]
Abstract
The Enzyme Function Initiative (EFI) provides a web resource with "genomic enzymology" web tools to leverage the protein (UniProt) and genome (European Nucleotide Archive; ENA; https://www.ebi.ac.uk/ena/) databases to assist the assignment of in vitro enzymatic activities and in vivo metabolic functions to uncharacterized enzymes (https://efi.igb.illinois.edu/). The tools enable (1) exploration of sequence-function space in enzyme families using sequence similarity networks (SSNs; EFI-EST), (2) easy access to genome context for bacterial, archaeal, and fungal proteins in the SSN clusters so that isofunctional families can be identified and their functions inferred from genome context (EFI-GNT); and (3) determination of the abundance of SSN clusters in NIH Human Metagenome Project metagenomes using chemically guided functional profiling (EFI-CGFP). We describe enhancements that enable SSNs to be generated from taxonomy categories, allowing higher resolution analyses of sequence-function space; we provide examples of the generation of taxonomy category-specific SSNs.
Affiliation(s)
- Nils Oberg
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States
- Rémi Zallot
  - Department of Chemistry, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
  - Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- John A Gerlt
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States

5
Knox HL, Allen KN. Expanding the viewpoint: Leveraging sequence information in enzymology. Curr Opin Chem Biol 2023; 72:102246. [PMID: 36599282 PMCID: PMC10251232 DOI: 10.1016/j.cbpa.2022.102246] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/14/2022] [Revised: 10/31/2022] [Accepted: 11/21/2022] [Indexed: 01/04/2023]
Abstract
The use of protein sequence to inform enzymology in terms of structure, mechanism, and function has burgeoned over the past two decades. Referred to as genomic enzymology, the utilization of bioinformatic tools such as sequence similarity networks and phylogenetic analyses has allowed the identification of new substrates and metabolites, novel pathways, and unexpected reaction mechanisms. The holistic examination of superfamilies can yield insight into the origins and paths of evolution of enzymes and the range of their substrates and mechanisms. Herein, we highlight advances in the use of genomic enzymology to address problems which the in-depth analyses of a single enzyme alone could not enable.
Affiliation(s)
- Hayley L Knox
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, MA, 02215-2521, USA
- Karen N Allen
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, MA, 02215-2521, USA

6
Pasternak Z, Chapnik N, Yosef R, Kopelman NM, Jurkevitch E, Segev E. Identifying protein function and functional links based on large-scale co-occurrence patterns. PLoS One 2022; 17:e0264765. [PMID: 35239724 PMCID: PMC8893610 DOI: 10.1371/journal.pone.0264765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 02/16/2022] [Indexed: 11/23/2022] [Open access]
Abstract
Objective: The vast majority of known proteins have not been experimentally tested even at the level of measuring their expression, and the function of many proteins remains unknown. In order to decipher protein function and examine functional associations, we developed "Cliquely", a software tool based on the exploration of co-occurrence patterns. Computational model: Using a set of more than 23 million proteins divided into 404,947 orthologous clusters, we explored the co-occurrence graph of 4,742 fully sequenced genomes from the three domains of life. Edge weights in this graph represent co-occurrence probabilities. We use the Bron–Kerbosch algorithm to detect maximal cliques in this graph, fully connected subgraphs that represent meaningful biological networks from different functional categories. Main results: We demonstrate that Cliquely can successfully identify known networks from various pathways, including nitrogen fixation, glycolysis, methanogenesis, mevalonate and ribosome proteins. In identifying the virulence-associated type III secretion system (T3SS) network, Cliquely also added 13 previously uncharacterized proteins to it, demonstrating the strength of this approach. Cliquely is freely available and open source. Users can employ the tool to explore co-occurrence networks using a protein of interest and a customizable level of stringency, either for the entire dataset or for one of the three domains: Archaea, Bacteria, or Eukarya.
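The clique-detection step named in this abstract can be illustrated with a small sketch. The code below is not Cliquely's implementation; it is a minimal Python version of the Bron–Kerbosch algorithm (using pivoting, a common variant) run on a toy graph. The cluster names `OC1` to `OC5`, the edge weights, and the `min_prob` threshold are hypothetical stand-ins for the orthologous clusters, co-occurrence probabilities, and stringency level mentioned above.

```python
from typing import Dict, Iterator, Set, Tuple

Graph = Dict[str, Set[str]]


def threshold_graph(weights: Dict[Tuple[str, str], float], min_prob: float) -> Graph:
    """Keep only edges whose (hypothetical) co-occurrence probability is >= min_prob."""
    graph: Graph = {}
    for (u, v), w in weights.items():
        graph.setdefault(u, set())  # register both endpoints, even if isolated
        graph.setdefault(v, set())
        if w >= min_prob:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def maximal_cliques(graph: Graph) -> Iterator[Set[str]]:
    """Yield every maximal clique using Bron-Kerbosch with pivoting."""

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> Iterator[Set[str]]:
        if not p and not x:
            yield r  # no candidate can extend r, so r is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda n: len(graph[n] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}  # v is done: remove it from the candidates
            x = x | {v}  # and exclude it from later cliques

    yield from expand(set(), set(graph), set())


# Toy data: hypothetical clusters and co-occurrence probabilities.
weights = {
    ("OC1", "OC2"): 0.90, ("OC1", "OC3"): 0.80, ("OC2", "OC3"): 0.85,
    ("OC3", "OC4"): 0.70, ("OC4", "OC5"): 0.95, ("OC1", "OC4"): 0.20,
}
cliques = sorted(sorted(c) for c in maximal_cliques(threshold_graph(weights, 0.6)))
print(cliques)  # [['OC1', 'OC2', 'OC3'], ['OC3', 'OC4'], ['OC4', 'OC5']]
```

Raising `min_prob` removes weaker edges, so the same toy data yields fewer and tighter cliques. This is one plausible reading of what a "customizable level of stringency" could control.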
Affiliation(s)
- Zohar Pasternak
  - Division of Identification and Forensic Science, Israel Police, Jerusalem, Israel
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Noam Chapnik
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Roy Yosef
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Naama M. Kopelman
  - Faculty of Science, Holon Institute of Technology, Holon, Israel
- Edouard Jurkevitch
  - Department of Plant Pathology and Microbiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Elad Segev
  - Faculty of Science, Holon Institute of Technology, Holon, Israel

7
Cortés-Albayay C, Sangal V, Klenk HP, Nouioui I. Comparative Genomic Study of Vinyl Chloride Cluster and Description of Novel Species, Mycolicibacterium vinylchloridicum sp. nov. Front Microbiol 2021; 12:767895. [PMID: 35003006 PMCID: PMC8727900 DOI: 10.3389/fmicb.2021.767895] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/31/2021] [Accepted: 11/16/2021] [Indexed: 11/30/2022] [Open access]
Abstract
Advanced physicochemical and chemical absorption methods for chlorinated ethenes are feasible but incur high costs and leave traces of pollutants on the site. Biodegradation of such pollutants by anaerobic or aerobic bacteria is emerging as a potential alternative. Several mycobacteria, including Mycolicibacterium aurum L1, Mycolicibacterium chubuense NBB4, Mycolicibacterium rhodesiae JS60, Mycolicibacterium rhodesiae NBB3 and Mycolicibacterium smegmatis JS623, have previously been described as assimilators of vinyl chloride (VC). In this study, we compared the nucleotide sequences of the VC cluster and performed a taxogenomic evaluation of these mycobacterial species. The results showed that the complete VC cluster was acquired by horizontal gene transfer and is not intrinsic to the genus Mycobacterium sensu lato. These results also revealed the presence of an additional xcbF1 gene that seems to be involved in Coenzyme M biosynthesis, which is ultimately used in the VC degradation pathway. Furthermore, we suggest for the first time that the S/N-oxide reductase-encoding gene is involved in the dissociation of the SsuABC transporters from the organosulfur, which play a crucial role in Coenzyme M biosynthesis. Based on genomic data, M. aurum L1, M. chubuense NBB4, M. rhodesiae JS60, M. rhodesiae NBB3 and M. smegmatis JS623 were misclassified and form a novel species within the genus Mycobacterium sensu lato. Mycolicibacterium aurum L1T (CECT 8761T = DSM 6695T) was the subject of polyphasic taxonomic studies and showed ANI and dDDH values of 84.7 and 28.5% with its close phylogenetic neighbour, M. sphagni ATCC 33027T. Phenotypic, chemotaxonomic and genomic data support strain L1T (CECT 8761T = DSM 6695T) as the type strain of a novel species with the proposed name Mycolicibacterium vinylchloridicum sp. nov.
Affiliation(s)
- Carlos Cortés-Albayay
  - Faculty of Science, School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Vartul Sangal
  - Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, United Kingdom
- Hans-Peter Klenk
  - Faculty of Science, School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Imen Nouioui (corresponding author)
  - Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany

8
Sun Z, Xu B, Spisak S, Kavran JM, Rokita SE. The minimal structure for iodotyrosine deiodinase function is defined by an outlier protein from the thermophilic bacterium Thermotoga neapolitana. J Biol Chem 2021; 297:101385. [PMID: 34748729 PMCID: PMC8668982 DOI: 10.1016/j.jbc.2021.101385] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/12/2021] [Revised: 10/27/2021] [Accepted: 10/28/2021] [Indexed: 11/12/2022] [Open access]
Abstract
The nitroreductase superfamily of enzymes encompasses many flavin mononucleotide (FMN)-dependent catalysts promoting a wide range of reactions. All share a common core consisting of an FMN-binding domain, and individual subgroups additionally contain one to three sequence extensions radiating from defined positions within this core to support their unique catalytic properties. To identify the minimum structure required for activity in the iodotyrosine deiodinase subgroup of this superfamily, attention was directed to a representative from the thermophilic organism Thermotoga neapolitana (TnIYD). This representative was selected based on its status as an outlier of the subgroup arising from its deficiency in certain standard motifs evident in all homologues from mesophiles. We found that TnIYD lacked a typical N-terminal sequence and one of its two characteristic sequence extensions, neither of which was found to be necessary for activity. We also show that TnIYD efficiently promotes dehalogenation of iodo-, bromo-, and chlorotyrosine, analogous to related deiodinases (IYDs) from humans and other mesophiles. In addition, 2-iodophenol is a weak substrate for TnIYD as it was for all other IYDs characterized to date. Consistent with enzymes from thermophilic organisms, we observed that TnIYD adopts a compact fold and low surface area compared with IYDs from mesophilic organisms. The insights gained from our investigations on TnIYD demonstrate the advantages of focusing on sequences that diverge from conventional standards to uncover the minimum essentials for activity. We conclude that TnIYD now represents a superior starting structure for future efforts to engineer a stable dehalogenase targeting halophenols of environmental concern.
Affiliation(s)
- Zuodong Sun
  - Department of Chemistry, Johns Hopkins University, Baltimore, Maryland, USA
- Bing Xu
  - Department of Chemistry, Johns Hopkins University, Baltimore, Maryland, USA
- Shaun Spisak
  - Chemistry-Biology Interface Graduate Program, Johns Hopkins University, Baltimore, Maryland, USA
- Jennifer M Kavran
  - Department of Biochemistry and Molecular Biology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
  - Department of Biophysics and Biophysical Chemistry, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
  - Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Steven E Rokita
  - Department of Chemistry, Johns Hopkins University, Baltimore, Maryland, USA

9
Luo T, Dou Z, Sun Z, Chen X, Ni Y, Xu G. A novel and robust 3-quinuclidinone reductase from Kaistia algarum for efficient synthesis of (R)-3-quinuclidinol without external cofactor. Molecular Catalysis 2021. [DOI: 10.1016/j.mcat.2021.111861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

10
Lima LF, Torres AQ, Jardim R, Mesquita RD, Schama R. Evolution of Toll, Spatzle and MyD88 in insects: the problem of the Diptera bias. BMC Genomics 2021; 22:562. [PMID: 34289811 PMCID: PMC8296651 DOI: 10.1186/s12864-021-07886-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/07/2021] [Accepted: 07/13/2021] [Indexed: 12/29/2022] [Open access]
Abstract
Background: Arthropoda, the most numerous and diverse metazoan phylum, has species in many habitats where they encounter various microorganisms and, as a result, mechanisms for pathogen recognition and elimination have evolved. The Toll pathway, involved in the innate immune system, was first described as part of the developmental pathway for dorsal-ventral differentiation in Drosophila. Its later discovery in vertebrates suggested that this system was extremely conserved. However, there is variation in presence/absence, copy number and sequence divergence in various genes along the pathway. As most studies have only focused on Diptera, for a comprehensive and accurate homology-based approach it is important to understand gene function in a number of different species and, in a group as diverse as insects, the use of species belonging to different taxonomic groups is essential.
Results: We evaluated the diversity of Toll pathway gene families in 39 arthropod genomes, encompassing 13 different insect orders. Through computational methods, we shed some light on the evolution and functional annotation of protein families involved in the Toll pathway innate immune response. Our data indicate that: 1) intracellular proteins of the Toll pathway show mostly species-specific expansions; 2) the different Toll subfamilies seem to have distinct evolutionary backgrounds; 3) patterns of gene expansion observed in the Toll phylogenetic tree indicate that homology-based methods of functional inference might not be accurate for some subfamilies; 4) Spatzle subfamilies are highly divergent and also pose a problem for homology-based inference; 5) Spatzle subfamilies should not be analyzed together in the same phylogenetic framework; 6) network analyses seem to be a good first step in inferring functional groups in these cases. We specifically show that understanding Drosophila's Toll functions might not indicate the same function in other species.
Conclusions: Our results show the importance of using species representing the different orders to better understand insect gene content, origin and evolution. More specifically, in intracellular Toll pathway gene families the presence of orthologues has important implications for homology-based functional inference. Also, the different evolutionary backgrounds of Toll gene subfamilies should be taken into consideration when functional studies are performed, especially for TOLL9, TOLL, TOLL2_7, and the new TOLL10 clade. The presence of Diptera-specific clades, and of clades lacking Diptera species, shows the importance of overcoming the Diptera bias when performing functional characterization of Toll pathways.
Affiliation(s)
- Letícia Ferreira Lima
  - Laboratório de Biologia Computacional e Sistemas, Oswaldo Cruz Foundation, Fiocruz, Rio de Janeiro, Brazil
- André Quintanilha Torres
  - Laboratório de Biologia Computacional e Sistemas, Oswaldo Cruz Foundation, Fiocruz, Rio de Janeiro, Brazil
- Rodrigo Jardim
  - Laboratório de Biologia Computacional e Sistemas, Oswaldo Cruz Foundation, Fiocruz, Rio de Janeiro, Brazil
- Rafael Dias Mesquita
  - Laboratório de Bioinformática, Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular-INCT-EM, Rio de Janeiro, Brazil
- Renata Schama
  - Laboratório de Biologia Computacional e Sistemas, Oswaldo Cruz Foundation, Fiocruz, Rio de Janeiro, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular-INCT-EM, Rio de Janeiro, Brazil

11
Glycoconjugate pathway connections revealed by sequence similarity network analysis of the monotopic phosphoglycosyl transferases. Proc Natl Acad Sci U S A 2021; 118:2018289118. [PMID: 33472976 DOI: 10.1073/pnas.2018289118] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 02/06/2023] [Open access]
Abstract
The monotopic phosphoglycosyl transferase (monoPGT) superfamily comprises over 38,000 nonredundant sequences represented in bacterial and archaeal domains of life. Members of the superfamily catalyze the first membrane-committed step in en bloc oligosaccharide biosynthetic pathways, transferring a phosphosugar from a soluble nucleoside diphosphosugar to a membrane-resident polyprenol phosphate. The singularity of the monoPGT fold and its employment in the pivotal first membrane-committed step allows confident assignment of both protein and corresponding pathway. The diversity of the family is revealed by the generation and analysis of a sequence similarity network for the superfamily, with fusion of monoPGTs with other pathway members being the most frequent and extensive elaboration. Three common fusions were identified: sugar-modifying enzymes, glycosyl transferases, and regulatory domains. Additionally, unexpected fusions of the monoPGT with members of the polytopic PGT superfamily were discovered, implying a possible evolutionary link through the shared polyprenol phosphate substrate. Notably, a phylogenetic reconstruction of the monoPGT superfamily shows a radial burst of functionalization, with a minority of members comprising only the minimal PGT catalytic domain. The commonality and identity of the fusion partners in the monoPGT superfamily is consistent with advantageous colocalization of pathway members at membrane interfaces.

12
González JM. Visualizing the superfamily of metallo-β-lactamases through sequence similarity network neighborhood connectivity analysis. Heliyon 2021; 7:e05867. [PMID: 33426353 PMCID: PMC7785958 DOI: 10.1016/j.heliyon.2020.e05867] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/11/2020] [Revised: 11/19/2020] [Accepted: 12/23/2020] [Indexed: 12/13/2022] [Open access]
Abstract
Protein sequence similarity networks (SSNs) constitute a convenient approach to analyze large polypeptide sequence datasets, and have been successfully applied to study a number of protein families over the past decade. SSN analysis is herein combined with traditional cladistic and phenetic phylogenetic analysis (respectively based on multiple sequence alignments and all-against-all three-dimensional protein structure comparisons) in order to assist the ancestral reconstruction and integrative revision of the superfamily of metallo-β-lactamases (MBLs). It is shown that only 198 out of 15,292 representative nodes contain at least one experimentally obtained protein structure in the Protein Data Bank or a manually annotated SwissProt entry, that is to say, only 1.3 % of the superfamily has been functionally and/or structurally characterized. Besides, neighborhood connectivity coloring, which measures local network interconnectivity, is introduced for detection of protein families within SSN clusters. This approach provides a clear picture of how many families remain unexplored in the superfamily, while most MBL research is heavily biased towards a few families. Further research is suggested in order to determine the SSN topological properties, which will be instrumental for the improvement of automated sequence annotation methods.
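As a rough illustration of the two ideas in this abstract, the sketch below builds a toy SSN by thresholding pairwise scores and then computes a neighborhood connectivity value per node. It assumes the commonly used definition of neighborhood connectivity (the mean degree of a node's neighbors); the abstract itself only says the measure captures local network interconnectivity, so the exact formula used in the paper may differ. The protein identifiers `P1` to `P4`, the scores, and the `min_score` cutoff are hypothetical.

```python
from typing import Dict, Set, Tuple

Adjacency = Dict[str, Set[str]]


def build_ssn(scores: Dict[Tuple[str, str], float], min_score: float) -> Adjacency:
    """Connect two sequences when their (hypothetical) alignment score is >= min_score."""
    adj: Adjacency = {}
    for (a, b), score in scores.items():
        adj.setdefault(a, set())  # keep every sequence as a node
        adj.setdefault(b, set())
        if score >= min_score:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def neighborhood_connectivity(adj: Adjacency) -> Dict[str, float]:
    """Mean degree of each node's neighbors (0.0 for isolated nodes)."""
    return {
        node: (sum(len(adj[n]) for n in nbrs) / len(nbrs)) if nbrs else 0.0
        for node, nbrs in adj.items()
    }


# Toy pairwise scores between four hypothetical sequences.
scores = {
    ("P1", "P2"): 120.0, ("P1", "P3"): 95.0, ("P2", "P3"): 80.0,
    ("P3", "P4"): 60.0, ("P1", "P4"): 10.0,
}
ssn = build_ssn(scores, min_score=50.0)
nc = neighborhood_connectivity(ssn)
for node in sorted(nc):
    print(node, round(nc[node], 2))
```

In this toy network, `P4` hangs off the densely connected node `P3`, so it receives a high value even though its own degree is 1. Coloring nodes by such a value is one way to visualize where tightly connected groups sit inside a larger cluster.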

13
Tararina MA, Allen KN. Bioinformatic Analysis of the Flavin-Dependent Amine Oxidase Superfamily: Adaptations for Substrate Specificity and Catalytic Diversity. J Mol Biol 2020; 432:3269-3288. [PMID: 32198115 DOI: 10.1016/j.jmb.2020.03.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/20/2020] [Revised: 02/24/2020] [Accepted: 03/06/2020] [Indexed: 12/29/2022]
Abstract
The flavin-dependent amine oxidase (FAO) superfamily consists of over 9000 nonredundant sequences represented in all domains of life. Of the thousands of members identified, only 214 have been functionally annotated to date, and 40 unique structures are represented in the Protein Data Bank. The few functionally characterized members share a catalytic mechanism involving the oxidation of an amine substrate through transfer of a hydride to the FAD cofactor, with differences observed in substrate specificities. Previous studies have focused on comparing a subset of superfamily members. Here, we present a comprehensive analysis of the FAO superfamily based on reaction mechanism and substrate recognition. Using a dataset of 9192 sequences, a sequence similarity network, and subsequently, a genome neighborhood network were constructed, organizing the superfamily into eight subgroups that accord with substrate type. Likewise, through phylogenetic analysis, the evolutionary relationship of subgroups was determined, delineating the divergence between enzymes based on organism, substrate, and mechanism. In addition, using sequences and atomic coordinates of 22 structures from the Protein Data Bank to perform sequence and structural alignments, active-site elements were identified, showing divergence from the canonical aromatic-cage residues to accommodate large substrates. These specificity determinants are held in a structural framework comprising a core domain catalyzing the oxidation of amines with an auxiliary domain for substrate recognition. Overall, analysis of the FAO superfamily reveals a modular fold with cofactor and substrate-binding domains allowing for diversity of recognition via insertion/deletions. This flexibility allows facile evolution of new activities, as shown by reinvention of function between subfamilies.
Affiliation(s)
- Margarita A Tararina
  - Program in Biomolecular Pharmacology, Boston University School of Medicine, 72 East Concord Street, Boston, MA 02118, USA
- Karen N Allen
  - Program in Biomolecular Pharmacology, Boston University School of Medicine, 72 East Concord Street, Boston, MA 02118, USA
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, MA 02215, USA

14
Campitelli P, Modi T, Kumar S, Ozkan SB. The Role of Conformational Dynamics and Allostery in Modulating Protein Evolution. Annu Rev Biophys 2020; 49:267-288. [PMID: 32075411 DOI: 10.1146/annurev-biophys-052118-115517] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Indexed: 11/09/2022]
Abstract
Advances in sequencing techniques and statistical methods have made it possible not only to predict sequences of ancestral proteins but also to identify thousands of mutations in the human exome, some of which are disease associated. These developments have motivated numerous theories and raised many questions regarding the fundamental principles behind protein evolution, which have been traditionally investigated horizontally using the tip of the phylogenetic tree through comparative studies of extant proteins within a family. In this article, we review a vertical comparison of the modern and resurrected ancestral proteins. We focus mainly on the dynamical properties responsible for a protein's ability to adapt new functions in response to environmental changes. Using the Dynamic Flexibility Index and the Dynamic Coupling Index to quantify the relative flexibility and dynamic coupling at a site-specific, single-amino-acid level, we provide evidence that the migration of hinges, which are often functionally critical rigid sites, is a mechanism through which proteins can rapidly evolve. Additionally, we show that disease-associated mutations in proteins often result in flexibility changes even at positions distal from mutational sites, particularly in the modulation of active site dynamics.
Affiliation(s)
- Paul Campitelli
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona 85281, USA
- Tushar Modi
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona 85281, USA
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, Pennsylvania 19122, USA; Department of Biology, Temple University, Philadelphia, Pennsylvania 19122, USA; Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- S Banu Ozkan
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona 85281, USA
|
15
|
Zallot R, Oberg N, Gerlt JA. The EFI Web Resource for Genomic Enzymology Tools: Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways. Biochemistry 2019; 58:4169-4182. [PMID: 31553576 DOI: 10.1021/acs.biochem.9b00735] [Citation(s) in RCA: 395] [Impact Index Per Article: 79.0] [Indexed: 01/14/2023]
Abstract
The assignment of functions to uncharacterized proteins discovered in genome projects requires easily accessible tools and computational resources for large-scale, user-friendly leveraging of the protein, genome, and metagenome databases by experimentalists. This article describes the web resource developed by the Enzyme Function Initiative (EFI; accessed at https://efi.igb.illinois.edu/) that provides "genomic enzymology" tools ("web tools") for (1) generating sequence similarity networks (SSNs) for protein families (EFI-EST); (2) analyzing and visualizing genome context of the proteins in clusters in SSNs (in genome neighborhood networks, GNNs, and genome neighborhood diagrams, GNDs) (EFI-GNT); and (3) prioritizing uncharacterized SSN clusters for functional assignment based on metagenome abundance (chemically guided functional profiling, CGFP) (EFI-CGFP). The SSNs generated by EFI-EST are used as the input for EFI-GNT and EFI-CGFP, enabling easy transfer of information among the tools. The networks are visualized and analyzed using Cytoscape, a widely used desktop application; GNDs and CGFP heatmaps summarizing metagenome abundance are viewed within the tools. We provide a detailed example of the integrated use of the tools with an analysis of glycyl radical enzyme superfamily (IPR004184) found in the human gut microbiome. This analysis demonstrates that (1) SwissProt annotations are not always correct, (2) large-scale genome context analyses allow the prediction of novel metabolic pathways, and (3) metagenome abundance can be used to identify/prioritize uncharacterized proteins for functional investigation.
|