1
Hameleers L, Pijning T, Gray BB, Fauré R, Jurak E. Novel β-galactosidase activity and first crystal structure of Glycoside Hydrolase family 154. N Biotechnol 2024; 80:1-11. [PMID: 38163476 DOI: 10.1016/j.nbt.2023.12.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 12/27/2023] [Accepted: 12/27/2023] [Indexed: 01/03/2024]
Abstract
Polysaccharide Utilization Loci (PULs) are physically linked gene clusters conserved in the Gram-negative phylum of Bacteroidota and are valuable sources for Carbohydrate Active enZyme (CAZyme) discovery. This study focuses on BD-β-Gal, an enzyme encoded in a metagenomic PUL and member of the Glycoside Hydrolase family 154 (GH154). BD-β-Gal showed exo-β-galactosidase activity with regiopreference for hydrolyzing β-D-(1,6) glycosidic linkages. Notably, it exhibited a preference for D-glucopyranosyl (D-Glcp) over D-galactopyranosyl (D-Galp) and D-fructofuranosyl (D-Fruf) at the reducing end of the investigated disaccharides. In addition, we determined the high-resolution crystal structure of BD-β-Gal, thus providing the first structural characterization of a GH154 enzyme. Surprisingly, this revealed an (α/α)₆ topology, which has not been observed before for β-galactosidases. BD-β-Gal displayed low structural homology with characterized CAZymes, but conservation analysis suggested that the active site was located in a central cavity, with conserved E73, R252, and D253 as putative catalytic residues. Interestingly, BD-β-Gal has a tetrameric structure and a flexible loop from a neighboring protomer may contribute to its reaction specificity. Finally, we showed that the founding member of GH154, BT3677 from Bacteroides thetaiotaomicron, described as β-glucuronidase, displayed exo-β-galactosidase activity like BD-β-Gal but lacked a tetrameric structure.
Affiliation(s)
- Lisanne Hameleers
  - Department of Bioproduct Engineering, Engineering and Technology institute Groningen (ENTEG), University of Groningen, Nijenborgh 4, Groningen 9747 AG, the Netherlands
- Tjaard Pijning
  - Department of Biomolecular X-ray Crystallography, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Nijenborgh 7, Groningen 9747 AG, the Netherlands
- Brandon B Gray
  - Department of Bioproduct Engineering, Engineering and Technology institute Groningen (ENTEG), University of Groningen, Nijenborgh 4, Groningen 9747 AG, the Netherlands
- Régis Fauré
  - TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse, France
- Edita Jurak
  - Department of Bioproduct Engineering, Engineering and Technology institute Groningen (ENTEG), University of Groningen, Nijenborgh 4, Groningen 9747 AG, the Netherlands
2
Tannock GW. Understanding the gut microbiota by considering human evolution: a story of fire, cereals, cooking, molecular ingenuity, and functional cooperation. Microbiol Mol Biol Rev 2024; 88:e0012722. [PMID: 38126754 PMCID: PMC10966955 DOI: 10.1128/mmbr.00127-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
SUMMARY: The microbial community inhabiting the human colon, referred to as the gut microbiota, is mostly composed of bacterial species that, through extensive metabolic networking, degrade and ferment components of food and human secretions. The taxonomic composition of the microbiota has been extensively investigated in metagenomic studies that have also revealed details of molecular processes by which common components of the human diet are metabolized by specific members of the microbiota. Most studies of the gut microbiota aim to detect deviations in microbiota composition in patients relative to controls in the hope of showing that some diseases and conditions are due to or exacerbated by alterations to the gut microbiota. The aim of this review is to consider the gut microbiota in relation to the evolution of Homo sapiens which was heavily influenced by the consumption of a nutrient-dense non-arboreal diet, limited gut storage capacity, and acquisition of skills relating to mastering fire, cooking, and cultivation of cereal crops. The review delves into the past to gain an appreciation of what is important in the present. A holistic view of "healthy" microbiota function is proposed based on the evolutionary pathway shared by humans and gut microbes.
Affiliation(s)
- Gerald W. Tannock
  - Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand
3
de Crécy-Lagard V, Hutinet G, Cediel-Becerra JDD, Yuan Y, Zallot R, Chevrette MG, Ratnayake RMMN, Jaroch M, Quaiyum S, Bruner S. Biosynthesis and function of 7-deazaguanine derivatives in bacteria and phages. Microbiol Mol Biol Rev 2024; 88:e0019923. [PMID: 38421302 PMCID: PMC10966956 DOI: 10.1128/mmbr.00199-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
SUMMARY: Deazaguanine modifications play multifaceted roles in the molecular biology of DNA and tRNA, shaping diverse yet essential biological processes, including the nuanced fine-tuning of translation efficiency and the intricate modulation of codon-anticodon interactions. Beyond their roles in translation, deazaguanine modifications contribute to cellular stress resistance, self-nonself discrimination mechanisms, and host evasion defenses, directly modulating the adaptability of living organisms. Deazaguanine moieties extend beyond nucleic acid modifications, manifesting in the structural diversity of biologically active natural products. Their roles in fundamental cellular processes and their presence in biologically active natural products underscore their versatility and pivotal contributions to the intricate web of molecular interactions within living organisms. Here, we discuss the current understanding of the biosynthesis and multifaceted functions of deazaguanines, shedding light on their diverse and dynamic roles in the molecular landscape of life.
Affiliation(s)
- Valérie de Crécy-Lagard
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, USA
  - University of Florida Genetics Institute, Gainesville, Florida, USA
- Geoffrey Hutinet
  - Department of Biology, Haverford College, Haverford, Pennsylvania, USA
- Yifeng Yuan
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, USA
- Rémi Zallot
  - Department of Life Sciences, Manchester Metropolitan University, Manchester, United Kingdom
- Marc G. Chevrette
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, USA
- Marshall Jaroch
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, USA
- Samia Quaiyum
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, USA
- Steven Bruner
  - Department of Chemistry, University of Florida, Gainesville, Florida, USA
4
Michael-Pitschaze T, Cohen N, Ofer D, Hoshen Y, Linial M. Detecting anomalous proteins using deep representations. NAR Genom Bioinform 2024; 6:lqae021. [PMID: 38486884 PMCID: PMC10939404 DOI: 10.1093/nargab/lqae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 11/17/2023] [Accepted: 02/23/2024] [Indexed: 03/17/2024]
Abstract
Many advances in biomedicine can be attributed to identifying unusual proteins and genes. Many of these proteins' unique properties were discovered by manual inspection, which is becoming infeasible at the scale of modern protein datasets. Here, we propose to tackle this challenge using anomaly detection methods that automatically identify unexpected properties. We adopt a state-of-the-art anomaly detection paradigm from computer vision, to highlight unusual proteins. We generate meaningful representations without labeled inputs, using pretrained deep neural network models. We apply these protein language models (pLM) to detect anomalies in function, phylogenetic families, and segmentation tasks. We compute protein anomaly scores to highlight human prion-like proteins, distinguish viral proteins from their host proteome, and mark non-classical ion/metal binding proteins and enzymes. Other tasks concern segmentation of protein sequences into folded and unstructured regions. We provide candidates for rare functionality (e.g. prion proteins). Additionally, we show the anomaly score is useful in 3D folding-related segmentation. Our novel method shows improved performance over strong baselines and has objectively high performance across a variety of tasks. We conclude that the combination of pLM and anomaly detection techniques is a valid method for discovering a range of global and local protein characteristics.
Affiliation(s)
- Tomer Michael-Pitschaze
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Niv Cohen
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Dan Ofer
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Yedid Hoshen
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Michal Linial
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
5
Reed CJ, Denise R, Hourihan J, Babor J, Jaroch M, Martinelli M, Hutinet G, de Crécy-Lagard V. Beyond Blast: Enabling Microbiologists to Better Extract Literature, Taxonomic Distributions and Gene Neighborhood Information for Protein Families. bioRxiv 2024:2023.05.03.539116. [PMID: 37205517 PMCID: PMC10187207 DOI: 10.1101/2023.05.03.539116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Capturing the published corpus of information on all members of a given protein family should be an essential step in any study focusing on specific members of that said family. Using a previously gathered dataset of more than 280 references mentioning a member of the DUF34 (NIF3/Ngg1-interacting Factor 3), we evaluated the efficiency of different databases and search tools, and devised a workflow that experimentalists can use to capture the most published information on members of a protein family in the least amount of time. To complement this workflow, web-based platforms allowing for the exploration of protein family members across sequenced genomes or for the analysis of gene neighborhood information were reviewed for their versatility and ease of use. Recommendations that can be used for experimentalist users, as well as educators, are provided and integrated within a customized, publicly accessible Wiki.
Affiliation(s)
- Colbie J. Reed
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Rémi Denise
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Jacob Hourihan
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Jill Babor
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Marshall Jaroch
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Maria Martinelli
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Geoffrey Hutinet
  - Department of Biology, Haverford College, 370 Lancaster Avenue, Haverford, PA 19041, USA
- Valérie de Crécy-Lagard
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
  - Department of Biology, Haverford College, 370 Lancaster Avenue, Haverford, PA 19041, USA
  - University of Florida Genetics Institute, Gainesville, FL 32610, USA
6
Knoshaug EP, Sun P, Nag A, Nguyen H, Mattoon EM, Zhang N, Liu J, Chen C, Cheng J, Zhang R, St. John P, Umen J. Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis. Plant Direct 2023; 7:e527. [PMID: 38044962 PMCID: PMC10690477 DOI: 10.1002/pld3.527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 08/03/2023] [Accepted: 08/11/2023] [Indexed: 12/05/2023]
Abstract
The rapid accumulation of sequenced plant genomes in the past decade has outpaced the still difficult problem of genome-wide protein-coding gene annotation. A substantial fraction of protein-coding genes in all plant genomes are poorly annotated or unannotated and remain functionally uncharacterized. We identified unannotated proteins in three model organisms representing distinct branches of the green lineage (Viridiplantae): Arabidopsis thaliana (eudicot), Setaria viridis (monocot), and Chlamydomonas reinhardtii (Chlorophyte alga). Using similarity searching, we identified a subset of unannotated proteins that were conserved between these species and defined them as Deep Green proteins. Bioinformatic, genomic, and structural predictions were performed to begin classifying Deep Green genes and proteins. Compared to whole proteomes for each species, the Deep Green set was enriched for proteins with predicted chloroplast targeting signals predictive of photosynthetic or plastid functions, a result that was consistent with enrichment for daylight phase diurnal expression patterning. Structural predictions using AlphaFold and comparisons to known structures showed that a significant proportion of Deep Green proteins may possess novel folds. Though only available for three organisms, the Deep Green genes and proteins provide a starting resource of high-value targets for further investigation of potentially new protein structures and functions conserved across the green lineage.
Affiliation(s)
- Eric P. Knoshaug
  - Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
- Peipei Sun
  - Donald Danforth Plant Science Center, St. Louis, MO, USA
- Ambarish Nag
  - Computational Sciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
- Huong Nguyen
  - Donald Danforth Plant Science Center, St. Louis, MO, USA
  - Institute of Genomics for Crop Abiotic Stress Tolerance, Department of Plant and Soil Science, Texas Tech University, Lubbock, Texas, USA
- Erin M. Mattoon
  - Donald Danforth Plant Science Center, St. Louis, MO, USA
  - Plant and Microbial Biosciences Program, Division of Biology and Biomedical Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Jian Liu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Chen Chen
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Ru Zhang
  - Donald Danforth Plant Science Center, St. Louis, MO, USA
- Peter St. John
  - Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado, USA
- James Umen
  - Donald Danforth Plant Science Center, St. Louis, MO, USA
7
Bou-Nader C, Pecqueur L, de Crécy-Lagard V, Hamdane D. Integrative Approach to Probe Alternative Redox Mechanisms in RNA Modifications. Acc Chem Res 2023; 56:3142-3152. [PMID: 37916403 PMCID: PMC10999249 DOI: 10.1021/acs.accounts.3c00418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
RNA modifications found in most RNAs, particularly in tRNAs and rRNAs, reveal an abundance of chemical alterations of nucleotides. Over 150 distinct RNA modifications are known, emphasizing a remarkable diversity of chemical moieties in RNA molecules. These modifications play pivotal roles in RNA maturation, structural integrity, and the fidelity and efficiency of translation processes. The catalysts responsible for these modifications are RNA-modifying enzymes that use a striking array of chemistries to directly influence the chemical landscape of RNA. This diversity is further underscored by instances where the same modification is introduced by distinct enzymes that use unique catalytic mechanisms and cofactors across different domains of life. This phenomenon of convergent evolution highlights the biological importance of RNA modification and the vast potential within the chemical repertoire for nucleotide alteration. While shared RNA modifications can hint at conserved enzymatic pathways, a major bottleneck is to identify alternative routes within species that possess a modified RNA but are devoid of known RNA-modifying enzymes. To address this challenge, a combination of bioinformatic and experimental strategies proves invaluable in pinpointing new genes responsible for RNA modifications. This integrative approach not only unveils new chemical insights but also serves as a wellspring of inspiration for biocatalytic applications and drug design. In this Account, we present how comparative genomics and genome mining, combined with biomimetic synthetic chemistry, biochemistry, and anaerobic crystallography, can be judiciously implemented to address unprecedented and alternative chemical mechanisms in the world of RNA modification. 
We illustrate these integrative methodologies through the study of tRNA and rRNA modifications, dihydrouridine, 5-methyluridine, queuosine, 8-methyladenosine, 5-carboxymethylamino-methyluridine, or 5-taurinomethyluridine, each dependent on a diverse array of redox chemistries, often involving organic compounds, organometallic complexes, and metal coenzymes. We explore how vast genome and tRNA databases empower comparative genomic analyses and enable the identification of novel genes that govern RNA modification. Subsequently, we describe how the isolation of a stable reaction intermediate can guide the synthesis of a biomimetic to unveil new enzymatic pathways. We then discuss the usefulness of a biochemical "shunt" strategy to study catalytic mechanisms and to directly visualize reactive intermediates bound within active sites. While we primarily focus on various RNA-modifying enzymes studied in our laboratory, with a particular emphasis on the discovery of a SAM-independent methylation mechanism, the strategies and rationale presented herein are broadly applicable for the identification of new enzymes and the elucidation of their intricate chemistries. This Account offers a comprehensive glimpse into the evolving landscape of RNA modification research and highlights the pivotal role of integrated approaches to identify novel enzymatic pathways.
Affiliation(s)
- Charles Bou-Nader
  - Laboratoire de Chimie des Processus Biologiques, CNRS-UMR 8229, Collège De France, Université Pierre et Marie Curie, 11 place Marcelin Berthelot, 75231 Paris Cedex 05, France
- Ludovic Pecqueur
  - Laboratoire de Chimie des Processus Biologiques, CNRS-UMR 8229, Collège De France, Université Pierre et Marie Curie, 11 place Marcelin Berthelot, 75231 Paris Cedex 05, France
- Valérie de Crécy-Lagard
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, 32611, USA
  - University of Florida, Genetics Institute, Gainesville, Florida, 32610, USA
- Djemel Hamdane
  - Laboratoire de Chimie des Processus Biologiques, CNRS-UMR 8229, Collège De France, Université Pierre et Marie Curie, 11 place Marcelin Berthelot, 75231 Paris Cedex 05, France
8
Pathira Kankanamge LS, Ruffner LA, Touch MM, Pina M, Beuning PJ, Ondrechen MJ. Functional annotation of haloacid dehalogenase superfamily structural genomics proteins. Biochem J 2023; 480:1553-1569. [PMID: 37747786 DOI: 10.1042/bcj20230057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 09/20/2023] [Accepted: 09/25/2023] [Indexed: 09/26/2023]
Abstract
Haloacid dehalogenases (HAD) are members of a large superfamily that includes many Structural Genomics proteins with poorly characterized functionality. This superfamily consists of multiple types of enzymes that can act as sugar phosphatases, haloacid dehalogenases, phosphonoacetaldehyde hydrolases, ATPases, or phosphate monoesterases. Here, we report on predicted functional annotations and experimental testing by direct biochemical assay for Structural Genomics proteins from the HAD superfamily. To characterize the functions of HAD superfamily members, nine representative HAD proteins and 21 structural genomics proteins are analyzed. Using techniques based on computed chemical and electrostatic properties of individual amino acids, the functions of five structural genomics proteins from the HAD superfamily are predicted and validated by biochemical assays. A dehalogenase-like hydrolase, RSc1362 (Uniprot Q8XZN3, PDB 3UMB) is predicted to be a dehalogenase and dehalogenase activity is confirmed experimentally. Four proteins predicted to be sugar phosphatases are characterized as follows: a sugar phosphatase from Thermophilus volcanium (Uniprot Q978Y6) with trehalose-6-phosphate phosphatase and fructose-6-phosphate phosphatase activity; haloacid dehalogenase-like hydrolase from Bacteroides thetaiotaomicron (Uniprot Q8A2F3; PDB 3NIW) with fructose-6-phosphate phosphatase and sucrose-6-phosphate phosphatase activity; putative phosphatase from Eubacterium rectale (Uniprot D0VWU2; PDB 3DAO) as a sucrose-6-phosphate phosphatase; and hypothetical protein from Geobacillus kaustophilus (Uniprot Q5L139; PDB 2PQ0) as a fructose-6-phosphate phosphatase. Most of these sugar phosphatases showed some substrate promiscuity.
Affiliation(s)
- Lydia A Ruffner
  - Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, U.S.A
- Mong Mary Touch
  - Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, U.S.A
- Manuel Pina
  - Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, U.S.A
- Penny J Beuning
  - Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, U.S.A
- Mary Jo Ondrechen
  - Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, U.S.A
9
Sajid S, Mashkoor M, Jørgensen MG, Christensen LP, Hansen PR, Franzyk H, Mirza O, Prabhala BK. The Y-ome Conundrum: Insights into Uncharacterized Genes and Approaches for Functional Annotation. Mol Cell Biochem 2023:10.1007/s11010-023-04827-8. [PMID: 37610616 DOI: 10.1007/s11010-023-04827-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 08/09/2023] [Indexed: 08/24/2023]
Abstract
The ever-increasing availability of genome sequencing data has revealed a substantial number of uncharacterized genes without known functions across various organisms. The first comprehensive genome sequencing of E. coli K12 revealed that more than 50% of its open reading frames corresponded to transcripts with no known functions. The group of protein-coding genes without a functional description and/or a recognized pathway, beginning with the letter "Y", is classified as the "y-ome". Several efforts have been made to elucidate the functions of these genes and to recognize their role in biological processes. This review provides a brief update on various strategies employed when studying the y-ome, such as high-throughput experimental approaches, comparative omics, metabolic engineering, gene expression analysis, and data integration techniques. Additionally, we highlight recent advancements in functional annotation methods, including the use of machine learning, network analysis, and functional genomics approaches. Novel approaches are required to produce more precise functional annotations across the genome to reduce the number of genes with unknown functions.
Affiliation(s)
- Salvia Sajid
  - Department of Drug Design and Pharmacology, University of Copenhagen, Universitetsparken 2, 2100, Copenhagen Ø, Denmark
  - Department of Physics, Chemistry, and Pharmacy, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Maliha Mashkoor
  - Department of Surgery, Center for Surgical Sciences, Zealand University Hospital, Lykkebækvej 1, 4600, Køge, Denmark
- Mikkel Girke Jørgensen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Lars Porskjær Christensen
  - Department of Physics, Chemistry, and Pharmacy, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Paul Robert Hansen
  - Department of Drug Design and Pharmacology, University of Copenhagen, Universitetsparken 2, 2100, Copenhagen Ø, Denmark
- Henrik Franzyk
  - Department of Drug Design and Pharmacology, University of Copenhagen, Universitetsparken 2, 2100, Copenhagen Ø, Denmark
- Osman Mirza
  - Department of Drug Design and Pharmacology, University of Copenhagen, Universitetsparken 2, 2100, Copenhagen Ø, Denmark
- Bala Krishna Prabhala
  - Department of Physics, Chemistry, and Pharmacy, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
10
Gu X, Cao Z, Zhao L, Seswita-Zilda D, Zhang Q, Fu L, Li J. Metagenomic Insights Reveal the Microbial Diversity and Associated Algal-Polysaccharide-Degrading Enzymes on the Surface of Red Algae among Remote Regions. Int J Mol Sci 2023; 24:11019. [PMID: 37446198 DOI: 10.3390/ijms241311019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 06/27/2023] [Accepted: 06/28/2023] [Indexed: 07/15/2023]
Abstract
Macroalgae and macroalgae-associated bacteria together constitute the most efficient metabolic cycling system in the ocean. Their interactions, especially the responses of macroalgae-associated bacteria communities to algae in different geographical locations, are mostly unknown. In this study, metagenomics was used to analyze the microbial diversity and associated algal-polysaccharide-degrading enzymes on the surface of red algae among three remote regions. There were significant differences in the macroalgae-associated bacteria community composition and diversity among the different regions. At the phylum level, Proteobacteria, Bacteroidetes, and Actinobacteria had a significantly high relative abundance among the regions. From the perspective of species diversity, samples from China had the highest macroalgae-associated bacteria diversity, followed by those from Antarctica and Indonesia. In addition, in the functional prediction of the bacterial community, genes associated with amino acid metabolism, carbohydrate metabolism, energy metabolism, metabolism of cofactors and vitamins, and membrane transport had a high relative abundance. Canonical correspondence analysis and redundancy analysis of environmental factors showed that, without considering algae species and composition, pH and temperature were the main environmental factors affecting bacterial community structure. Furthermore, there were significant differences in algal-polysaccharide-degrading enzymes among the regions. Samples from China and Antarctica had high abundances of algal-polysaccharide-degrading enzymes, while those from Indonesia had extremely low abundances. The environmental differences between these three regions may impose a strong geographic differentiation regarding the biodiversity of algal microbiomes and their expressed enzyme genes. 
This work expands our knowledge of algal microbial ecology, and contributes to an in-depth study of their metabolic characteristics, ecological functions, and applications.
Affiliation(s)
- Xiaoqian Gu
  - Key Lab of Ecological Environment Science and Technology, First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
  - CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
- Zhe Cao
  - Key Lab of Ecological Environment Science and Technology, First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
- Luying Zhao
  - Key Lab of Ecological Environment Science and Technology, First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
- Dewi Seswita-Zilda
  - Research Center for Deep Sea, Earth Sciences and Maritime Research Organization, National Research and Innovation Agency (BRIN), Jl. Pasir Putih Raya, Pademangan, Jakarta 14430, Indonesia
- Qian Zhang
  - Key Lab of Ecological Environment Science and Technology, First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
- Liping Fu
  - Key Lab of Ecological Environment Science and Technology, First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
- Jiang Li
  - Key Lab of Ecological Environment Science and Technology, First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
11
Zeng X, Kahng A, Xue L, Mahamid J, Chang YW, Xu M. High-throughput cryo-ET structural pattern mining by unsupervised deep iterative subtomogram clustering. Proc Natl Acad Sci U S A 2023; 120:e2213149120. [PMID: 37027429 PMCID: PMC10104553 DOI: 10.1073/pnas.2213149120] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/01/2022] [Accepted: 02/24/2023] [Indexed: 04/08/2023]
Abstract
Cryoelectron tomography directly visualizes heterogeneous macromolecular structures in their native and complex cellular environments. However, existing computer-assisted structure sorting approaches are low throughput or inherently limited due to their dependency on available templates and manual labels. Here, we introduce a high-throughput template-and-label-free deep learning approach, Deep Iterative Subtomogram Clustering Approach (DISCA), that automatically detects subsets of homogeneous structures by learning and modeling 3D structural features and their distributions. Evaluation on five experimental cryo-ET datasets shows that an unsupervised deep learning based method can detect diverse structures with a wide range of molecular sizes. This unsupervised detection paves the way for systematic unbiased recognition of macromolecular complexes in situ.
Affiliation(s)
- Xiangrui Zeng
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
| | - Anson Kahng
- Computer Science Department, University of Rochester, Rochester, NY 14620
| | - Liang Xue
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Faculty of Biosciences, Collaboration for joint PhD degree between European Molecular Biology Laboratory and Heidelberg University, Heidelberg 69117, Germany
| | - Julia Mahamid
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Yi-Wei Chang
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
| | - Min Xu
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
|
12
|
Makarova KS, Wolf YI, Koonin EV. In Silico Approaches for Prediction of Anti-CRISPR Proteins. J Mol Biol 2023; 435:168036. [PMID: 36868398 PMCID: PMC10073340 DOI: 10.1016/j.jmb.2023.168036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 02/18/2023] [Accepted: 02/23/2023] [Indexed: 03/05/2023]
Abstract
Numerous viruses infecting bacteria and archaea encode CRISPR-Cas system inhibitors, known as anti-CRISPR proteins (Acr). The Acrs typically are highly specific for particular CRISPR variants, resulting in remarkable sequence and structural diversity and complicating accurate prediction and identification of Acrs. In addition to their intrinsic interest for understanding the coevolution of defense and counter-defense systems in prokaryotes, Acrs could be natural, potent on-off switches for CRISPR-based biotechnological tools, so their discovery, characterization and application are of major importance. Here we discuss the computational approaches for Acr prediction. Due to the enormous diversity and likely multiple origins of the Acrs, sequence similarity searches are of limited use. However, multiple features of protein and gene organization have been successfully harnessed to this end including small protein size and distinct amino acid compositions of the Acrs, association of acr genes in virus genomes with genes encoding helix-turn-helix proteins that regulate Acr expression (Acr-associated proteins, Aca), and presence of self-targeting CRISPR spacers in bacterial and archaeal genomes containing Acr-encoding proviruses. Productive approaches for Acr prediction also involve genome comparison of closely related viruses, of which one is resistant and the other one is sensitive to a particular CRISPR variant, and "guilt by association" whereby genes adjacent to a homolog of a known Aca are identified as candidate Acrs. The distinctive features of Acrs are employed for Acr prediction both by developing dedicated search algorithms and through machine learning. New approaches will be needed to identify novel types of Acrs that are likely to exist.
Affiliation(s)
- Kira S Makarova
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, USA.
| | - Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, USA
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, USA
|
13
|
Denise R, Babor J, Gerlt JA, de Crécy-Lagard V. Pyridoxal 5'-phosphate synthesis and salvage in Bacteria and Archaea: predicting pathway variant distributions and holes. Microb Genom 2023; 9:mgen000926. [PMID: 36729913 PMCID: PMC9997740 DOI: 10.1099/mgen.0.000926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023] Open
Abstract
Pyridoxal 5’-phosphate or PLP is a cofactor derived from B6 vitamers and essential for growth in all known organisms. PLP synthesis and salvage pathways are well characterized in a few model species even though key components, such as the vitamin B6 transporters, are still to be identified in many organisms including the model bacteria Escherichia coli or Bacillus subtilis. Using a comparative genomic approach, PLP synthesis and salvage pathways were predicted in 5840 bacterial and archaeal species with complete genomes. The distribution of the two known de novo biosynthesis pathways and previously identified cases of non-orthologous displacements were surveyed in the process. This analysis revealed that several PLP de novo pathway genes remain to be identified in many organisms, either because sequence similarity alone cannot be used to discriminate among several homologous candidates or due to non-orthologous displacements. Candidates for some of these pathway holes were identified using published TnSeq data, but many remain. We find that ~10 % of the analysed organisms rely on salvage but further analyses will be required to identify potential transporters. This work is a starting point to model the exchanges of B6 vitamers in communities, predict the sensitivity of a given organism to drugs targeting PLP synthesis enzymes, and identify numerous gaps in knowledge that will need to be tackled in the years to come.
Affiliation(s)
- Rémi Denise
- Department of Microbiology and Cell Sciences, Gainesville, USA; Present address: APC Microbiome Ireland, University College Cork, Cork, Ireland
| | - Jill Babor
- Department of Microbiology and Cell Sciences, Gainesville, USA
| | | | - Valérie de Crécy-Lagard
- Department of Microbiology and Cell Sciences, Gainesville, USA; Genetics Institute, University of Florida, Gainesville, FL 32611, USA
|
14
|
Thirumalai A, Ganapathy Raman P, Jayavelu T, Subramanian R. Bridging the gap between maleate hydratase, citraconase and isopropylmalate isomerase: Insights into the single broad-specific enzyme. Enzyme Microb Technol 2023; 162:110140. [DOI: 10.1016/j.enzmictec.2022.110140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 09/23/2022] [Accepted: 10/08/2022] [Indexed: 11/13/2022]
|
15
|
Brown DC, Aggarwal N, Turner RJ. Exploration of the presence and abundance of multidrug resistance efflux genes in oil and gas environments. Microbiology (Reading) 2022; 168. [PMID: 36190831 DOI: 10.1099/mic.0.001248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2022]
Abstract
As sequencing technology improves and the cost of metagenome sequencing decreases, the number of sequenced environments increases. These metagenomes provide a wealth of data in the form of annotated and unannotated genes. The role of multidrug resistance efflux pumps (MDREPs) is the removal of antibiotics, biocides and toxic metabolites created during aromatic hydrocarbon metabolism. Due to their naturally occurring role in hydrocarbon metabolism and their role in biocide tolerance, MDREP genes are of particular importance for the protection of pipeline assets. However, the heterogeneity of MDREP genes creates a challenge during annotation and detection. Here we use a selection of primers designed to target MDREPs in six pure species and apply them to publicly available metagenomes associated with oil and gas environments. Using in silico PCR with relaxed primer binding conditions we probed the metagenomes of a shale reservoir, a heavy oil tailings pond, a civil wastewater treatment, two marine sediments exposed to hydrocarbons following the Deepwater Horizon oil spill and a non-exposed marine sediment to assess the presence and abundance of MDREP genes. Through relaxed primer binding conditions during in silico PCR, the prevalence of MDREPs was determined. The percentage of nucleotide sequences identified by the MDREP primers was partially augmented by exposure to hydrocarbons in marine sediment and in shale reservoir compared to hydrocarbon-free marine sediments while tailings ponds and wastewater had the highest percentages. We believe this approach lays the groundwork for a supervised method of identifying poorly conserved genes within metagenomes.
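The relaxed primer binding idea in this abstract can be sketched in a few lines. This is not the authors' pipeline: it assumes that "relaxed" means tolerating a fixed number of mismatches per primer (Hamming distance), and the helper names `find_sites` and `in_silico_pcr`, the `max_mismatches` and `max_len` parameters, and the toy sequences are all hypothetical.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(template: str, primer: str, max_mismatches: int) -> List[int]:
    """Start positions where the primer matches within the mismatch budget."""
    k = len(primer)
    return [
        i
        for i in range(len(template) - k + 1)
        if hamming(template[i:i + k], primer) <= max_mismatches
    ]


def in_silico_pcr(
    template: str,
    fwd: str,
    rev: str,
    max_mismatches: int = 2,  # assumed meaning of "relaxed binding"
    max_len: int = 2000,      # assumed upper bound on amplicon length
) -> List[Tuple[int, int]]:
    """Return (start, end) coordinates of predicted amplicons on the given strand."""
    fwd_sites = find_sites(template, fwd, max_mismatches)
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement on the strand we were given.
    rev_sites = find_sites(template, revcomp(rev), max_mismatches)
    amplicons = []
    for f in fwd_sites:
        for r in rev_sites:
            end = r + len(rev)
            if r >= f + len(fwd) and end - f <= max_len:
                amplicons.append((f, end))
    return amplicons
```

Raising `max_mismatches` trades specificity for sensitivity: more divergent homologs are recovered, at the cost of more spurious binding sites.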
|
16
|
Exploring Bacterial Attributes That Underpin Symbiont Life in the Monogastric Gut. Appl Environ Microbiol 2022; 88:e0112822. [PMID: 36036591 PMCID: PMC9499014 DOI: 10.1128/aem.01128-22] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022] Open
Abstract
The large bowel of monogastric animals, such as that of humans, is home to a microbial community (microbiota) composed of a diversity of mostly bacterial species. Interrelationships between the microbiota as an entity and the host are complex and lifelong and are characteristic of a symbiosis. The relationships may be disrupted in association with disease, resulting in dysbiosis. Modifications to the microbiota to correct dysbiosis require knowledge of the fundamental mechanisms by which symbionts inhabit the gut. This review aims to summarize aspects of niche fitness of bacterial species that inhabit the monogastric gut, especially of humans, and to indicate the research path by which progress can be made in exploring bacterial attributes that underpin symbiont life in the gut.
|
17
|
Rhee KY, Jansen RS, Grundner C. Activity-based annotation: the emergence of systems biochemistry. Trends Biochem Sci 2022; 47:785-794. [PMID: 35430135 PMCID: PMC9378515 DOI: 10.1016/j.tibs.2022.03.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/11/2022] [Revised: 03/10/2022] [Accepted: 03/22/2022] [Indexed: 01/21/2023]
Abstract
Current tools to annotate protein function have failed to keep pace with the speed of DNA sequencing and exponentially growing number of proteins of unknown function (PUFs). A major contributing factor to this mismatch is the historical lack of high-throughput methods to experimentally determine biochemical activity. Activity-based methods, such as activity-based metabolite and protein profiling, are emerging as new approaches for unbiased, global, biochemical annotation of protein function. In this review, we highlight recent experimental, activity-based approaches that offer new opportunities to determine protein function in a biologically agnostic and systems-level manner.
Affiliation(s)
- Kyu Y Rhee
- Department of Medicine, Weill Cornell Medical College, New York, NY, USA.
| | - Robert S Jansen
- Department of Microbiology, Radboud University, Nijmegen, The Netherlands.
| | - Christoph Grundner
- Center for Global Infectious Disease Research, Seattle Children's Research Institute, Seattle, WA, USA; Department of Global Health, University of Washington, Seattle, WA, USA; Department of Pediatrics, University of Washington, Seattle, WA, USA.
|
18
|
Yu M. Computational analysis on two putative mitochondrial protein-coding genes from the Emydura subglobosa genome: A functional annotation approach. PLoS One 2022; 17:e0268031. [PMID: 35981005 PMCID: PMC9387794 DOI: 10.1371/journal.pone.0268031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 04/21/2022] [Indexed: 11/19/2022] Open
Abstract
Rapid advancements in automated genomic technologies have uncovered many unique findings about the turtle genome and its associated features including olfactory gene expansions and duplications of toll-like receptors. However, despite the advent of large-scale sequencing, assembly, and annotation, about 40-50% of genes in eukaryotic genomes are left without functional annotation, severely limiting our knowledge of the biological information of genes. Additionally, these automated processes are prone to errors since draft genomes consist of several disconnected scaffolds whose order is unknown; erroneous draft assemblies may also be contaminated with foreign sequences and propagate to cause errors in annotation. Many of these automated annotations are thus incomplete and inaccurate, highlighting the need for functional annotation to link gene sequences to biological identity. In this study, we have functionally annotated two genes of the red-bellied short-neck turtle (Emydura subglobosa), a member of the relatively understudied pleurodire lineage of turtles. We improved upon initial ab initio gene predictions through homology-based evidence and generated refined consensus gene models. Through functional, localization, and structural analyses of the predicted proteins, we discovered conserved putative genes encoding mitochondrial proteins that play a role in C21-steroid hormone biosynthetic processes and fatty acid catabolism, both of which are distantly related by the tricarboxylic acid (TCA) cycle and share similar metabolic pathways. Overall, these findings further our knowledge about the genetic features underlying turtle physiology, morphology, and longevity, which have important implications for the treatment of human diseases and evolutionary studies.
Affiliation(s)
- Megan Yu
- Department of Molecular, Cell & Developmental Biology, University of California–Los Angeles, Los Angeles, California, United States of America
|
19
|
de Crécy-Lagard V, Amorin de Hegedus R, Arighi C, Babor J, Bateman A, Blaby I, Blaby-Haas C, Bridge AJ, Burley SK, Cleveland S, Colwell LJ, Conesa A, Dallago C, Danchin A, de Waard A, Deutschbauer A, Dias R, Ding Y, Fang G, Friedberg I, Gerlt J, Goldford J, Gorelik M, Gyori BM, Henry C, Hutinet G, Jaroch M, Karp PD, Kondratova L, Lu Z, Marchler-Bauer A, Martin MJ, McWhite C, Moghe GD, Monaghan P, Morgat A, Mungall CJ, Natale DA, Nelson WC, O’Donoghue S, Orengo C, O’Toole KH, Radivojac P, Reed C, Roberts RJ, Rodionov D, Rodionova IA, Rudolf JD, Saleh L, Sheynkman G, Thibaud-Nissen F, Thomas PD, Uetz P, Vallenet D, Carter EW, Weigele PR, Wood V, Wood-Charlson EM, Xu J. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022; 2022:6663924. [PMID: 35961013 PMCID: PMC9374478 DOI: 10.1093/database/baac062] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/07/2022] [Revised: 06/28/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022]
Abstract
Over the last 25 years, biology has entered the genomic era and is becoming a science of ‘big data’. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3–4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.
Affiliation(s)
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Cecilia Arighi
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA
| | - Jill Babor
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Ian Blaby
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Crysten Blaby-Haas
- Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
| | - Alan J Bridge
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Stephen K Burley
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Stacey Cleveland
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Lucy J Colwell
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Ana Conesa
- Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain
| | - Christian Dallago
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany
| | - Antoine Danchin
- School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China
| | - Anita de Waard
- Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA
| | - Adam Deutschbauer
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Raquel Dias
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Yousong Ding
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA
| | - Gang Fang
- NYU-Shanghai, Shanghai 200120, China
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
| | - John Gerlt
- Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Joshua Goldford
- Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Mark Gorelik
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Benjamin M Gyori
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Christopher Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
| | - Geoffrey Hutinet
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Marshall Jaroch
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Peter D Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA
| | | | - Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Aron Marchler-Bauer
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Maria-Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Claire McWhite
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
| | - Gaurav D Moghe
- Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Paul Monaghan
- Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA
| | - Anne Morgat
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Christopher J Mungall
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Darren A Natale
- Georgetown University Medical Center, Washington, DC 20007, USA
| | - William C Nelson
- Biological Sciences Division, Pacific Northwest National Laboratories, Richland, WA 99354, USA
| | - Seán O’Donoghue
- School of Biotechnology and Biomolecular Sciences, University of NSW, Sydney, NSW 2052, Australia
| | - Christine Orengo
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | | | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Colbie Reed
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Dmitri Rodionov
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
| | - Irina A Rodionova
- Department of Bioengineering, Division of Engineering, University of California at San Diego, La Jolla, CA 92093-0412, USA
| | - Jeffrey D Rudolf
- Department of Chemistry, University of Florida, Gainesville, FL 32611, USA
| | - Lana Saleh
- New England Biolabs, Ipswich, MA 01938, USA
| | - Gloria Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
| | - Peter Uetz
- Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - David Vallenet
- LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS, Evry 91057, France
| | - Erica Watson Carter
- Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
| | | | - Valerie Wood
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| | - Elisha M Wood-Charlson
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jin Xu
- Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
|
20
|
Cho KT, Sen TZ, Andorf CM. Predicting Tissue-Specific mRNA and Protein Abundance in Maize: A Machine Learning Approach. Front Artif Intell 2022; 5:830170. [PMID: 35719692 PMCID: PMC9204276 DOI: 10.3389/frai.2022.830170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 04/26/2022] [Indexed: 11/13/2022] Open
Abstract
Machine learning and modeling approaches have been used to classify protein sequences for a broad set of tasks including predicting protein function, structure, expression, and localization. Some recent studies have successfully predicted whether a given gene is expressed as mRNA or even translated to proteins potentially, but given that not all genes are expressed in every condition and tissue, the challenge remains to predict condition-specific expression. To address this gap, we developed a machine learning approach to predict tissue-specific gene expression across 23 different tissues in maize, solely based on DNA promoter and protein sequences. For class labels, we defined high and low expression levels for mRNA and protein abundance and optimized classifiers by systematically exploring various methods and combinations of k-mer sequences in a two-phase approach. In the first phase, we developed Markov model classifiers for each tissue and built a feature vector based on the predictions. In the second phase, the feature vector was used as an input to a Bayesian network for final classification. Our results show that these methods can achieve high classification accuracy of up to 95% for predicting gene expression for individual tissues. By relying on sequence alone, our method works in settings where costly experimental data are unavailable and reveals useful insights into the functional, evolutionary, and regulatory characteristics of genes.
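The first phase described in this abstract (one Markov model classifier per tissue, with the predictions collected into a feature vector) can be illustrated with a minimal sketch. It is not the authors' implementation: the `MarkovModel` class, the log-likelihood-ratio score, the add-one smoothing, and the tissue names are assumptions made for illustration. The second-phase Bayesian network is not shown; any downstream classifier that accepts a numeric vector could stand in for it.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


class MarkovModel:
    """Fixed-order Markov chain over sequence symbols with additive smoothing."""

    def __init__(self, order: int = 2, alphabet: str = "ACGT", pseudocount: float = 1.0):
        self.order = order
        self.alphabet = alphabet
        self.pseudocount = pseudocount
        # counts[context][next_symbol] -> number of observed transitions
        self.counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def fit(self, sequences: List[str]) -> "MarkovModel":
        for seq in sequences:
            for i in range(self.order, len(seq)):
                self.counts[seq[i - self.order:i]][seq[i]] += 1
        return self

    def log_likelihood(self, seq: str) -> float:
        total = 0.0
        for i in range(self.order, len(seq)):
            ctx_counts = self.counts.get(seq[i - self.order:i], {})
            denom = sum(ctx_counts.values()) + self.pseudocount * len(self.alphabet)
            total += math.log((ctx_counts.get(seq[i], 0) + self.pseudocount) / denom)
        return total


def llr_score(seq: str, high: MarkovModel, low: MarkovModel) -> float:
    """Positive when the sequence looks more like the 'high expression' class."""
    return high.log_likelihood(seq) - low.log_likelihood(seq)


def feature_vector(
    seq: str, models_by_tissue: Dict[str, Tuple[MarkovModel, MarkovModel]]
) -> List[float]:
    """One score per tissue, in sorted tissue-name order (phase-one output)."""
    return [
        llr_score(seq, high, low)
        for _, (high, low) in sorted(models_by_tissue.items())
    ]
```

Each entry of the resulting vector summarizes how one tissue-specific model pair scores the sequence, so the second-stage classifier can learn from patterns across tissues without seeing the raw k-mers.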
Affiliation(s)
- Kyoung Tak Cho
- Department of Computer Science, Iowa State University, Ames, IA, United States
| | - Taner Z. Sen
- USDA-ARS, Crop Improvement and Genetics Research Unit, Albany, CA, United States
| | - Carson M. Andorf
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA, United States
- *Correspondence: Carson M. Andorf
|
21
|
Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, Kottmann R, Mitchell A, Sánchez P, Siren K, Steinegger M, Gloeckner FO, Fernàndez-Guerra A. Unifying the known and unknown microbial coding sequence space. eLife 2022; 11:67667. [PMID: 35356891 PMCID: PMC9132574 DOI: 10.7554/elife.67667] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 02/18/2021] [Accepted: 03/30/2022] [Indexed: 12/02/2022] Open
Abstract
Genes of unknown function are among the biggest challenges in molecular biology, especially in microbial systems, where 40–60% of the predicted genes are unknown. Despite previous attempts, systematic approaches to include the unknown fraction into analytical workflows are still lacking. Here, we present a conceptual framework, its translation into the computational workflow AGNOSTOS and a demonstration on how we can bridge the known-unknown gap in genomes and metagenomes. By analyzing 415,971,742 genes predicted from 1749 metagenomes and 28,941 bacterial and archaeal genomes, we quantify the extent of the unknown fraction, its diversity, and its relevance across multiple organisms and environments. The unknown sequence space is exceptionally diverse, phylogenetically more conserved than the known fraction and predominantly taxonomically restricted at the species level. From the 71 M genes identified to be of unknown function, we compiled a collection of 283,874 lineage-specific genes of unknown function for Cand. Patescibacteria (also known as Candidate Phyla Radiation, CPR), which provides a significant resource to expand our understanding of their unusual biology. Finally, by identifying a target gene of unknown function for antibiotic resistance, we demonstrate how we can enable the generation of hypotheses that can be used to augment experimental data.
It is estimated that scientists do not know what half of microbial genes actually do. When these genes are discovered in microorganisms grown in the lab or found in environmental samples, it is not possible to identify what their roles are. Many of these genes are excluded from further analyses for these reasons, meaning that the study of microbial genes tends to be limited to genes that have already been described. These limitations hinder research into microbiology, because information from newly discovered genes cannot be integrated to better understand how these organisms work.
Experiments to understand what role these genes have in the microorganisms are labor-intensive, so new analytical strategies are needed. To do this, Vanni et al. developed a new framework to categorize genes with unknown roles, and a computational workflow to integrate them into traditional analyses. When this approach was applied to over 400 million microbial genes (both with known and unknown roles), it showed that the share of genes with unknown functions is only about 30 per cent, smaller than previously thought. The analysis also showed that these genes are very diverse, revealing a huge space for future research and potential applications. Combining their approach with experimental data, Vanni et al. were able to identify a gene with a previously unknown purpose that could be involved in antibiotic resistance. This system could be useful for other scientists studying microorganisms to get a more complete view of microbial systems. In future, it may also be used to analyze the genetics of other organisms, such as plants and animals.
Affiliation(s)
- Chiara Vanni
- Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany
| | | | - Silvia G Acinas
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar-CMIMA (CSIC), Barcelona, Spain
| | - Albert Barberán
- Department of Environmental Science, University of Arizona, Tucson, United States
| | - Pier Luigi Buttigieg
- Helmholtz Centre for Polar and Marine Research, Alfred Wegener Institute, Bremerhaven, Germany
| | - Emilio O Casamayor
- Center for Advanced Studies of Blanes CEAB-CSIC, Spanish Council for Research, Blanes, Spain
| | - Tom O Delmont
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Paris, France
| | - Carlos M Duarte
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - A Murat Eren
- Department of Medicine, University of Chicago, Chicago, United States
| | - Robert D Finn
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, United Kingdom
| | - Renzo Kottmann
- Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany
| | - Alex Mitchell
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, United Kingdom
| | - Pablo Sánchez
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar-CMIMA (CSIC), Barcelona, Spain
| | - Kimmo Siren
- Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
| | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
| | - Frank Oliver Gloeckner
- MARUM, Helmholtz Center for Polar and Marine Research, University of Bremen, Bremen, Germany
| | - Antonio Fernàndez-Guerra
- Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
|
22
|
A deep learning model to detect novel pore-forming proteins. Sci Rep 2022; 12:2013. [PMID: 35132124 PMCID: PMC8821639 DOI: 10.1038/s41598-022-05970-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2021] [Accepted: 01/12/2022] [Indexed: 11/09/2022] Open
Abstract
Many pore-forming proteins originating from pathogenic bacteria are toxic against agricultural pests. They are the key ingredients in several pesticidal products for agricultural use, including transgenic crops. There is an urgent need to identify novel pore-forming proteins to combat development of resistance in pests to existing products, and to develop products that are effective against a broader range of pests. Existing computational methodologies to search for these proteins rely on sequence homology-based approaches. These approaches are based on similarities between protein sequences, and thus are limited in their usefulness for discovering novel proteins. In this paper, we outline a novel deep learning model trained on pore-forming proteins from the public domain. We compare different ways of encoding protein information during training, and contrast it with traditional approaches. We show that our model is capable of identifying known pore formers with no sequence similarity to the proteins used to train the model, and therefore holds promise for identifying novel pore formers.
|
23
|
Takihara H, Miura N, Aoki-Kinoshita KF, Okuda S. Functional glyco-metagenomics elucidates the role of glycan-related genes in environments. BMC Bioinformatics 2021; 22:505. [PMID: 34663219 PMCID: PMC8522060 DOI: 10.1186/s12859-021-04425-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2021] [Accepted: 10/04/2021] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND: Glycan-related genes play a fundamental role in various processes for energy acquisition and homeostasis maintenance while adapting to the environment in which the organism exists; however, their role in the microbiome in the environment is unclear. METHODS: Sequence alignment was performed between known glycan-related genes and complete genomes of microorganisms, and optimal parameters for identifying glycan-related genes were determined based on the alignments. Using the constructed scheme (> 90% identity and > 25 aa alignment length), glycan-related genes in various environments were identified from 198 different metagenome datasets. RESULTS: We identified 86.73 million glycan-related genes from the metagenome data. Among the 12 environments classified in this study, the percentage of glycan-related genes was high in the human-associated environment, suggesting that these environments utilize glycan metabolism better than other environments. On the other hand, the relative abundances of both glycoside hydrolases and glycosyltransferases surprisingly had a coverage of over 80% in all the environments. These glycoside hydrolases and glycosyltransferases were classified into two groups: (1) general enzyme families identified in various environments and (2) specific enzymes found only in certain environments. The general enzyme families were mostly from genes involved in monosaccharide metabolism, and most of the specific enzymes were polysaccharide-degrading enzymes. CONCLUSION: These findings suggest that environmental microorganisms could change the composition of their glycan-related genes to adapt the processes involved in acquiring energy from glycans in their environments. Our functional glyco-metagenomics approach has made it possible to clarify the relationship between the environment and genes from the perspective of carbohydrates, and the existence of glycan-related genes that exist specifically in the environment.
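The identification scheme quoted in the METHODS of this abstract (identity above 90% and alignment length above 25 aa) amounts to a two-threshold filter over alignment hits. A minimal sketch follows; the `Hit` record, its field names, and the function names are hypothetical and are not taken from the paper's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment hit (hypothetical record layout, for illustration only)."""
    query: str         # metagenome gene identifier
    subject: str       # known glycan-related gene identifier
    identity: float    # percent identity, 0-100
    aln_length: int    # alignment length in amino acids


def passes_scheme(hit, min_identity=90.0, min_length=25):
    """Return True when the hit exceeds both thresholds (strict inequalities)."""
    return hit.identity > min_identity and hit.aln_length > min_length


def filter_hits(hits, min_identity=90.0, min_length=25):
    """Keep only the hits that satisfy the identity and length thresholds."""
    return [h for h in hits if passes_scheme(h, min_identity, min_length)]
```

The strict inequalities mirror the "> 90%" and "> 25 aa" wording of the abstract; whether the original analysis treated boundary values this way is an assumption.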
Affiliation(s)
- Hayato Takihara
- Division of Bioinformatics, Niigata University Graduate School of Medical and Dental Sciences, 1-757 Asahimachi-dori, Chuo-ku, Niigata, 951-8510, Japan
| | - Nobuaki Miura
- Division of Bioinformatics, Niigata University Graduate School of Medical and Dental Sciences, 1-757 Asahimachi-dori, Chuo-ku, Niigata, 951-8510, Japan
| | - Kiyoko F Aoki-Kinoshita
- Glycan and Life Systems Integration Center, Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, 192-8577, Japan
| | - Shujiro Okuda
- Division of Bioinformatics, Niigata University Graduate School of Medical and Dental Sciences, 1-757 Asahimachi-dori, Chuo-ku, Niigata, 951-8510, Japan.
|
24
|
Zeng Y, Howe G, Yi K, Zeng X, Zhang J, Chang YW, Xu M. Unsupervised domain alignment based open set structural recognition of macromolecules captured by cryo-electron tomography. Proceedings. International Conference on Image Processing 2021; 2021:106-110. [PMID: 35350462 PMCID: PMC8959888 DOI: 10.1109/icip42928.2021.9506205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Cellular cryo-Electron Tomography (cryo-ET) provides three-dimensional views of structural and spatial information of various macromolecules in cells in a near-native state. Subtomogram classification is a key step for recognizing and differentiating these macromolecular structures. In recent years, deep learning methods have been developed for high-throughput subtomogram classification tasks; however, conventional supervised deep learning methods cannot recognize macromolecular structural classes that do not exist in the training data. This imposes a major weakness since most native macromolecular structures in cells are unknown and consequently, cannot be included in the training data. Therefore, open set learning which can recognize unknown macromolecular structures is necessary for boosting the power of automatic subtomogram classification. In this paper, we propose a method called Margin-based Loss for Unsupervised Domain Alignment (MLUDA) for open set recognition problems where only a few categories of interest are shared between cross-domain data. Through extensive experiments, we demonstrate that MLUDA performs well at cross-domain open-set classification on both public datasets and medical imaging datasets. So our method is of practical importance.
Affiliation(s)
- Yuchen Zeng
- Computational Biology Department, Carnegie Mellon University, United States
| | - Gregory Howe
- Computational Biology Department, Carnegie Mellon University, United States
| | - Kai Yi
- King Abdullah University of Science and Technology, Saudi Arabia
| | - Xiangrui Zeng
- Computational Biology Department, Carnegie Mellon University, United States
| | - Jing Zhang
- Department of Computer Science, University of California Irvine, United States
| | - Yi-Wei Chang
- Perelman School of Medicine, University of Pennsylvania, United States
| | - Min Xu
- Computational Biology Department, Carnegie Mellon University, United States
|
25
|
Discovery and mining of enzymes from the human gut microbiome. Trends Biotechnol 2021; 40:240-254. [PMID: 34304905 DOI: 10.1016/j.tibtech.2021.06.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/28/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 12/19/2022]
Abstract
Advances in technological and bioinformatics approaches have led to the generation of a plethora of human gut metagenomic datasets. Metabolomics has also provided substantial data regarding the small metabolites produced and modified by the microbiota. Comparatively, the microbial enzymes mediating the transformation of metabolites have not been intensively investigated. Here, we discuss the recent efforts and technologies used for discovering and mining enzymes from the human gut microbiota. The wealth of knowledge on metabolites, reactions, genome sequences, and structures of proteins, may drive the development of strategies for enzyme mining. Ongoing efforts to annotate gut microbiota enzymes will explain catalytic mechanisms that may guide the clinical applications of the gut microbiome for diagnostic and therapeutic purposes.
|
26
|
de Rond T, Asay JE, Moore BS. Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase. Nat Chem Biol 2021; 17:794-799. [PMID: 34099916 PMCID: PMC8238888 DOI: 10.1038/s41589-021-00808-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/27/2020] [Accepted: 04/29/2021] [Indexed: 02/04/2023]
Abstract
Multidomain enzymes orchestrate two or more catalytic activities to carry out metabolic transformations with increased control and speed. Here, we report the design and development of a genome-mining approach for targeted discovery of biochemical transformations through the analysis of co-occurring enzyme domains (CO-ED) in a single protein. CO-ED was designed to identify unannotated multifunctional enzymes for functional characterization and discovery based on the premise that linked enzyme domains have evolved to function collaboratively. Guided by CO-ED, we targeted an unannotated predicted ThiF-nitroreductase di-domain enzyme found in more than 50 proteobacteria. Through heterologous expression and biochemical reconstitution, we discovered a series of natural products containing the rare oxazolone heterocycle and characterized their biosynthesis. Notably, we identified the di-domain enzyme as an oxazolone synthetase, validating CO-ED-guided genome mining as a methodology with potential broad utility for both the discovery of unusual enzymatic transformations and the functional annotation of multidomain enzymes.
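The core idea of co-occurring enzyme domains in a single protein can be illustrated by counting how often each pair of domains appears together across a set of annotated proteins. The sketch below is not the authors' CO-ED implementation; the input layout (protein identifier mapped to a list of domain names) and the function name are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurring_domain_pairs(protein_domains):
    """Count, for each unordered domain pair, the proteins containing both.

    protein_domains maps a protein identifier to the list of domain names
    annotated on that protein (hypothetical input format).
    """
    counts = Counter()
    for domains in protein_domains.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(domains)), 2):
            counts[pair] += 1
    return counts
```

Pairs that recur across many proteins but lack a functional annotation would be the kind of candidate the abstract describes, such as the ThiF-nitroreductase di-domain enzyme.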
Affiliation(s)
- Tristan de Rond
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA 92093
| | - Julia E. Asay
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA 92093
| | - Bradley S. Moore
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA 92093,Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093
|
27
|
Key amino acid residues in homoserine-acetyltransferase from M. tuberculosis give insight into the evolution of MetX family of enzymes - HAT, SAT and HST. Biochimie 2021; 189:13-25. [PMID: 34090964 DOI: 10.1016/j.biochi.2021.05.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2021] [Revised: 05/23/2021] [Accepted: 05/30/2021] [Indexed: 11/22/2022]
Abstract
Multiple sequence alignment of homoserine-acetyltransferases, serine-acetyltransferases and homoserine-succinyltransferases shows that they all belong to the MetX family, having evolved from a common ancestor by conserving the catalytic site and substrate-binding residues. The discrimination in substrate selection arises from the presence of substrate-specific residues lining the substrate-binding pocket. Mutation of Ala59 and Gly62 to Gly and Pro, respectively, in homoserine-acetyltransferase from M. tuberculosis resulted in a serine-acetyltransferase-like enzyme, as it acetylated both l-homoserine and l-serine. Mutating homoserine-acetyltransferase from M. tuberculosis at position 322, where Leu was converted to Arg, resulted in succinylation over acetylation of l-homoserine. Our studies establish the importance of the substrate-binding residues in determining the type of activity possessed by the MetX family, despite all of them having the same catalytic triad Ser-Asp-His. Hence key residues at the substrate-binding pocket dictate whether the given enzyme shows predominant transferase or hydrolase activity.
|
28
|
Poudel S, Cope AL, O'Dell KB, Guss AM, Seo H, Trinh CT, Hettich RL. Identification and characterization of proteins of unknown function (PUFs) in Clostridium thermocellum DSM 1313 strains as potential genetic engineering targets. Biotechnology for Biofuels 2021; 14:116. [PMID: 33971924 PMCID: PMC8112048 DOI: 10.1186/s13068-021-01964-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/29/2020] [Accepted: 04/26/2021] [Indexed: 05/13/2023]
Abstract
BACKGROUND: Mass spectrometry-based proteomics can identify and quantify thousands of proteins from individual microbial species, but a significant percentage of these proteins are unannotated and hence classified as proteins of unknown function (PUFs). Due to the difficulty in extracting meaningful metabolic information, PUFs are often overlooked or discarded during data analysis, even though they might be critically important in functional activities, in particular for metabolic engineering research. RESULTS: We optimized and employed a pipeline integrating various "guilt-by-association" (GBA) metrics, including differential expression and co-expression analyses of high-throughput mass spectrometry proteome data and phylogenetic coevolution analysis, and sequence homology-based approaches to determine putative functions for PUFs in Clostridium thermocellum. Our various analyses provided putative functional information for over 95% of the PUFs detected by mass spectrometry in a wild-type and/or an engineered strain of C. thermocellum. In particular, we validated a predicted acyltransferase PUF (WP_003519433.1) with functional activity towards 2-phenylethyl alcohol, consistent with our GBA and sequence homology-based predictions. CONCLUSIONS: This work demonstrates the value of leveraging sequence homology-based annotations with empirical evidence based on the concept of GBA to broadly predict putative functions for PUFs, opening avenues to further interrogation via targeted experiments.
Affiliation(s)
- Suresh Poudel
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- The Center for Bioenergy Innovation at Oak Ridge National Laboratory, Oak Ridge, TN, USA
- The Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, USA
| | - Alexander L Cope
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- The Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, USA
| | - Kaela B O'Dell
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- The Center for Bioenergy Innovation at Oak Ridge National Laboratory, Oak Ridge, TN, USA
- The Bredesen Center, University of Tennessee, Knoxville, TN, USA
| | - Adam M Guss
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- The Bredesen Center, University of Tennessee, Knoxville, TN, USA
| | - Hyeongmin Seo
- The Center for Bioenergy Innovation at Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Department of Chemical and Biomolecular Engineering, University of Tennessee, Knoxville, TN, USA
| | - Cong T Trinh
- The Center for Bioenergy Innovation at Oak Ridge National Laboratory, Oak Ridge, TN, USA
- The Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, USA
- The Bredesen Center, University of Tennessee, Knoxville, TN, USA
- Department of Chemical and Biomolecular Engineering, University of Tennessee, Knoxville, TN, USA
| | - Robert L Hettich
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA.
|
29
|
Black KA, Duan L, Mandyoli L, Selbach BP, Xu W, Ehrt S, Sacchettini JC, Rhee KY. Metabolic bifunctionality of Rv0812 couples folate and peptidoglycan biosynthesis in Mycobacterium tuberculosis. J Exp Med 2021; 218:212052. [PMID: 33950161 PMCID: PMC8105722 DOI: 10.1084/jem.20191957] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/14/2019] [Revised: 02/16/2021] [Accepted: 03/30/2021] [Indexed: 11/04/2022] Open
Abstract
Comparative sequence analysis has enabled the annotation of millions of genes from organisms across the evolutionary tree. However, this approach has inherently biased the annotation of phylogenetically ubiquitous, rather than species-specific, functions. The ecologically unusual pathogen Mycobacterium tuberculosis (Mtb) has evolved in humans as its sole reservoir and emerged as the leading bacterial cause of death worldwide. However, the physiological factors that define Mtb’s pathogenicity are poorly understood. Here, we report the structure and function of Rv0812, a protein that is required for optimal in vitro fitness and bears homology to two distinct enzymes. Despite diversification of related orthologues into biochemically distinct enzyme families, rv0812 encodes a single active site with aminodeoxychorismate lyase and D-amino acid transaminase activities. The mutual exclusivity of substrate occupancy in this active site mediates coupling between nucleic acid and cell wall biosynthesis, prioritizing PABA over D-Ala/D-Glu biosynthesis. This bifunctionality reveals a novel, enzymatically encoded fail-safe mechanism that may help Mtb and other bacteria couple replication and division.
Affiliation(s)
| | - Lijun Duan
- Texas A&M University, College Station, TX
|
30
|
Bergès C, Cahoreau E, Millard P, Enjalbert B, Dinclaux M, Heuillet M, Kulyk H, Gales L, Butin N, Chazalviel M, Palama T, Guionnet M, Sokol S, Peyriga L, Bellvert F, Heux S, Portais JC. Exploring the Glucose Fluxotype of the E. coli y-ome Using High-Resolution Fluxomics. Metabolites 2021; 11:metabo11050271. [PMID: 33926117 PMCID: PMC8145925 DOI: 10.3390/metabo11050271] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/24/2021] [Revised: 04/16/2021] [Accepted: 04/23/2021] [Indexed: 01/26/2023] Open
Abstract
We have developed a robust workflow to measure high-resolution fluxotypes (metabolic flux phenotypes) for large strain libraries under fully controlled growth conditions. This was achieved by optimizing and automating the whole high-throughput fluxomics process and integrating all relevant software tools. This workflow allowed us to obtain highly detailed maps of carbon fluxes in the central carbon metabolism in a fully automated manner. It was applied to investigate the glucose fluxotypes of 180 Escherichia coli strains deleted for y-genes. Since the products of these y-genes potentially play a role in a variety of metabolic processes, the experiments were designed to be agnostic as to their potential metabolic impact. The obtained data highlight the robustness of E. coli’s central metabolism to y-gene deletion. For two y-genes, deletion resulted in significant changes in carbon and energy fluxes, demonstrating the involvement of the corresponding y-gene products in metabolic function or regulation. This work also introduces novel metrics to measure the actual scope and quality of high-throughput fluxomics investigations.
Affiliation(s)
- Cécilia Bergès
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Edern Cahoreau
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Pierre Millard
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Brice Enjalbert
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Mickael Dinclaux
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Maud Heuillet
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Hanna Kulyk
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Lara Gales
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Noémie Butin
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
- RESTORE, Université de Toulouse, Inserm U1031, CNRS 5070, UPS, EFS, 31100 Toulouse, France
| | - Maxime Chazalviel
- Toxalim (Research Centre in Food Toxicology), UMR1331, Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
| | - Tony Palama
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Matthieu Guionnet
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Sergueï Sokol
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Lindsay Peyriga
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Floriant Bellvert
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
| | - Stéphanie Heux
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Jean-Charles Portais
- Toulouse Biotechnology Institute (TBI), Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- MetaToul-MetaboHUB, National Infrastructure of Metabolomics & Fluxomics (ANR-11-INBS-0010), 31077 Toulouse, France
- RESTORE, Université de Toulouse, Inserm U1031, CNRS 5070, UPS, EFS, 31100 Toulouse, France
|
31
|
Current knowledge and recent advances in understanding metabolism of the model cyanobacterium Synechocystis sp. PCC 6803. Biosci Rep 2021; 40:222317. [PMID: 32149336 PMCID: PMC7133116 DOI: 10.1042/bsr20193325] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 01/27/2020] [Revised: 03/05/2020] [Accepted: 03/06/2020] [Indexed: 02/06/2023] Open
Abstract
Cyanobacteria are key organisms in the global ecosystem, useful models for studying metabolic and physiological processes conserved in photosynthetic organisms, and potential renewable platforms for production of chemicals. Characterizing cyanobacterial metabolism and physiology is key to understanding their role in the environment and unlocking their potential for biotechnology applications. Many aspects of cyanobacterial biology differ from heterotrophic bacteria. For example, most cyanobacteria incorporate a series of internal thylakoid membranes where both oxygenic photosynthesis and respiration occur, while CO2 fixation takes place in specialized compartments termed carboxysomes. In this review, we provide a comprehensive summary of our knowledge on cyanobacterial physiology and the pathways in Synechocystis sp. PCC 6803 (Synechocystis) involved in biosynthesis of sugar-based metabolites, amino acids, nucleotides, lipids, cofactors, vitamins, isoprenoids, pigments and cell wall components, in addition to the proteins involved in metabolite transport. While some pathways are conserved between model cyanobacteria, such as Synechocystis, and model heterotrophic bacteria like Escherichia coli, many enzymes and/or pathways involved in the biosynthesis of key metabolites in cyanobacteria have not been completely characterized. These include pathways required for biosynthesis of chorismate and membrane lipids, nucleotides, several amino acids, vitamins and cofactors, and isoprenoids such as plastoquinone, carotenoids, and tocopherols. Moreover, our understanding of photorespiration, lipopolysaccharide assembly and transport, and degradation of lipids, sucrose, most vitamins and amino acids, and haem, is incomplete. We discuss tools that may aid our understanding of cyanobacterial metabolism, notably CyanoSource, a barcoded library of targeted Synechocystis mutants, which will significantly accelerate characterization of individual proteins.
|
32
|
Bioinformatic and experimental evidence for suicidal and catalytic plant THI4s. Biochem J 2020; 477:2055-2069. [PMID: 32441748 DOI: 10.1042/bcj20200297] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/22/2020] [Revised: 05/20/2020] [Accepted: 05/21/2020] [Indexed: 12/14/2022]
Abstract
Like fungi and some prokaryotes, plants use a thiazole synthase (THI4) to make the thiazole precursor of thiamin. Fungal THI4s are suicide enzymes that destroy an essential active-site Cys residue to obtain the sulfur atom needed for thiazole formation. In contrast, certain prokaryotic THI4s have no active-site Cys, use sulfide as sulfur donor, and are truly catalytic. The presence of a conserved active-site Cys in plant THI4s and other indirect evidence implies that they are suicidal. To confirm this, we complemented the Arabidopsis tz-1 mutant, which lacks THI4 activity, with a His-tagged Arabidopsis THI4 construct. LC-MS analysis of tryptic peptides of the THI4 extracted from leaves showed that the active-site Cys was predominantly in desulfurated form, consistent with THI4 having a suicide mechanism in planta. Unexpectedly, transcriptome data mining and deep proteome profiling showed that barley, wheat, and oat have both a widely expressed canonical THI4 with an active-site Cys, and a THI4-like paralog (non-Cys THI4) that has no active-site Cys and is the major type of THI4 in developing grains. Transcriptomic evidence also indicated that barley, wheat, and oat grains synthesize thiamin de novo, implying that their non-Cys THI4s synthesize thiazole. Structure modeling supported this inference, as did demonstration that non-Cys THI4s have significant capacity to complement thiazole auxotrophy in Escherichia coli. There is thus a prima facie case that non-Cys cereal THI4s, like their prokaryotic counterparts, are catalytic thiazole synthases. Bioenergetic calculations show that, relative to suicide THI4s, such enzymes could save substantial energy during the grain-filling period.
|
33
|
Kornfuehrer T, Romanowski S, de Crécy-Lagard V, Hanson AD, Eustáquio AS. An Enzyme Containing the Conserved Domain of Unknown Function DUF62 Acts as a Stereoselective (R_S,S_C)-S-Adenosylmethionine Hydrolase. Chembiochem 2020; 21:3495-3499. [PMID: 32776704 DOI: 10.1002/cbic.202000349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2020] [Revised: 08/07/2020] [Indexed: 11/09/2022]
Abstract
Homochirality is a signature of biological systems. The essential and ubiquitous cofactor S-adenosyl-l-methionine (SAM) is synthesized in cells from adenosine triphosphate and l-methionine to yield exclusively the (S,S)-SAM diastereomer. (S,S)-SAM plays a crucial role as the primary methyl donor in transmethylation reactions important to the development and homeostasis of all organisms from bacteria to humans. However, (S,S)-SAM slowly racemizes at the sulfonium center to yield the inactive (R,S)-SAM, which can inhibit methyltransferases. Control of SAM homochirality has been shown to involve homocysteine S-methyltransferases in plants, insects, worms, yeast, and in ∼18 % of bacteria. Herein, we show that a recombinant protein containing a domain of unknown function (DUF62) from the actinomycete bacterium Salinispora tropica functions as a stereoselective (R,S)-SAM hydrolase (adenosine-forming). DUF62 proteins are encoded in the genomes of 21 % of bacteria and 42 % of archaea and potentially represent a novel mechanism to remediate SAM damage.
Affiliation(s)
- Taylor Kornfuehrer
- Department of Pharmaceutical Sciences and, Center for Biomolecular Sciences, University of Illinois, Chicago, IL 60607, USA
| | - Sean Romanowski
- Department of Pharmaceutical Sciences and, Center for Biomolecular Sciences, University of Illinois, Chicago, IL 60607, USA
| | - Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science and, Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Andrew D Hanson
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
| | - Alessandra S Eustáquio
- Department of Pharmaceutical Sciences and, Center for Biomolecular Sciences, University of Illinois, Chicago, IL 60607, USA
|
34
|
Thioproline formation as a driver of formaldehyde toxicity in Escherichia coli. Biochem J 2020; 477:1745-1757. [PMID: 32301498 DOI: 10.1042/bcj20200198] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/05/2020] [Revised: 04/14/2020] [Accepted: 04/17/2020] [Indexed: 12/14/2022]
Abstract
Formaldehyde (HCHO) is a reactive carbonyl compound that formylates and cross-links proteins, DNA, and small molecules. It is of specific concern as a toxic intermediate in the design of engineered pathways involving methanol oxidation or formate reduction. The interest in engineering these pathways is not, however, matched by engineering-relevant information on precisely why HCHO is toxic or on what damage-control mechanisms cells deploy to manage HCHO toxicity. The only well-defined mechanism for managing HCHO toxicity is formaldehyde dehydrogenase-mediated oxidation to formate, which is counterproductive if HCHO is a desired pathway intermediate. We therefore sought alternative HCHO damage-control mechanisms via comparative genomic analysis. This analysis associated homologs of the Escherichia coli pepP gene with HCHO-related one-carbon metabolism. Furthermore, deleting pepP increased the sensitivity of E. coli to supplied HCHO but not other carbonyl compounds. PepP is a proline aminopeptidase that cleaves peptides of the general formula X-Pro-Y, yielding X + Pro-Y. HCHO is known to react spontaneously with cysteine to form the close proline analog thioproline (thiazolidine-4-carboxylate), which is incorporated into proteins and hence into proteolytic peptides. We therefore hypothesized that certain thioproline-containing peptides are toxic and that PepP cleaves these aberrant peptides. Supporting this hypothesis, PepP cleaved the model peptide Ala-thioproline-Ala as efficiently as Ala-Pro-Ala in vitro and in vivo, and deleting pepP increased sensitivity to supplied thioproline. Our data thus (i) provide biochemical genetic evidence that thioproline formation contributes substantially to HCHO toxicity and (ii) make PepP a candidate damage-control enzyme for engineered pathways having HCHO as an intermediate.
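The cleavage rule stated in this abstract (X-Pro-Y yields X + Pro-Y, with thioproline accepted in place of Pro in the model peptide) can be written as a small string operation. This is only an illustrative sketch: the hyphen-separated peptide notation, the function name, and the requirement of at least three residues are assumptions, and real substrate specificity is more nuanced than this rule.

```python
def pepp_cleave(peptide):
    """Apply the X-Pro-Y -> X + Pro-Y rule to a hyphen-separated peptide.

    Returns a (released_residue, remaining_peptide) tuple, or None when the
    second residue is neither Pro nor thioproline or the peptide is too short.
    """
    residues = peptide.split("-")
    # Residues accepted at the second position, per the abstract's model peptides.
    accepted = {"Pro", "thioproline"}
    if len(residues) >= 3 and residues[1] in accepted:
        return residues[0], "-".join(residues[1:])
    return None
```

For the two model peptides named in the abstract, the function releases the N-terminal Ala and leaves Pro-Ala or thioproline-Ala, respectively.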
|
35
|
Liu Z, Feng J, Yu B, Ma Q, Liu B. The functional determinants in the organization of bacterial genomes. Brief Bioinform 2020; 22:5892344. [PMID: 32793986 DOI: 10.1093/bib/bbaa172] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/30/2020] [Revised: 06/30/2020] [Accepted: 07/07/2020] [Indexed: 12/13/2022] Open
Abstract
Bacterial genomes are now recognized as interacting intimately with cellular processes. Uncovering the organizational mechanisms of bacterial genomes has been a primary focus of researchers seeking to reveal potential cellular activities. Advances in both experimental techniques and computational models provide a tremendous opportunity for understanding these mechanisms, and various recent studies have explored the organization rules of bacterial genomes in relation to function. This review focuses mainly on the principles that shape the organization of bacterial genomes, both locally and globally. We first illustrate local structures such as operons/transcription units, which facilitate co-transcription and horizontal transfer of genes. We then clarify the constraints that globally shape bacterial genomes, such as metabolism, transcription and replication. Finally, we highlight challenges and opportunities to advance bacterial genomic studies and provide application perspectives of genome organization, including pathway hole assignment, genome assembly and understanding disease mechanisms.
Affiliation(s)
- Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology
- Qin Ma
- Department of Biomedical Informatics, the Ohio State University
|
36
|
Prifti E, Chevaleyre Y, Hanczar B, Belda E, Danchin A, Clément K, Zucker JD. Interpretable and accurate prediction models for metagenomics data. Gigascience 2020; 9:giaa010. [PMID: 32150601 PMCID: PMC7062144 DOI: 10.1093/gigascience/giaa010] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 05/17/2019] [Revised: 09/12/2019] [Accepted: 01/27/2020] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND: Microbiome biomarker discovery for patient diagnosis, prognosis, and risk evaluation is attracting broad interest. Selected groups of microbial features provide signatures that characterize host disease states such as cancer or cardio-metabolic diseases. Yet, the current predictive models stemming from machine learning still behave as black boxes and seldom generalize well. Their interpretation is challenging for physicians and biologists, which makes them difficult to trust and use routinely in the physician-patient decision-making process. Novel methods that provide interpretability and biological insight are needed. Here, we introduce "predomics", an original machine learning approach inspired by microbial ecosystem interactions that is tailored for metagenomics data. It discovers accurate predictive signatures and provides unprecedented interpretability. The decision provided by the predictive model is based on a simple, yet powerful score computed by adding, subtracting, or dividing cumulative abundance of microbiome measurements. RESULTS: Tested on >100 datasets, we demonstrate that predomics models are simple and highly interpretable. Even with such simplicity, they are at least as accurate as state-of-the-art methods. The family of best models, discovered during the learning process, offers the ability to distil biological information and to decipher the predictability signatures of the studied condition. In a proof-of-concept experiment, we successfully predicted body corpulence and metabolic improvement after bariatric surgery using pre-surgery microbiome data. CONCLUSIONS: Predomics is a new algorithm that helps in providing reliable and trustworthy diagnostic decisions in the microbiome field. Predomics is in accord with societal and legal requirements that plead for an explainable artificial intelligence approach in the medical field.
Affiliation(s)
- Edi Prifti
- IRD, Sorbonne University, UMMISCO, 32 Avenue Henri Varagnat, F-93143 Bondy, France
- Institute of Cardiometabolism and Nutrition, ICAN, Integromics, 91 Boulevard de l'Hopital, F-75013, Paris, France
- Yann Chevaleyre
- Paris-Dauphine University, PSL Research University, CNRS, UMR 7243, LAMSADE, place du Mal. de Lattre de Tassigny, F-75016, Paris, France
- Blaise Hanczar
- IBISC, University Paris-Saclay, University Evry, Evry, 23 Boulevard de France, F-91034, France
- Eugeni Belda
- Institute of Cardiometabolism and Nutrition, ICAN, Integromics, 91 Boulevard de l'Hopital, F-75013, Paris, France
- Antoine Danchin
- Institut Cochin INSERM U1016−CNRS UMR8104−Université Paris Descartes, 24 Rue du Faubourg Saint-Jacques, F-75014, Paris, France
- Karine Clément
- Sorbonne University, INSERM, Nutrition and Obesities; Systemic Approach Research Unit (NutriOmics), 91 Boulevard de l'Hopital, F-75013, Paris, France
- Assistance Publique-Hôpitaux de Paris, Nutrition Department, CRNH Ile de France, Pitié-Salpêtrière Hospital, 91 Boulevard de l'Hopital, F-75013, Paris, France
- Jean-Daniel Zucker
- IRD, Sorbonne University, UMMISCO, 32 Avenue Henri Varagnat, F-93143 Bondy, France
- Institute of Cardiometabolism and Nutrition, ICAN, Integromics, 91 Boulevard de l'Hopital, F-75013, Paris, France
- Sorbonne University, INSERM, Nutrition and Obesities; Systemic Approach Research Unit (NutriOmics), 91 Boulevard de l'Hopital, F-75013, Paris, France
|
37
|
Wang PH, Fujishima K, Berhanu S, Kuruma Y, Jia TZ, Khusnutdinova AN, Yakunin AF, McGlynn SE. A Bifunctional Polyphosphate Kinase Driving the Regeneration of Nucleoside Triphosphate and Reconstituted Cell-Free Protein Synthesis. ACS Synth Biol 2020; 9:36-42. [PMID: 31829622 DOI: 10.1021/acssynbio.9b00456] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
Reconstituted cell-free protein synthesis systems (e.g., the PURE system) allow the expression of toxic proteins, hetero-oligomeric protein subunits, and proteins with noncanonical amino acids with high levels of homogeneity. In these systems, an artificial ATP/GTP regeneration system is required to drive protein synthesis, which is accomplished using three kinases and phosphocreatine. Here, we demonstrate the replacement of these three kinases with one bifunctional Cytophaga hutchinsonii polyphosphate kinase that phosphorylates nucleosides in an exchange reaction from polyphosphate. The optimized single-kinase system produced a final sfGFP concentration (∼530 μg/mL) beyond that of the three-kinase system (∼400 μg/mL), with a 5-fold faster mRNA translation rate in the first 90 min. The single-kinase system is also compatible with the expression of heat-sensitive firefly luciferase at 37 °C. Potentially, the single-kinase nucleoside triphosphate regeneration approach developed herein could expand future applications of cell-free protein synthesis systems and could be used to drive other biochemical processes in synthetic biology which require both ATP and GTP.
Affiliation(s)
- Po-Hsiang Wang
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, 152-8550, Japan
- Kosuke Fujishima
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, 152-8550, Japan
- Graduate School of Media and Governance, Keio University, Fujisawa, 108-8345, Japan
- Samuel Berhanu
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, 152-8550, Japan
- Yutetsu Kuruma
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, 152-8550, Japan
- JST, PRESTO, Saitama, 102-0076, Japan
- Tony Z. Jia
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, 152-8550, Japan
- Blue Marble Space Institute of Science, Seattle, Washington 98154, United States
- Anna N. Khusnutdinova
- Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario ON M5S, Canada
- Alexander F. Yakunin
- Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario ON M5S, Canada
- Centre for Environmental Biotechnology, Bangor University, Bangor, Wales LL57 2DG, United Kingdom
- Shawn E. McGlynn
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, 152-8550, Japan
- Blue Marble Space Institute of Science, Seattle, Washington 98154, United States
|
38
|
Taxonomic Distribution of Cytochrome P450 Monooxygenases (CYPs) among the Budding Yeasts (Sub-Phylum Saccharomycotina). Microorganisms 2019; 7:microorganisms7080247. [PMID: 31398949 PMCID: PMC6723986 DOI: 10.3390/microorganisms7080247] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/30/2019] [Revised: 08/06/2019] [Accepted: 08/07/2019] [Indexed: 12/14/2022] Open
Abstract
Cytochrome P450 monooxygenases (CYPs) are ubiquitous throughout the tree of life and play diverse roles in metabolism including the synthesis of secondary metabolites as well as the degradation of recalcitrant organic substrates. The genomes of budding yeasts (phylum Ascomycota, sub-phylum Saccharomycotina) typically contain fewer families of CYPs than filamentous fungi. There are currently five CYP families among budding yeasts with known function while at least another six CYP families with unknown function (“orphan CYPs”) have been described. The current study surveyed the genomes of 372 species of budding yeasts for CYP-encoding genes in order to determine the taxonomic distribution of individual CYP families across the sub-phylum as well as to identify novel CYP families. Families CYP51 and CYP61 (represented by the ergosterol biosynthetic genes ERG11 and ERG5, respectively) were essentially ubiquitous among the budding yeasts while families CYP52 (alkane/fatty acid hydroxylases) and CYP56 (N-formyl-l-tyrosine oxidase) displayed several instances of gene loss at the genus or family level. Phylogenetic analysis suggested that the three orphan families CYP5217, CYP5223 and CYP5252 diverged from a common ancestor gene following the origin of the budding yeast sub-phylum. The genomic survey also identified eight CYP families that had not previously been reported in budding yeasts.
|
39
|
Discovery of novel carbohydrate-active enzymes through the rational exploration of the protein sequences space. Proc Natl Acad Sci U S A 2019; 116:6063-6068. [PMID: 30850540 PMCID: PMC6442616 DOI: 10.1073/pnas.1815791116] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Indexed: 01/16/2023] Open
Abstract
Over the last two decades, the number of gene/protein sequences gleaned from sequencing projects of individual genomes and environmental DNA has grown exponentially. Only a tiny fraction of these predicted proteins has been experimentally characterized, and the function of most proteins remains hypothetical or only predicted based on sequence similarity. Despite the development of postgenomic methods, such as transcriptomics, proteomics, and metabolomics, the assignment of function to protein sequences remains one of the main challenges in modern biology. As in all classes of proteins, the growing number of predicted carbohydrate-active enzymes (CAZymes) has not been accompanied by a systematic and accurate attribution of function. Taking advantage of the CAZy database, which groups CAZymes into families and subfamilies based on amino acid similarities, we recombinantly produced 564 proteins selected from subfamilies without any biochemically characterized representatives, from distant relatives of characterized enzymes and from nonclassified proteins that show little similarity with known CAZymes. Screening these proteins for activity on a wide collection of carbohydrate substrates led to the discovery of 13 CAZyme families (two of which were also discovered by others during the course of our work), revealed three previously unknown substrate specificities, and assigned a function to 25 subfamilies.
|
40
|
Sun J, Sigler CL, Beaudoin GAW, Joshi J, Patterson JA, Cho KH, Ralat MA, Gregory JF, Clark DG, Deng Z, Colquhoun TA, Hanson AD. Parts-Prospecting for a High-Efficiency Thiamin Thiazole Biosynthesis Pathway. PLANT PHYSIOLOGY 2019; 179:958-968. [PMID: 30337452 PMCID: PMC6393793 DOI: 10.1104/pp.18.01085] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/31/2018] [Accepted: 10/10/2018] [Indexed: 05/04/2023]
Abstract
Plants synthesize the thiazole precursor of thiamin (cThz-P) via THIAMIN4 (THI4), a suicide enzyme that mediates one reaction cycle and must then be degraded and resynthesized. It has been estimated that this THI4 turnover consumes 2% to 12% of the maintenance energy budget and that installing an energy-efficient alternative pathway could substantially increase crop yield potential. Available data point to two natural alternatives to the suicidal THI4 pathway: (i) nonsuicidal prokaryotic THI4s that lack the active-site Cys residue on which suicide activity depends, and (ii) an uncharacterized thiazole synthesis pathway in flowers of the tropical arum lily Caladium bicolor that enables production and emission of large amounts of the cThz-P analog 4-methyl-5-vinylthiazole (MVT). We used functional complementation of an Escherichia coli ΔthiG strain to identify a nonsuicidal bacterial THI4 (from Thermovibrio ammonificans) that can function in conditions like those in plant cells. We explored whether C. bicolor synthesizes MVT de novo via a novel route, via a suicidal or a nonsuicidal THI4, or by catabolizing thiamin. Analysis of developmental changes in MVT emission, extractable MVT, thiamin level, and THI4 expression indicated that C. bicolor flowers make MVT de novo via a massively expressed THI4 and that thiamin is not involved. Functional complementation tests indicated that C. bicolor THI4, which has the active-site Cys needed to operate suicidally, may be capable of suicidal and - in hypoxic conditions - nonsuicidal operation. T. ammonificans and C. bicolor THI4s are thus candidate parts for rational redesign or directed evolution of efficient, nonsuicidal THI4s for use in crop improvement.
Affiliation(s)
- Jiayi Sun
- Horticultural Sciences Department, University of Florida, Gainesville, Florida 32611
- Cindy L Sigler
- Department of Environmental Horticulture, University of Florida, Gainesville, Florida 32611
- Jaya Joshi
- Horticultural Sciences Department, University of Florida, Gainesville, Florida 32611
- Jenelle A Patterson
- Horticultural Sciences Department, University of Florida, Gainesville, Florida 32611
- Keun H Cho
- Department of Environmental Horticulture, University of Florida, Gainesville, Florida 32611
- Maria A Ralat
- Department of Food Science and Human Nutrition, University of Florida, Gainesville, Florida 32611
- Jesse F Gregory
- Department of Food Science and Human Nutrition, University of Florida, Gainesville, Florida 32611
- David G Clark
- Department of Environmental Horticulture, University of Florida, Gainesville, Florida 32611
- Zhanao Deng
- Gulf Coast Research and Education Center, Department of Environmental Horticulture, University of Florida, Wimauma, Florida 33598
- Thomas A Colquhoun
- Department of Environmental Horticulture, University of Florida, Gainesville, Florida 32611
- Andrew D Hanson
- Horticultural Sciences Department, University of Florida, Gainesville, Florida 32611
|
41
|
Towards functional characterization of archaeal genomic dark matter. Biochem Soc Trans 2019; 47:389-398. [PMID: 30710061 PMCID: PMC6393860 DOI: 10.1042/bst20180560] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 11/26/2018] [Revised: 01/08/2019] [Accepted: 01/09/2019] [Indexed: 01/07/2023]
Abstract
A substantial fraction of archaeal genes, from ∼30% to as much as 80%, encode ‘hypothetical’ proteins or genomic ‘dark matter’. Archaeal genomes typically contain a higher fraction of dark matter compared with bacterial genomes, primarily because isolation and cultivation of most archaea in the laboratory, and accordingly, experimental characterization of archaeal genes, are difficult. In the present study, we present quantitative characteristics of the archaeal genomic dark matter and discuss comparative genomic approaches for functional prediction for ‘hypothetical’ proteins. We propose a list of top priority candidates for experimental characterization with a broad distribution among archaea and those that are characteristic of poorly studied major archaeal groups such as Thaumarchaea, DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota and Nanohaloarchaeota) and Asgard.
|
42
|
Griesemer M, Kimbrel JA, Zhou CE, Navid A, D'haeseleer P. Combining multiple functional annotation tools increases coverage of metabolic annotation. BMC Genomics 2018; 19:948. [PMID: 30567498 PMCID: PMC6299973 DOI: 10.1186/s12864-018-5221-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/20/2018] [Accepted: 11/05/2018] [Indexed: 12/15/2022] Open
Abstract
Background: Genome-scale metabolic modeling is a cornerstone of systems biology analysis of microbial organisms and communities, yet these genome-scale modeling efforts are invariably based on incomplete functional annotations. Annotated genomes typically contain 30–50% of genes without functional annotation, severely limiting our knowledge of the “parts lists” that the organisms have at their disposal. These incomplete annotations may be sufficient to derive a model of a core set of well-studied metabolic pathways that support growth in pure culture. However, pathways important for growth on unusual metabolites exchanged in complex microbial communities are often less understood, resulting in missing functional annotations in newly sequenced genomes. Results: Here, we present results on a comprehensive reannotation of 27 bacterial reference genomes, focusing on enzymes with EC numbers annotated by KEGG, RAST, EFICAz, and the BRENDA enzyme database, and on membrane transport annotations by TransportDB, KEGG and RAST. Our analysis shows that annotation using multiple tools can result in a drastically larger metabolic network reconstruction, adding on average 40% more EC numbers, 3–8 times more substrate-specific transporters, and 37% more metabolic genes. These results are even more pronounced for bacterial species that are phylogenetically distant from well-studied model organisms such as E. coli. Conclusions: Metabolic annotations are often incomplete and inconsistent. Combining multiple functional annotation tools can greatly improve genome coverage and metabolic network size, especially for non-model organisms and non-core pathways.
Affiliation(s)
- Marc Griesemer
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Jeffrey A Kimbrel
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Carol E Zhou
- Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Ali Navid
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Patrik D'haeseleer
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
|
43
|
Linder T. Phenotypical characterisation of a putative ω-amino acid transaminase in the yeast Scheffersomyces stipitis. Arch Microbiol 2018; 201:185-192. [PMID: 30519708 PMCID: PMC6514085 DOI: 10.1007/s00203-018-1608-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/19/2018] [Revised: 11/30/2018] [Accepted: 12/03/2018] [Indexed: 01/05/2023]
Abstract
Phylogenetic analysis of class III transaminases in the budding yeasts Lachancea kluyveri, Saccharomyces cerevisiae and Scheffersomyces stipitis identified a hitherto uncharacterised Sch. stipitis transaminase encoded by the PICST_54153 gene, which clustered with previously described γ-amino butyric acid (GABA) and β-alanine transaminases. Deletion of the PICST_54153 gene in Sch. stipitis resulted in a complete loss in the utilisation of β-alanine and β-ureidopropionic acid as nitrogen sources, while growth on 1,3-diaminopropane displayed a significant lag phase compared to the wild-type control. It was therefore concluded that the Sch. stipitis PICST_54153 gene likely encodes a β-alanine transaminase. However, minor growth defects when 1,4-diaminobutane or 1,5-diaminopentane was provided as the nitrogen source suggested that the Picst_54153 transaminase may also participate in the catabolism of other diamine-derived ω-amino acids. Unexpectedly, the ∆picst_54153 deletion mutant failed to grow on solid minimal medium in the presence of 5 mM β-alanine even if a preferred nitrogen source was provided.
Affiliation(s)
- Tomas Linder
- Department of Molecular Sciences, Swedish University of Agricultural Sciences, Box 7015, 750 07, Uppsala, Sweden
|
44
|
Molecular Factors of Hypochlorite Tolerance in the Hypersaline Archaeon Haloferax volcanii. Genes (Basel) 2018; 9:genes9110562. [PMID: 30463375 PMCID: PMC6267482 DOI: 10.3390/genes9110562] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/08/2018] [Revised: 11/07/2018] [Accepted: 11/13/2018] [Indexed: 12/17/2022] Open
Abstract
Halophilic archaea thrive in hypersaline conditions associated with desiccation, ultraviolet (UV) irradiation and redox active compounds, and thus are naturally tolerant to a variety of stresses. Here, we identified mutations that promote enhanced tolerance of halophilic archaea to redox-active compounds using Haloferax volcanii as a model organism. The strains were isolated from a library of random transposon mutants for growth on high doses of sodium hypochlorite (NaOCl), an agent that forms hypochlorous acid (HOCl) and other redox acid compounds common to aqueous environments of high concentrations of chloride. The transposon insertion site in each of twenty isolated clones was mapped using the following: (i) inverse nested two-step PCR (INT-PCR) and (ii) semi-random two-step PCR (ST-PCR). Genes that were found to be disrupted in hypertolerant strains were associated with lysine deacetylation, proteasomes, transporters, polyamine biosynthesis, electron transfer, and other cellular processes. Further analysis revealed a ΔpsmA1 (α1) markerless deletion strain that produces only the α2 and β proteins of 20S proteasomes was hypertolerant to hypochlorite stress compared with wild type, which produces α1, α2, and β proteins. The results of this study provide new insights into archaeal tolerance of redox active compounds such as hypochlorite.
|
45
|
Rapid, Parallel Identification of Catabolism Pathways of Lignin-Derived Aromatic Compounds in Novosphingobium aromaticivorans. Appl Environ Microbiol 2018; 84:AEM.01185-18. [PMID: 30217841 DOI: 10.1128/aem.01185-18] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 05/21/2018] [Accepted: 09/05/2018] [Indexed: 11/20/2022] Open
Abstract
Transposon mutagenesis is a powerful technique in microbial genetics for the identification of genes in uncharacterized pathways. Recently, the throughput of transposon mutagenesis techniques has been dramatically increased through the combination of DNA barcoding and high-throughput sequencing. Here, we show that when applied to catabolic pathways, barcoded transposon libraries can be used to distinguish redundant pathways, decompose complex pathways into substituent modules, discriminate between enzyme homologs, and rapidly identify previously hypothetical enzymes in an unbiased genome-scale search. We used this technique to identify two genes, desC and desD, which are involved in the degradation of the lignin-derived aromatic compound sinapic acid in the nonmodel bacterium Novosphingobium aromaticivorans. We show that DesC is a methyl esterase acting on an intermediate formed during sinapic acid catabolism, providing the last enzyme in a proposed catabolic pathway. This approach will be particularly useful in the identification of complete pathways suitable for heterologous expression in metabolic engineering. IMPORTANCE: The identification of the genes involved in specific biochemical transformations is a key step in predicting microbial function from nucleic acid sequences and in engineering microbes to endow them with new functions. We have shown that new techniques for transposon mutagenesis can dramatically simplify this process and enable the rapid identification of genes in uncharacterized pathways. These techniques provide the necessary scale to fully elucidate complex biological networks such as those used to degrade mixtures of lignin-derived aromatic compounds.
|
46
|
A novel chlorination-induced ribonuclease YabJ from Staphylococcus aureus. Biosci Rep 2018; 38:BSR20180768. [PMID: 30201692 PMCID: PMC6435465 DOI: 10.1042/bsr20180768] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/06/2018] [Revised: 08/15/2018] [Accepted: 08/23/2018] [Indexed: 01/09/2023] Open
Abstract
The characteristic fold of a protein is the decisive factor for its biological function. However, small structural changes to amino acids can also affect their function, for example in the case of post-translational modification (PTM). Many different types of PTMs are known, but for some, including chlorination, studies elucidating their importance are limited. A recent study revealed that the YjgF/YER057c/UK114 family (YjgF family) member RidA from Escherichia coli shows chaperone activity after chlorination. Thus, to identify the functional and structural differences of RidA upon chlorination, we studied an RidA homolog from Staphylococcus aureus: YabJ. The overall structure of S. aureus YabJ was similar to other members of the YjgF family, showing deep pockets on its surface, and the residues composing the pockets were well conserved. S. aureus YabJ was highly stable after chlorination, and the chlorinated state is reversible by treatment with DTT. However, it shows no chaperone activity after chlorination. Instead, YabJ from S. aureus shows chlorination-induced ribonuclease activity, and the activity is diminished after subsequent reduction. Even though the yabJ genes from Staphylococcus and Bacillus are clustered with regulators that are expected to code nucleic acid-interacting proteins, the nucleic acid-related activity of bacterial RidA has not been identified before. From our study, we revealed the structure and function of S. aureus YabJ as a novel chlorination-activated ribonuclease. The present study will contribute to an in-depth understanding of chlorination as a PTM.
|
47
|
de Crécy-Lagard V, Haas D, Hanson AD. Newly-discovered enzymes that function in metabolite damage-control. Curr Opin Chem Biol 2018; 47:101-108. [PMID: 30268903 DOI: 10.1016/j.cbpa.2018.09.014] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 04/03/2018] [Revised: 08/19/2018] [Accepted: 09/11/2018] [Indexed: 01/26/2023]
Abstract
Enzymes of unknown function are estimated to make up around 25% of the sequenced proteome. In the past decade, over 20 conserved families have been shown to function in the metabolism of 'damaged' or abnormal metabolites that are wasteful and often toxic. These newly discovered damage-control enzymes either repair or inactivate the offending metabolites, or pre-empt their formation in the first place. Comparative genomics has been of prime importance in predicting the functions of damage-control enzymes and in guiding the biochemical and genetic tests required to validate these functions.
Affiliation(s)
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL, USA; Genetics Institute, University of Florida, Gainesville, FL, USA
- Drago Haas
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL, USA
- Andrew D Hanson
- Horticultural Sciences Department, University of Florida, Gainesville, FL, USA
|
48
|
Zallot R, Oberg NO, Gerlt JA. 'Democratized' genomic enzymology web tools for functional assignment. Curr Opin Chem Biol 2018; 47:77-85. [PMID: 30268904 DOI: 10.1016/j.cbpa.2018.09.009] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Received: 05/22/2018] [Revised: 09/10/2018] [Accepted: 09/11/2018] [Indexed: 12/24/2022]
Abstract
The protein databases contain an exponentially growing number of sequences as a result of the recent increase in ease and decrease in cost of genome sequencing. The rate of data accumulation far exceeds the rate of functional studies, producing an increase in genomic 'dark matter', sequences for which no precise and validated function is defined. Publicly accessible, that is 'democratized,' genomic enzymology web tools are essential to leverage the protein and genome databases for discovery of the in vitro activities and in vivo functions of novel enzymes and proteins belonging to the dark matter. In this review, we discuss the use of web tools that have proven successful for functional assignment. We also describe a mechanism for ensuring the capture of published functional data so that the quality of both curated and automated annotations transfer can be improved.
Affiliation(s)
- Rémi Zallot
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States
| | - Nils O Oberg
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States
| | - John A Gerlt
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States; Department of Biochemistry, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States; Department of Chemistry, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, United States.
| |
|
49
|
Gibson CL, Codreanu SG, Schrimpe-Rutledge AC, Retzlaff CL, Wright J, Mortlock DP, Sherrod SD, McLean JA, Blakely RD. Global untargeted serum metabolomic analyses nominate metabolic pathways responsive to loss of expression of the orphan metallo β-lactamase, MBLAC1. Mol Omics 2018; 14:142-155. [PMID: 29868674 PMCID: PMC6015503 DOI: 10.1039/c7mo00022g] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/10/2023]
Abstract
The C. elegans gene swip-10 encodes an orphan metallo β-lactamase that genetic studies indicate is vital for limiting neuronal excitability and viability. Sequence analysis indicates that the mammalian gene Mblac1 is the likely ortholog of swip-10, with greatest sequence identity localized to the encoded protein's single metallo β-lactamase domain. The substrate for the SWIP-10 protein remains unknown and to date no functional roles have been ascribed to MBLAC1, though we have shown that the protein binds the neuroprotective β-lactam antibiotic, ceftriaxone. To gain insight into the functional role of MBLAC1 in vivo, we used CRISPR/Cas9 methods to disrupt N-terminal coding sequences of the mouse Mblac1 gene, resulting in a complete loss of protein expression in viable, homozygous knockout (KO) animals. Using serum from both WT and KO mice, we performed global, untargeted metabolomic analyses, resolving small molecules via hydrophilic interaction chromatography (HILIC) based ultra-performance liquid chromatography, coupled to mass spectrometry (UPLC-MS/MS). Unsupervised principal component analysis reliably segregated the metabolomes of MBLAC1 KO and WT mice, with 92 features subsequently nominated as significantly different by ANOVA, and for which we made tentative and putative metabolite assignments. Bioinformatic analyses of these molecules nominate validated pathways subserving bile acid biosynthesis and linoleate metabolism, networks known to be responsive to metabolic and oxidative stress. Our findings lead to hypotheses that can guide future targeted studies seeking to identify the substrate for MBLAC1 and how substrate hydrolysis supports the neuroprotective actions of ceftriaxone.
Affiliation(s)
- Chelsea L. Gibson
- Department of Biomedical Science, Charles E. Schmidt College of Medicine, Jupiter FL, USA
- Department of Pharmacology, Vanderbilt University, Nashville, TN USA
| | - Simona G. Codreanu
- Department of Chemistry, Vanderbilt University, Nashville, TN USA
- Center for Innovative Technology, Vanderbilt University, Nashville, TN USA
| | - Alexandra C. Schrimpe-Rutledge
- Department of Chemistry, Vanderbilt University, Nashville, TN USA
- Center for Innovative Technology, Vanderbilt University, Nashville, TN USA
| | - Cassandra L. Retzlaff
- Department of Biomedical Science, Charles E. Schmidt College of Medicine, Jupiter FL, USA
| | - Jane Wright
- Department of Pharmacology, Vanderbilt University, Nashville, TN USA
| | - Doug P. Mortlock
- Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN USA
| | - Stacy D. Sherrod
- Department of Chemistry, Vanderbilt University, Nashville, TN USA
- Center for Innovative Technology, Vanderbilt University, Nashville, TN USA
| | - John A. McLean
- Department of Chemistry, Vanderbilt University, Nashville, TN USA
- Center for Innovative Technology, Vanderbilt University, Nashville, TN USA
| | - Randy D. Blakely
- Department of Biomedical Science, Charles E. Schmidt College of Medicine, Jupiter FL, USA
- Brain Institute, Florida Atlantic University, Jupiter FL, USA
| |
|
50
|
Komárek J, Ivanov Kavková E, Houser J, Horáčková A, Ždánská J, Demo G, Wimmerová M. Structure and properties of AB21, a novel Agaricus bisporus protein with structural relation to bacterial pore-forming toxins. Proteins 2018; 86:897-911. [DOI: 10.1002/prot.25522] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/15/2018] [Revised: 04/23/2018] [Accepted: 04/26/2018] [Indexed: 12/13/2022]
Affiliation(s)
- Jan Komárek
- Central European Institute of Technology, Masaryk University, Kamenice 5; Brno 62500 Czech Republic
- National Centre for Biomolecular Research; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
| | - Eva Ivanov Kavková
- Department of Biochemistry; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
| | - Josef Houser
- Central European Institute of Technology, Masaryk University, Kamenice 5; Brno 62500 Czech Republic
- National Centre for Biomolecular Research; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
| | - Aneta Horáčková
- Department of Biochemistry; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
| | - Jitka Ždánská
- Central European Institute of Technology, Masaryk University, Kamenice 5; Brno 62500 Czech Republic
| | - Gabriel Demo
- Central European Institute of Technology, Masaryk University, Kamenice 5; Brno 62500 Czech Republic
- National Centre for Biomolecular Research; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
| | - Michaela Wimmerová
- Central European Institute of Technology, Masaryk University, Kamenice 5; Brno 62500 Czech Republic
- National Centre for Biomolecular Research; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
- Department of Biochemistry; Faculty of Science, Masaryk University, Kotlarska 2; Brno 61137 Czech Republic
| |
|