1
Jamal QMS, Ahmad V. Bacterial metabolomics: current applications for human welfare and future aspects. Journal of Asian Natural Products Research 2024:1-24. [PMID: 39078342 DOI: 10.1080/10286020.2024.2385365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 07/22/2024] [Accepted: 07/24/2024] [Indexed: 07/31/2024]
Abstract
An imbalanced microbiome is linked to several diseases, such as cancer, inflammatory bowel disease, obesity, and even neurological disorders. Bacteria and their by-products are used for various industrial and clinical purposes. The metabolites under discussion were chosen based on their biological impacts on host and gut microbiota interactions as established by metabolome research. The separation of bacterial metabolites by using statistics and machine learning analysis creates new opportunities for applications of bacteria and their metabolites in the environmental and medical sciences. Thus, the metabolite production strategies, methodologies, and importance of bacterial metabolites for human well-being are discussed in this review.
Affiliation(s)
- Qazi Mohammad Sajid Jamal
  - Department of Health Informatics, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia
- Varish Ahmad
  - Health Information Technology Department, The Applied College, King Abdulaziz University, Jeddah 21589, Saudi Arabia
2
Konno N, Maeno S, Tanizawa Y, Arita M, Endo A, Iwasaki W. Evolutionary paths toward multi-level convergence of lactic acid bacteria in fructose-rich environments. Commun Biol 2024; 7:902. [PMID: 39048718 PMCID: PMC11269746 DOI: 10.1038/s42003-024-06580-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 07/11/2024] [Indexed: 07/27/2024] Open
Abstract
Convergence provides clues to unveil the non-random nature of evolution. Intermediate paths toward convergence inform us of the stochasticity and the constraint of evolutionary processes. Although previous studies have suggested that substantial constraints exist in microevolutionary paths, it remains unclear whether macroevolutionary convergence follows stochastic or constrained paths. Here, we performed comparative genomics for hundreds of lactic acid bacteria (LAB) species, including clades showing a convergent gene repertoire and sharing fructose-rich habitats. By adopting phylogenetic comparative methods we showed that the genomic convergence of distinct fructophilic LAB (FLAB) lineages was caused by parallel losses of more than a hundred orthologs and the gene losses followed significantly similar orders. Our results further suggested that the loss of adhE, a key gene for phenotypic convergence to FLAB, follows a specific evolutionary path of domain architecture decay and amino acid substitutions in multiple LAB lineages sharing fructose-rich habitats. These findings unveiled the constrained evolutionary paths toward the convergence of free-living bacterial clades at the genomic and molecular levels.
Affiliation(s)
- Naoki Konno
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Shintaro Maeno
  - Research Center for Advance Science and Innovation Organization for Research Initiatives, Yamaguchi University, Yamaguchi, Yamaguchi, Japan
- Yasuhiro Tanizawa
  - Department of Informatics, National Institute of Genetics, Mishima, Shizuoka, Japan
- Masanori Arita
  - Department of Informatics, National Institute of Genetics, Mishima, Shizuoka, Japan
- Akihito Endo
  - Department of Nutritional Science and Food Safety, Faculty of Applied Bioscience, Tokyo University of Agriculture, Tokyo, Japan
- Wataru Iwasaki
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
  - Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwa, Chiba, Japan
  - Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
3
Wolfe JM. Pangenomes at the limits of evolution. Trends Ecol Evol 2024; 39:419-420. [PMID: 38580497 DOI: 10.1016/j.tree.2024.03.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 03/25/2024] [Indexed: 04/07/2024]
Abstract
Evolutionary pathways can be random or deterministic. In a recent article, Beavan et al. investigate this balance by applying machine learning models to microbial pangenomes. The presence of almost one-third of genes can be reliably inferred, indicating a surprising amount of predictable evolution.
Affiliation(s)
- Joanna M Wolfe
  - Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
  - Department of Organismic & Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
4
Garzon MH, Colorado FA. Towards an Analytical Biology. Curr Genomics 2024; 25:65-68. [PMID: 38751597 PMCID: PMC11092911 DOI: 10.2174/0113892029283759231227075715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 11/22/2023] [Accepted: 12/14/2023] [Indexed: 05/18/2024] Open
Abstract
This article draws a perspective on the increasingly unavoidable question of whether steps can be taken in genomics and biology at large to move them more rapidly towards more analytical and deductive biology, akin to similar developments that occurred in other natural sciences, such as physics and chemistry, centuries ago. It provides a summary of recent advances in other relevant sciences in the last 3 decades that are likely to pull it in that direction in the next decade or so, as well as what methods and tools will make it possible.
Affiliation(s)
- Max H. Garzon
  - Department of Computer Science, University of Memphis, 373 Dunn, USA
- Fredy A. Colorado
  - Department of Biology, National University of Colombia, Bogotá, Colombia
5
Hwang Y, Cornman AL, Kellogg EH, Ovchinnikov S, Girguis PR. Genomic language model predicts protein co-regulation and function. Nat Commun 2024; 15:2880. [PMID: 38570504 PMCID: PMC10991518 DOI: 10.1038/s41467-024-46947-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 03/13/2024] [Indexed: 04/05/2024] Open
Abstract
Deciphering the relationship between a gene and its genomic context is fundamental to understanding and engineering biological systems. Machine learning has shown promise in learning latent relationships underlying the sequence-structure-function paradigm from massive protein sequence datasets. However, to date, limited attempts have been made in extending this continuum to include higher order genomic context information. Evolutionary processes dictate the specificity of genomic contexts in which a gene is found across phylogenetic distances, and these emergent genomic patterns can be leveraged to uncover functional relationships between gene products. Here, we train a genomic language model (gLM) on millions of metagenomic scaffolds to learn the latent functional and regulatory relationships between genes. gLM learns contextualized protein embeddings that capture the genomic context as well as the protein sequence itself, and encode biologically meaningful and functionally relevant information (e.g. enzymatic function, taxonomy). Our analysis of the attention patterns demonstrates that gLM is learning co-regulated functional modules (i.e. operons). Our findings illustrate that gLM's unsupervised deep learning of the metagenomic corpus is an effective and promising approach to encode functional semantics and regulatory syntax of genes in their genomic contexts and uncover complex relationships between genes in a genomic region.
Affiliation(s)
- Yunha Hwang
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Elizabeth H Kellogg
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Peter R Girguis
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
6
Beavan A, Domingo-Sananes MR, McInerney JO. Contingency, repeatability, and predictability in the evolution of a prokaryotic pangenome. Proc Natl Acad Sci U S A 2024; 121:e2304934120. [PMID: 38147560 PMCID: PMC10769857 DOI: 10.1073/pnas.2304934120] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 03/27/2023] [Accepted: 11/05/2023] [Indexed: 12/28/2023] Open
Abstract
Pangenomes exhibit remarkable variability in many prokaryotic species, much of which is maintained through the processes of horizontal gene transfer and gene loss. Repeated acquisitions of near-identical homologs can easily be observed across pangenomes, leading to the question of whether these parallel events potentiate similar evolutionary trajectories, or whether the remarkably different genetic backgrounds of the recipients mean that postacquisition evolutionary trajectories end up being quite different. In this study, we present a machine learning method that predicts the presence or absence of genes in the Escherichia coli pangenome based on complex patterns of the presence or absence of other accessory genes within a genome. Our analysis leverages the repeated transfer of genes through the E. coli pangenome to observe patterns of repeated evolution following similar events. We find that the presence or absence of a substantial set of genes is highly predictable from other genes alone, indicating that selection potentiates and maintains gene-gene co-occurrence and avoidance relationships deterministically over long-term bacterial evolution and is robust to differences in host evolutionary history. We propose that at least part of the pangenome can be understood as a set of genes with relationships that govern their likely cohabitants, analogous to an ecosystem's set of interacting organisms. Our findings indicate that intragenomic gene fitness effects may be key drivers of prokaryotic evolution, influencing the repeated emergence of complex gene-gene relationships across the pangenome.
Affiliation(s)
- Alan Beavan
  - School of Life Sciences, The University of Nottingham, Nottingham NG7 2UH, United Kingdom
- Maria Rosa Domingo-Sananes
  - School of Life Sciences, The University of Nottingham, Nottingham NG7 2UH, United Kingdom
  - School of Science and Technology, Nottingham Trent University, Nottingham NG1 4FQ, United Kingdom
- James O. McInerney
  - School of Life Sciences, The University of Nottingham, Nottingham NG7 2UH, United Kingdom
7
Nardulli P, Ballini A, Zamparella M, De Vito D. The Role of Stakeholders' Understandings in Emerging Antimicrobial Resistance: A One Health Approach. Microorganisms 2023; 11:2797. [PMID: 38004808 PMCID: PMC10673085 DOI: 10.3390/microorganisms11112797] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/23/2023] [Revised: 11/10/2023] [Accepted: 11/14/2023] [Indexed: 11/26/2023] Open
Abstract
The increasing misuse of antibiotics in human and veterinary medicine and in agroecosystems and the consequent selective pressure of resistant strains lead to multidrug resistance (AMR), an expanding global phenomenon. Indeed, this phenomenon represents a major public health target with significant clinical implications related to increased morbidity and mortality and prolonged hospital stays. The current presence of microorganisms multi-resistant to antibiotics isolated in patients is a problem because of the additional burden of disease it places on the most fragile patients and the difficulty of finding effective therapies. In recent decades, international organizations like the World Health Organization (WHO) and the European Centre for Disease Prevention and Control (ECDC) have played significant roles in addressing the issue of AMR. The ECDC estimates that in the European Union alone, antibiotic resistance causes 33,000 deaths and approximately 880,000 cases of disability each year. The epidemiological impact of AMR inevitably also has direct economic consequences related not only to the loss of life but also to a reduction in the number of days worked, increased use of healthcare resources for diagnostic procedures and the use of second-line antibiotics when available. In 2015, the WHO, recognising AMR as a complex problem that can only be addressed by coordinated multi-sectoral interventions, promoted the One Health approach that considers human, animal, and environmental health in an integrated manner. In this review, the authors try to address why a collaboration of all stakeholders involved in AMR growth and management is necessary in order to achieve optimal health for people, animals, plants, and the environment, highlighting that AMR is a growing threat to human and animal health, food safety and security, economic prosperity, and ecosystems worldwide.
Affiliation(s)
- Patrizia Nardulli
  - S.C. Farmacia e UMACA IRCCS Istituto Tumori “Giovanni Paolo II”, Viale O. Flacco 65, 70124 Bari, Italy
- Andrea Ballini
  - Department of Clinical and Experimental Medicine, University of Foggia, 71122 Foggia, Italy
- Danila De Vito
  - Department of Translational Biomedicine and Neuroscience, Medical School, University Aldo Moro of Bari, 70124 Bari, Italy
8
Babele PK, Srivastava A, Young JD. Metabolic flux phenotyping of secondary metabolism in cyanobacteria. Trends Microbiol 2023; 31:1118-1130. [PMID: 37331829 DOI: 10.1016/j.tim.2023.05.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 05/10/2023] [Accepted: 05/15/2023] [Indexed: 06/20/2023]
Abstract
Cyanobacteria generate energy from photosynthesis and produce various secondary metabolites with diverse commercial and pharmaceutical applications. Unique metabolic and regulatory pathways in cyanobacteria present new challenges for researchers to enhance their product yields, titers, and rates. Therefore, further advancements are critically needed to establish cyanobacteria as a preferred bioproduction platform. Metabolic flux analysis (MFA) quantitatively determines the intracellular flows of carbon within complex biochemical networks, which elucidate the control of metabolic pathways by transcriptional, translational, and allosteric regulatory mechanisms. The emerging field of systems metabolic engineering (SME) involves the use of MFA and other omics technologies to guide the rational development of microbial production strains. This review highlights the potential of MFA and SME to optimize the production of cyanobacterial secondary metabolites and discusses the technical challenges that lie ahead.
Affiliation(s)
- Piyoosh K Babele
  - College of Agriculture, Rani Lakshmi Bai Central Agricultural University Jhansi, 284003, Uttar Pradesh, India
- Amit Srivastava
  - University of Jyväskylä, Nanoscience Centre, Department of Biological and Environmental Science, 40014 Jyväskylä, Finland
- Jamey D Young
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, PMB 351604, Nashville, TN 37235-1604, USA
  - Department of Molecular Physiology and Biophysics, Vanderbilt University, PMB 351604, Nashville, TN 37235-1604, USA
9
Castelli P, De Ruvo A, Bucciacchio A, D'Alterio N, Cammà C, Di Pasquale A, Radomski N. Harmonization of supervised machine learning practices for efficient source attribution of Listeria monocytogenes based on genomic data. BMC Genomics 2023; 24:560. [PMID: 37736708 PMCID: PMC10515079 DOI: 10.1186/s12864-023-09667-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 09/10/2023] [Indexed: 09/23/2023] Open
Abstract
BACKGROUND: Genomic data-based machine learning tools are promising for real-time surveillance activities performing source attribution of foodborne bacteria such as Listeria monocytogenes. Given the heterogeneity of machine learning practices, our aim was to identify those influencing the source prediction performance of the usual holdout method combined with the repeated k-fold cross-validation method. METHODS: A large collection of 1 100 L. monocytogenes genomes with known sources was built according to several genomic metrics to ensure authenticity and completeness of genomic profiles. Based on these genomic profiles (i.e. 7-locus alleles, core alleles, accessory genes, core SNPs and pan kmers), we developed a versatile workflow assessing prediction performance of different combinations of training dataset splitting (i.e. 50, 60, 70, 80 and 90%), data preprocessing (i.e. with or without near-zero variance removal), and learning models (i.e. BLR, ERT, RF, SGB, SVM and XGB). The performance metrics included accuracy, Cohen's kappa, F1-score, area under the curves from receiver operating characteristic curve, precision recall curve or precision recall gain curve, and execution time. RESULTS: The testing average accuracies from accessory genes and pan kmers were significantly higher than accuracies from core alleles or SNPs. While the accuracies from 70 and 80% of training dataset splitting were not significantly different, those from 80% were significantly higher than the other tested proportions. The near-zero variance removal did not allow to produce results for 7-locus alleles, did not impact significantly the accuracy for core alleles, accessory genes and pan kmers, and decreased significantly accuracy for core SNPs. The SVM and XGB models did not present significant differences in accuracy between each other and reached significantly higher accuracies than BLR, SGB, ERT and RF, in this order of magnitude. However, the SVM model required more computing power than the XGB model, especially for high amount of descriptors such like core SNPs and pan kmers. CONCLUSIONS: In addition to recommendations about machine learning practices for L. monocytogenes source attribution based on genomic data, the present study also provides a freely available workflow to solve other balanced or unbalanced multiclass phenotypes from binary and categorical genomic profiles of other microorganisms without source code modifications.
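As an illustration only, the holdout method combined with repeated k-fold cross-validation named in this abstract can be sketched in a few lines of Python. This is not the authors' workflow: the function names `holdout_split` and `repeated_kfold`, the fold count `k=10`, the three repeats, and the seeds are all assumptions chosen for the sketch. Only the 1 100-genome collection size and the 80% training proportion come from the abstract.

```python
import random


def holdout_split(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def repeated_kfold(train_indices, k, repeats, seed=0):
    """Yield (fit, validation) index lists for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(train_indices)
        rng.shuffle(shuffled)  # a fresh partition for each repeat
        folds = [shuffled[i::k] for i in range(k)]
        for i, validation in enumerate(folds):
            fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield fit, validation


# 1 100 genomes and an 80% training split follow the abstract;
# k=10 and repeats=3 are hypothetical values for illustration.
train, test = holdout_split(1100, 0.8)
splits = list(repeated_kfold(train, k=10, repeats=3))
print(len(train), len(test), len(splits))
```

In such a design the held-out testing set is never seen during cross-validation, so model tuning on the folds cannot leak into the final accuracy estimate.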
Affiliation(s)
- Pierluigi Castelli
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Andrea De Ruvo
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Andrea Bucciacchio
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Nicola D'Alterio
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Cesare Cammà
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Adriano Di Pasquale
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Nicolas Radomski
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
10
Thai TD, Lim W, Na D. Synthetic bacteria for the detection and bioremediation of heavy metals. Front Bioeng Biotechnol 2023; 11:1178680. [PMID: 37122866 PMCID: PMC10133563 DOI: 10.3389/fbioe.2023.1178680] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/03/2023] [Accepted: 04/04/2023] [Indexed: 05/02/2023] Open
Abstract
Toxic heavy metal accumulation is one of anthropogenic environmental pollutions, which poses risks to human health and ecological systems. Conventional heavy metal remediation approaches rely on expensive chemical and physical processes leading to the formation and release of other toxic waste products. Instead, microbial bioremediation has gained interest as a promising and cost-effective alternative to conventional methods, but the genetic complexity of microorganisms and the lack of appropriate genetic engineering technologies have impeded the development of bioremediating microorganisms. Recently, the emerging synthetic biology opened a new avenue for microbial bioremediation research and development by addressing the challenges and providing novel tools for constructing bacteria with enhanced capabilities: rapid detection and degradation of heavy metals while enhanced tolerance to toxic heavy metals. Moreover, synthetic biology also offers new technologies to meet biosafety regulations since genetically modified microorganisms may disrupt natural ecosystems. In this review, we introduce the use of microorganisms developed based on synthetic biology technologies for the detection and detoxification of heavy metals. Additionally, this review explores the technical strategies developed to overcome the biosafety requirements associated with the use of genetically modified microorganisms.
Affiliation(s)
- Dokyun Na
  - Department of Biomedical Engineering, Chung-Ang University, Seoul, Republic of Korea