1
Aplakidou E, Vergoulidis N, Chasapi M, Venetsianou NK, Kokoli M, Panagiotopoulou E, Iliopoulos I, Karatzas E, Pafilis E, Georgakopoulos-Soares I, Kyrpides NC, Pavlopoulos GA, Baltoumas FA. Visualizing metagenomic and metatranscriptomic data: A comprehensive review. Comput Struct Biotechnol J 2024; 23:2011-2033. [PMID: 38765606 PMCID: PMC11101950 DOI: 10.1016/j.csbj.2024.04.060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 04/25/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024] Open
Abstract
The fields of metagenomics and metatranscriptomics involve the examination of complete nucleotide sequences, gene identification, and analysis of potential biological functions within diverse organisms or environmental samples. Despite the vast opportunities for discovery in metagenomics, the sheer volume and complexity of sequence data often present challenges in processing, analysis, and visualization. This article highlights the critical role of advanced visualization tools in enabling effective exploration, querying, and analysis of these complex datasets. Emphasizing the importance of accessibility, the article categorizes various visualizers based on their intended applications and highlights their utility in empowering bioinformaticians and non-bioinformaticians to interpret and derive insights from meta-omics data effectively.
Affiliation(s)
- Eleni Aplakidou
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
  - Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
- Nikolaos Vergoulidis
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Maria Chasapi
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
  - Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
- Nefeli K. Venetsianou
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Maria Kokoli
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Eleni Panagiotopoulou
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
  - Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
- Ioannis Iliopoulos
  - Department of Basic Sciences, School of Medicine, University of Crete, 71003 Heraklion, Greece
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikos C. Kyrpides
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Georgios A. Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Center of New Biotechnologies & Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Greece
  - Hellenic Army Academy, 16673 Vari, Greece
- Fotis A. Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
2
Li M, Cruz CD, Ilina P, Tammela P. High-throughput combination assay for studying biofilm formation of uropathogenic Escherichia coli. Arch Microbiol 2024; 206:344. [PMID: 38967798 PMCID: PMC11226472 DOI: 10.1007/s00203-024-04029-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 05/23/2024] [Accepted: 06/03/2024] [Indexed: 07/06/2024]
Abstract
Uropathogenic Escherichia coli, the most common cause of urinary tract infections, forms biofilms that enhance its antibiotic resistance. To assess the effects of compounds on biofilm formation by the uropathogenic Escherichia coli strain UMN026, a high-throughput combination assay using resazurin followed by crystal violet staining was optimized for 384-well microplates. Optimized assay parameters included, for example, resazurin and crystal violet concentrations and incubation time for readouts. For assay validation, the quality parameters Z' factor, coefficient of variation, signal-to-noise, and signal-to-background were calculated. Microplate uniformity, signal variability, edge well effects, and fold shift were also assessed. Finally, a screening with known antibacterial compounds was conducted to evaluate the assay performance. The best conditions were 12 µg/mL resazurin for 150 min and 0.023% crystal violet. This assay was able to detect compounds displaying antibiofilm activity against the UMN026 strain at sub-inhibitory concentrations, in terms of metabolic activity and/or biomass.
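The quality parameters named in this abstract can be illustrated with a minimal Python sketch. The abstract does not state which definitions the authors used, so the formulas below are the commonly cited ones and should be read as assumptions: Z' = 1 - 3(SD_max + SD_min) / |mean_max - mean_min|, CV = 100 × SD / mean, signal-to-background = mean_max / mean_min, and signal-to-noise = (mean_max - mean_min) / SD_min. The function name `assay_quality` and the well readings are hypothetical.

```python
from statistics import mean, stdev


def assay_quality(max_signal, min_signal):
    """Plate-quality metrics from maximum-signal and minimum-signal control wells.

    The formulas are commonly used definitions, assumed here for illustration;
    they are not taken from the cited paper.
    """
    mu_max, mu_min = mean(max_signal), mean(min_signal)
    sd_max, sd_min = stdev(max_signal), stdev(min_signal)  # sample standard deviations
    return {
        # Z' factor: separation of the two control populations
        "z_prime": 1 - 3 * (sd_max + sd_min) / abs(mu_max - mu_min),
        # Coefficient of variation of the maximum-signal wells, in percent
        "cv_max_percent": 100 * sd_max / mu_max,
        # Ratio of mean maximum signal to mean minimum signal
        "signal_to_background": mu_max / mu_min,
        # Signal window relative to the spread of the minimum-signal wells
        "signal_to_noise": (mu_max - mu_min) / sd_min,
    }


if __name__ == "__main__":
    # Hypothetical fluorescence readings, not data from the study
    metrics = assay_quality([100, 102, 98, 100], [10, 11, 9, 10])
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

A Z' value above roughly 0.5 is conventionally taken to indicate a screening-quality assay, which is why it is usually reported alongside the simpler ratios.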
Affiliation(s)
- M Li
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, P.O. Box 56, Helsinki, FI-00014, Finland
- C D Cruz
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, P.O. Box 56, Helsinki, FI-00014, Finland
- P Ilina
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, P.O. Box 56, Helsinki, FI-00014, Finland
- P Tammela
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, P.O. Box 56, Helsinki, FI-00014, Finland
3
Chuckran PF, Estera-Molina K, Huntemann M, Foster B, Roux S, Mukherjee S, Hajek P, Reddy TBK, Daum C, Chen IMA, Pennacchio C, Eloe-Fadrosh EA, Dijkstra P, Firestone MK, Blazewicz SJ, Pett-Ridge J. Metatranscriptomes of California grassland soil microbial communities in response to rewetting. Microbiol Resour Announc 2024; 13:e0032224. [PMID: 38771040 DOI: 10.1128/mra.00322-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 04/17/2024] [Indexed: 05/22/2024] Open
Abstract
When very dry soil is rewet, rapid stimulation of microbial activity has important implications for ecosystem biogeochemistry, yet associated changes in microbial transcription are poorly known. Here, we present metatranscriptomes of California annual grassland soil microbial communities, collected over 1 week from soils rewet after a summer drought, providing a time series of the short-term transcriptional response during rewetting.
Affiliation(s)
- Peter F Chuckran
  - Department of Environmental Science, Policy, and Management, University of California, Berkeley, California, USA
- Katerina Estera-Molina
  - Department of Environmental Science, Policy, and Management, University of California, Berkeley, California, USA
  - Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Marcel Huntemann
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- Brian Foster
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- Simon Roux
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- Patrick Hajek
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- T B K Reddy
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- Chris Daum
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- I-Min A Chen
  - Department of Energy, Joint Genome Institute, Berkeley, California, USA
- Paul Dijkstra
  - Center for Ecosystem Science and Society (ECOSS) and Department of Biological Sciences, Northern Arizona University, Flagstaff, Arizona, USA
- Mary K Firestone
  - Department of Environmental Science, Policy, and Management, University of California, Berkeley, California, USA
- Steven J Blazewicz
  - Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Jennifer Pett-Ridge
  - Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
  - Life & Environmental Sciences Department, University of California Merced, Merced, California, USA
  - Innovative Genomics Institute, University of California Berkeley, Berkeley, California, USA
4
Coelho LP, Santos-Júnior CD, de la Fuente-Nunez C. Challenges in computational discovery of bioactive peptides in 'omics data. Proteomics 2024; 24:e2300105. [PMID: 38458994 DOI: 10.1002/pmic.202300105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 02/06/2024] [Accepted: 02/06/2024] [Indexed: 03/10/2024]
Abstract
Peptides have a plethora of activities in biological systems that can potentially be exploited biotechnologically. Several peptides are used clinically, as well as in industry and agriculture. The increase in available 'omics data has recently provided a large opportunity for mining novel enzymes, biosynthetic gene clusters, and molecules. While these data primarily consist of DNA sequences, other types of data provide important complementary information. Because peptides are short, the approaches that have proven successful at discovering novel proteins of canonical size cannot be naïvely applied to the discovery of peptides. Peptides can be encoded directly in the genome as short open reading frames (smORFs), or they can be derived from larger proteins by proteolysis. Both of these peptide classes pose challenges, as simple methods for their prediction result in large numbers of false positives. Similarly, functional annotation of larger proteins, traditionally based on sequence similarity to infer orthology and then transferring functions between characterized proteins and uncharacterized ones, cannot be applied directly to short sequences. The use of these techniques is much more limited, and alternative approaches based on machine learning are used instead. Here, we review the limitations of traditional methods as well as the alternative methods that have recently been developed for discovering novel bioactive peptides, with a focus on prokaryotic genomes and metagenomes.
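To make the false-positive problem concrete, the following Python sketch shows what a "simple method" for smORF prediction might look like: a scan of the forward strand for ATG-to-stop open reading frames below a length cutoff. This is an illustrative assumption, not a method described in the review; the function name `find_smorfs`, the length bounds, and the restriction to ATG starts on one strand are all hypothetical simplifications.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(dna, min_codons=10, max_codons=100):
    """Naively list short ORFs (ATG ... stop) on the forward strand.

    Returns (start, end, n_codons) tuples with 0-based start, exclusive end
    (stop codon included in the span), and n_codons counted without the stop.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: one 12-codon ORF embedded in flanking bases
    toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
    print(find_smorfs(toy))
```

Because short ATG-to-stop stretches arise by chance in any sufficiently long sequence, a scan of this kind reports many candidates that are never translated, which is the limitation the abstract points to when it motivates machine-learning alternatives.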
Affiliation(s)
- Luis Pedro Coelho
  - Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Woolloongabba, Queensland, Australia
  - Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
- Célio Dias Santos-Júnior
  - Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
  - Laboratory of Microbial Processes & Biodiversity - LMPB, Hydrobiology Department, Federal University of São Carlos - UFSCar, São Paulo, Brazil
- Cesar de la Fuente-Nunez
  - Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
5
Jensen RO, Schulz F, Roux S, Klingeman DM, Mitchell WP, Udwary D, Moraïs S, Reynoso V, Winkler J, Nagaraju S, De Tissera S, Shapiro N, Ivanova N, Reddy TBK, Mizrahi I, Utturkar SM, Bayer EA, Woyke T, Mouncey NJ, Jewett MC, Simpson SD, Köpke M, Jones DT, Brown SD. Phylogenomics and genetic analysis of solvent-producing Clostridium species. Sci Data 2024; 11:432. [PMID: 38693191 PMCID: PMC11063209 DOI: 10.1038/s41597-024-03210-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 04/02/2024] [Indexed: 05/03/2024] Open
Abstract
The genus Clostridium is a large and diverse group within the Bacillota (formerly Firmicutes), whose members can encode useful complex traits such as solvent production, gas-fermentation, and lignocellulose breakdown. We describe 270 genome sequences of solventogenic clostridia from a comprehensive industrial strain collection assembled by Professor David Jones that includes 194 C. beijerinckii, 57 C. saccharobutylicum, 4 C. saccharoperbutylacetonicum, 5 C. butyricum, 7 C. acetobutylicum, and 3 C. tetanomorphum genomes. We report methods, analyses and characterization for phylogeny, key attributes, core biosynthetic genes, secondary metabolites, plasmids, prophage/CRISPR diversity, cellulosomes and quorum sensing for the 6 species. The expanded genomic data described here will facilitate engineering of solvent-producing clostridia as well as non-model microorganisms with innately desirable traits. Sequences could be applied in conventional platform biocatalysts such as yeast or Escherichia coli for enhanced chemical production. Recently, gene sequences from this collection were used to engineer Clostridium autoethanogenum, a gas-fermenting autotrophic acetogen, for continuous acetone or isopropanol production, as well as butanol, butanoic acid, hexanol and hexanoic acid production.
Affiliation(s)
- Frederik Schulz
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Simon Roux
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Daniel Udwary
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Sarah Moraïs
  - Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
- Nicole Shapiro
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Natalia Ivanova
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- T B K Reddy
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Itzhak Mizrahi
  - Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
- Sagar M Utturkar
  - Institute for Cancer Research, Purdue University, West Lafayette, IN, USA
- Edward A Bayer
  - Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
  - Department of Biomolecular Sciences, The Weizmann Institute of Science, Rehovot, 7610001, Israel
- Tanja Woyke
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - University of California Merced, Life and Environmental Sciences, Merced, CA, USA
- Nigel J Mouncey
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Michael C Jewett
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- David T Jones
  - Department of Microbiology, University of Otago, Dunedin, New Zealand
6
Ren K, Zhou F, Zhang F, Yin M, Zhu Y, Wang S, Chen Y, Huang T, Wu Z, He J, Zhang A, Guo C, Huang Z. Discovery and structural mechanism of DNA endonucleases guided by RAGATH-18-derived RNAs. Cell Res 2024; 34:370-385. [PMID: 38575718 PMCID: PMC11061315 DOI: 10.1038/s41422-024-00952-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 03/09/2024] [Indexed: 04/06/2024] Open
Abstract
CRISPR-Cas systems and IS200/IS605 transposon-associated TnpBs have been utilized for the development of genome editing technologies. Using bioinformatics analysis and biochemical experiments, here we present a new family of RNA-guided DNA endonucleases. Our bioinformatics analysis initially identifies the stable co-occurrence of conserved RAGATH-18-derived RNAs (reRNAs) and their upstream IS607 TnpBs with an average length of 390 amino acids. IS607 TnpBs form programmable DNases through interaction with reRNAs. We discover the robust dsDNA interference activity of IS607 TnpB systems in bacteria and human cells. Further characterization of the Firmicutes bacteria IS607 TnpB system (ISFba1 TnpB) reveals that its dsDNA cleavage activity is remarkably sensitive to single mismatches between the guide and target sequences in human cells. Our findings demonstrate that a length of 20 nt in the guide sequence of reRNA achieves the highest DNA cleavage activity for ISFba1 TnpB. A cryo-EM structure of the ISFba1 TnpB effector protein bound by its cognate RAGATH-18 motif-containing reRNA and a dsDNA target reveals the mechanisms underlying reRNA recognition by ISFba1 TnpB, reRNA-guided dsDNA targeting, and the sensitivity of the ISFba1 TnpB system to base mismatches between the guide and target DNA. Collectively, this study identifies the IS607 TnpB family of compact and specific RNA-guided DNases with great potential for application in gene editing.
Affiliation(s)
- Kuan Ren
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Fengxia Zhou
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
  - Westlake Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Fan Zhang
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Mingyu Yin
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yuwei Zhu
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Shouyu Wang
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yan Chen
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Tengjin Huang
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Zixuan Wu
  - Westlake Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Jiale He
  - Westlake Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Anqi Zhang
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Changyou Guo
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Zhiwei Huang
  - HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
  - Westlake Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
  - New Cornerstone Science Laboratory, Shenzhen, Guangdong, China
7
Torrance EL, Burton C, Diop A, Bobay LM. Evolution of homologous recombination rates across bacteria. Proc Natl Acad Sci U S A 2024; 121:e2316302121. [PMID: 38657048 PMCID: PMC11067023 DOI: 10.1073/pnas.2316302121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 03/08/2024] [Indexed: 04/26/2024] Open
Abstract
Bacteria are nonsexual organisms but are capable of exchanging DNA to varying degrees through homologous recombination. Intriguingly, the rates of recombination vary immensely across lineages: some species have been described as purely clonal and others as "quasi-sexual." However, estimating recombination rates has proven a difficult endeavor, and estimates often vary substantially across studies. It is unclear whether these variations reflect natural variations across populations or are due to differences in methodologies. Consequently, the impact of recombination on bacterial evolution has not been extensively evaluated, and the evolution of recombination rate, as a trait, remains to be accurately described. Here, we developed an approach based on Approximate Bayesian Computation that integrates multiple signals of recombination to estimate recombination rates. We inferred the rate of recombination of 162 bacterial species and one archaeon and tested the robustness of our approach. Our results confirm that recombination rates vary drastically across bacteria; however, we found that recombination rate, as a trait, is conserved in several lineages but evolves rapidly in others. Although some traits are thought to be associated with recombination rate (e.g., GC-content), we found no clear association between genomic or phenotypic traits and recombination rate. Overall, our results provide an overview of recombination rate, its evolution, and its impact on bacterial evolution.
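For readers unfamiliar with Approximate Bayesian Computation, the Python sketch below shows the generic rejection form of the technique on a toy problem. It is not the authors' method, which integrates multiple recombination signals: here a single hypothetical summary statistic (the number of sites showing a recombination signal) is simulated under a per-site probability drawn from a uniform prior, and draws are kept when the simulated statistic falls within a tolerance of the observed value. The function name, the binomial toy model, and all numbers are assumptions made for illustration.

```python
import random
from statistics import mean


def abc_rejection(observed, n_sites, n_draws=5000, tolerance=2, seed=0):
    """Rejection ABC for a per-site probability under a toy binomial model.

    Returns the accepted prior draws, which approximate the posterior.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw a candidate from the Uniform(0, 1) prior
        # Simulate the summary statistic: count of sites with a signal
        simulated = sum(rng.random() < p for _ in range(n_sites))
        # Keep the candidate if the simulation resembles the observation
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


if __name__ == "__main__":
    # Hypothetical observation: 10 of 50 sites show a recombination signal
    posterior = abc_rejection(observed=10, n_sites=50)
    print(f"accepted {len(posterior)} draws, posterior mean {mean(posterior):.3f}")
```

The design trade-off is the tolerance: a smaller value gives a sharper approximation of the posterior but rejects more simulations, which is one reason practical ABC pipelines rely on carefully chosen summary statistics.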
Affiliation(s)
- Ellis L Torrance
  - Department of Biology, University of North Carolina, Greensboro, NC 27412
- Corey Burton
  - Department of Biology, University of North Carolina, Greensboro, NC 27412
- Awa Diop
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
- Louis-Marie Bobay
  - Department of Biology, University of North Carolina, Greensboro, NC 27412
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
8
Austin GI, Kav AB, Park H, Biermann J, Uhlemann AC, Korem T. Processing-bias correction with DEBIAS-M improves cross-study generalization of microbiome-based prediction models. bioRxiv 2024:2024.02.09.579716. [PMID: 38405914 PMCID: PMC10888995 DOI: 10.1101/2024.02.09.579716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Every step in common microbiome profiling protocols has variable efficiency for each microbe. For example, different DNA extraction kits may have different efficiency for Gram-positive and -negative bacteria. These variable efficiencies, combined with technical variation, create strong processing biases, which impede the identification of signals that are reproducible across studies and the development of generalizable and biologically interpretable prediction models. "Batch-correction" methods have been used to alleviate these issues computationally with some success. However, many make strong parametric assumptions which do not necessarily apply to microbiome data or processing biases, or require the use of an outcome variable, which risks overfitting. Lastly and importantly, existing transformations used to correct microbiome data are largely non-interpretable, and could, for example, introduce values to features that were initially mostly zeros. Altogether, processing bias currently compromises our ability to glean robust and generalizable biological insights from microbiome data. Here, we present DEBIAS-M (Domain adaptation with phenotype Estimation and Batch Integration Across Studies of the Microbiome), an interpretable framework for inference and correction of processing bias, which facilitates domain adaptation in microbiome studies. DEBIAS-M learns bias-correction factors for each microbe in each batch that simultaneously minimize batch effects and maximize cross-study associations with phenotypes. Using benchmarks of HIV and colorectal cancer classification from gut microbiome data, and cervical neoplasia prediction from cervical microbiome data, we demonstrate that DEBIAS-M outperforms batch-correction methods commonly used in the field. Notably, we show that the inferred bias-correction factors are stable, interpretable, and strongly associated with specific experimental protocols. Overall, we show that DEBIAS-M allows for better modeling of microbiome data and identification of interpretable signals that are reproducible across studies.
Affiliation(s)
- George I. Austin
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Aya Brown Kav
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Heekuk Park
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Jana Biermann
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Medicine, Division of Hematology/Oncology, Columbia University Irving Medical Center, New York, NY, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- Anne-Catrin Uhlemann
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Tal Korem
  - Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
9
Ebenhöh O, Ebeling J, Meyer R, Pohlkotte F, Nies T. Microbial Pathway Thermodynamics: Stoichiometric Models Unveil Anabolic and Catabolic Processes. Life (Basel) 2024; 14:247. [PMID: 38398756 PMCID: PMC10890395 DOI: 10.3390/life14020247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 01/29/2024] [Accepted: 02/05/2024] [Indexed: 02/25/2024] Open
Abstract
The biotechnological exploitation of microorganisms enables the use of metabolism for the production of economically valuable substances, such as drugs or food. It is, thus, unsurprising that the investigation of microbial metabolism and its regulation has been an active research field for many decades. As a result, several theories and techniques were developed that allow for the prediction of metabolic fluxes and yields as biotechnologically relevant output parameters. One important approach is to derive macrochemical equations that describe the overall metabolic conversion of an organism and basically treat microbial metabolism as a black box. The opposite approach is to include all known metabolic reactions of an organism to assemble a genome-scale metabolic model. Interestingly, both approaches are rather successful at characterizing and predicting the expected product yield. Over the years, macrochemical equations especially have been extensively characterized in terms of their thermodynamic properties. However, a common challenge when characterizing microbial metabolism by a single equation is to split this equation into two, describing the two modes of metabolism, anabolism and catabolism. Here, we present strategies to systematically identify separate equations for anabolism and catabolism. Based on metabolic models, we systematically identify all theoretically possible catabolic routes and determine their thermodynamic efficiency. We then show how anabolic routes can be derived, and we use these to approximate biomass yield. Finally, we challenge the view of metabolism as a linear energy converter, in which the free energy gradient of catabolism drives the anabolic reactions.
Affiliation(s)
- Oliver Ebenhöh
  - Institute of Quantitative and Theoretical Biology, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
  - Cluster of Excellence on Plant Sciences, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Josha Ebeling
  - Institute of Quantitative and Theoretical Biology, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Ronja Meyer
  - Institute of Quantitative and Theoretical Biology, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Fabian Pohlkotte
  - Institute of Quantitative and Theoretical Biology, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Tim Nies
  - Institute of Quantitative and Theoretical Biology, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
10
Manners SH, Carere CR, Dhami MK, Dobson RCJ, Stott MB. Draft genome sequence of Thermococcus waiotapuensis WT1T, a thermophilic sulfur-dependent archaeon from the order Thermococcales. Microbiol Resour Announc 2024; 13:e0081523. [PMID: 38095867 DOI: 10.1128/mra.00815-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Accepted: 11/21/2023] [Indexed: 12/23/2023] Open
Abstract
Thermococcus waiotapuensis WT1T is a thermophilic, peptide- and amino acid-fermenting archaeon from the order Thermococcales. It was isolated from Waiotapu, Aotearoa-New Zealand, and has a genome size of 1.80 Mbp. The genome contains 2,000 total genes, of which 1,913 encode proteins and 46 encode tRNAs.
Collapse
Affiliation(s)
- Sarah H Manners
- Te Kura Pūtaiao Koiora School of Biological Sciences, Te Whare Wānanga o Waitaha University of Canterbury, Christchurch, New Zealand
- Biomolecular Interaction Centre, Te Whare Wānanga o Waitaha, University of Canterbury, Christchurch, New Zealand
| | - Carlo R Carere
- Biomolecular Interaction Centre, Te Whare Wānanga o Waitaha, University of Canterbury, Christchurch, New Zealand
- Department of Chemical and Process Engineering, Te Tari Pūhanga Tukanga Matū, Te Whare Wānanga o Waitaha, University of Canterbury, Christchurch, New Zealand
| | - Manpreet K Dhami
- Biocontrol and Molecular Ecology, Manaaki Whenua Landcare Research, Lincoln, New Zealand
| | - Renwick C J Dobson
- Te Kura Pūtaiao Koiora School of Biological Sciences, Te Whare Wānanga o Waitaha University of Canterbury, Christchurch, New Zealand
- Biomolecular Interaction Centre, Te Whare Wānanga o Waitaha, University of Canterbury, Christchurch, New Zealand
| | - Matthew B Stott
- Te Kura Pūtaiao Koiora School of Biological Sciences, Te Whare Wānanga o Waitaha University of Canterbury, Christchurch, New Zealand
- Biomolecular Interaction Centre, Te Whare Wānanga o Waitaha, University of Canterbury, Christchurch, New Zealand
| |
|
11
|
Baltoumas FA, Karatzas E, Liu S, Ovchinnikov S, Sofianatos Y, Chen IM, Kyrpides N, Pavlopoulos G. NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes. Nucleic Acids Res 2024; 52:D502-D512. [PMID: 37811892 PMCID: PMC10767849 DOI: 10.1093/nar/gkad800] [Received: 08/14/2023] [Accepted: 09/19/2023] [Indexed: 10/10/2023] Open
Abstract
The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families, whose members have no hits to proteins of reference genomes or Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families significantly expand (more than double) the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available at http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.
Affiliation(s)
- Fotis A Baltoumas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
| | - Sirui Liu
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
| | - Yorgos Sofianatos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
| | - I-Min Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
| | - Georgios A Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
- Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, 75 Mikras Asias Street, Athens 11527, Greece
| |
|
12
|
Chen T, Yang M, Cui G, Tang J, Shen Y, Liu J, Yuan Y, Guo J, Huang L. IMP: bridging the gap for medicinal plant genomics. Nucleic Acids Res 2024; 52:D1347-D1354. [PMID: 37870445 PMCID: PMC10767881 DOI: 10.1093/nar/gkad898] [Received: 08/04/2023] [Revised: 10/02/2023] [Accepted: 10/04/2023] [Indexed: 10/24/2023] Open
Abstract
Medicinal plants have garnered significant attention in ethnomedicine and traditional medicine due to their potential antitumor, anti-inflammatory and antioxidant properties. Recent advancements in genome sequencing and synthetic biology have revitalized interest in natural products. Despite the availability of sequenced genomes and transcriptomes of these plants, the absence of publicly accessible gene annotations and tabular formatted gene expression data has hindered their effective utilization. To address this pressing issue, we have developed IMP (Integrated Medicinal Plantomics), a freely accessible platform at https://www.bic.ac.cn/IMP. IMP curated a total of 8 565 672 genes for 84 high-quality genome assemblies, and 2156 transcriptome sequencing samples encompassing various organs, tissues, developmental stages and stimulations. With the integrated 10 analysis modules, users could simply examine gene annotations, sequences, functions, distributions and expressions in IMP in a one-stop mode. We firmly believe that IMP will play a vital role in enhancing the understanding of molecular metabolic pathways in medicinal plants or plants with medicinal benefits, thereby driving advancements in synthetic biology, and facilitating the exploration of natural sources for valuable chemical constituents like drug discovery and drug production.
Affiliation(s)
- Tong Chen
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Mei Yang
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
- Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
| | - Guanghong Cui
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Jinfu Tang
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Ye Shen
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Juan Liu
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Yuan Yuan
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Juan Guo
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| | - Luqi Huang
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100000, China
| |
|
13
|
Gumerov VM, Ulrich LE, Zhulin IB. MiST 4.0: a new release of the microbial signal transduction database, now with a metagenomic component. Nucleic Acids Res 2024; 52:D647-D653. [PMID: 37791884 PMCID: PMC10767990 DOI: 10.1093/nar/gkad847] [Received: 07/29/2023] [Revised: 09/15/2023] [Accepted: 09/21/2023] [Indexed: 10/05/2023] Open
Abstract
Signal transduction systems in bacteria and archaea link environmental stimuli to specific adaptive cellular responses. They control gene expression, motility, biofilm formation, development and other processes that are vital to survival. The microbial signal transduction (MiST) database is an online resource that stores tens of thousands of genomes and allows users to explore their signal transduction profiles, analyze genomes in bulk using the database application programming interface (API) and make testable hypotheses about the functions of newly identified signaling systems. However, signal transduction in metagenomes remained completely unexplored. To lay the foundation for research in metagenomic signal transduction, we have prepared a new release of the MiST database, MiST 4.0, which features over 10 000 metagenome-assembled genomes (MAGs), a scaled representation of proteins and detailed BioSample information. In addition, several thousands of new genomes have been processed and stored in the database. A new interface has been developed that allows users to seamlessly switch between genomes and MAGs. MiST 4.0 is freely available at https://mistdb.com; metagenomes and MAGs can also be explored using the API available on the same page.
Affiliation(s)
- Vadim M Gumerov
- Department of Microbiology and Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
| | | | - Igor B Zhulin
- Department of Microbiology and Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
| |
|
14
|
Camargo AP, Call L, Roux S, Nayfach S, Huntemann M, Palaniappan K, Ratner A, Chu K, Mukherjee S, Reddy TBK, Chen IM, Ivanova N, Eloe-Fadrosh E, Woyke T, Baltrus D, Castañeda-Barba S, de la Cruz F, Funnell BE, Hall JJ, Mukhopadhyay A, Rocha EC, Stalder T, Top E, Kyrpides N. IMG/PR: a database of plasmids from genomes and metagenomes with rich annotations and metadata. Nucleic Acids Res 2024; 52:D164-D173. [PMID: 37930866 PMCID: PMC10767988 DOI: 10.1093/nar/gkad964] [Received: 08/29/2023] [Revised: 10/09/2023] [Accepted: 10/14/2023] [Indexed: 11/08/2023] Open
Abstract
Plasmids are mobile genetic elements found in many clades of Archaea and Bacteria. They drive horizontal gene transfer, impacting ecological and evolutionary processes within microbial communities, and hold substantial importance in human health and biotechnology. To support plasmid research and provide scientists with data on an unprecedented diversity of plasmid sequences, we introduce the IMG/PR database, a new resource encompassing 699 973 plasmid sequences derived from genomes, metagenomes and metatranscriptomes. IMG/PR is the first database to provide data on plasmids that were systematically identified from diverse microbiome samples. IMG/PR plasmids are associated with rich metadata that includes geographical and ecosystem information, host taxonomy, similarity to other plasmids, functional annotation, and presence of genes involved in conjugation and antibiotic resistance. The database offers diverse methods for exploring its extensive plasmid collection, enabling users to navigate plasmids through metadata-centric queries, plasmid comparisons and BLAST searches. The web interface for IMG/PR is accessible at https://img.jgi.doe.gov/pr. Plasmid metadata and sequences can be downloaded from https://genome.jgi.doe.gov/portal/IMG_PR.
Affiliation(s)
- Antonio Pedro Camargo
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Lee Call
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Simon Roux
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Stephen Nayfach
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Marcel Huntemann
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | | | - Anna Ratner
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ken Chu
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Supratim Mukherjee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - I-Min A Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Natalia N Ivanova
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Emiley A Eloe-Fadrosh
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Tanja Woyke
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - David A Baltrus
- School of Plant Sciences, University of Arizona, Tucson AZ, USA
- School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson AZ, USA
| | | | - Fernando de la Cruz
- Instituto de Biomedicina y Biotecnología de Cantabria (Consejo Superior de Investigaciones Científicas – Universidad de Cantabria), Cantabria, Spain
| | - Barbara E Funnell
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5G 1M1, Canada
| | - James P J Hall
- Department of Evolution, Ecology and Behaviour, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool L69 7ZB, UK
| | - Aindrila Mukhopadhyay
- Joint BioEnergy Institute, Emeryville, CA 94608, USA
- Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Eduardo P C Rocha
- Institut Pasteur, Université de Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France
| | - Thibault Stalder
- Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
| | - Eva Top
- Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| |
|
15
|
Graver BA, Chakravarty N, Solomon KV. Prokaryotic Argonautes for in vivo biotechnology and molecular diagnostics. Trends Biotechnol 2024; 42:61-73. [PMID: 37451948 DOI: 10.1016/j.tibtech.2023.06.010] [Received: 04/20/2023] [Revised: 06/23/2023] [Accepted: 06/26/2023] [Indexed: 07/18/2023]
Abstract
Prokaryotic Argonautes (pAgos) are an emerging class of programmable endonucleases that are believed to be more flexible than existing CRISPR-Cas systems and have significant potential for biotechnology. Current applications of pAgos include a myriad of molecular diagnostics and in vitro DNA assembly tools. However, efforts have historically been centered on thermophilic pAgo variants. To enable in vivo biotechnological applications such as gene editing, focus has shifted to pAgos from mesophilic organisms. We discuss what is known of pAgos, how they are being developed for various applications, and strategies to overcome current challenges to in vivo applications in prokaryotes and eukaryotes.
Affiliation(s)
- Brett A Graver
- Department of Biological Sciences, University of Delaware, Newark, DE 19716, USA
| | - Namrata Chakravarty
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716, USA
| | - Kevin V Solomon
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716, USA
| |
|
16
|
Couceiro JF, Marques M, Silva SG, Keller-Costa T, Costa R. Aquimarina aquimarini sp. nov. and Aquimarina spinulae sp. nov., novel bacterial species with versatile natural product biosynthesis potential isolated from marine sponges. Int J Syst Evol Microbiol 2024; 74. [PMID: 38240740 DOI: 10.1099/ijsem.0.006228] [Indexed: 01/23/2024] Open
Abstract
This study describes two Gram-negative, flexirubin-producing, biofilm-forming, motile-by-gliding and rod-shaped bacteria, isolated from the marine sponges Ircinia variabilis and Sarcotragus spinosulus collected off the coast of Algarve, Portugal. Both strains, designated Aq135T and Aq349T, were classified into the genus Aquimarina by means of 16S rRNA gene sequencing. We then performed phylogenetic, phylogenomic and biochemical analyses to determine whether these strains represent novel Aquimarina species. Whereas the closest 16S rRNA gene relatives to strain Aq135T were Aquimarina macrocephali JAMB N27T (97.8 %) and Aquimarina sediminis w01T (97.1 %), strain Aq349T was more closely related to Aquimarina megaterium XH134T (99.2 %) and Aquimarina atlantica 22II-S11-z7T (98.1 %). Both strains showed genome-wide average nucleotide identity scores below the species level cut-off (95 %) with all Aquimarina type strains with publicly available genomes, including their closest relatives. Digital DNA-DNA hybridization further suggested a novel species status for both strains since values lower than 70 % hybridization level with other Aquimarina type strains were obtained. Strains Aq135T and Aq349T grew from 4 to 30 °C and with 1-5 % (w/v) NaCl in marine broth. The most abundant fatty acids were iso-C17 : 0 3-OH and iso-C15 : 0 and the only respiratory quinone was MK-6. Strain Aq135T was catalase-positive and β-galactosidase-negative, while Aq349T was catalase-negative and β-galactosidase-positive. These strains hold unique sets of secondary metabolite biosynthetic gene clusters and are known to produce the peptide antibiotics aquimarins (Aq135T) and the trans-AT polyketide cuniculene (Aq349T), respectively. Based on the polyphasic approach employed in this study, we propose the novel species names Aquimarina aquimarini sp. nov. (type strain Aq135T=DSM 115833T=UCCCB 169T=ATCC TSD-360T) and Aquimarina spinulae sp. nov. (type strain Aq349T=DSM 115834T=UCCCB 170T=ATCC TSD-361T).
Affiliation(s)
- Joana F Couceiro
- iBB-Institute for Bioengineering and Biosciences and i4HB-Institute for Health and Bioeconomy, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
- Department of Bioengineering, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
| | - Matilde Marques
- iBB-Institute for Bioengineering and Biosciences and i4HB-Institute for Health and Bioeconomy, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
- Department of Bioengineering, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
| | - Sandra G Silva
- iBB-Institute for Bioengineering and Biosciences and i4HB-Institute for Health and Bioeconomy, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
- Department of Bioengineering, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
| | - Tina Keller-Costa
- iBB-Institute for Bioengineering and Biosciences and i4HB-Institute for Health and Bioeconomy, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
- Department of Bioengineering, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
| | - Rodrigo Costa
- iBB-Institute for Bioengineering and Biosciences and i4HB-Institute for Health and Bioeconomy, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
- Department of Bioengineering, Instituto Superior Técnico, University of Lisbon, Av. Rovisco Pais, 1049-001 Lisbon, Portugal
| |
|
17
|
Eloe-Fadrosh EA, Mungall CJ, Miller MA, Smith M, Patil SS, Kelliher JM, Johnson LYD, Rodriguez FE, Chain PSG, Hu B, Thornton MB, McCue LA, McHardy AC, Harris NL, Reddy TBK, Mukherjee S, Hunter CI, Walls R, Schriml LM. A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics. Methods Mol Biol 2024; 2802:587-609. [PMID: 38819573 DOI: 10.1007/978-1-0716-3838-5_20] [Indexed: 06/01/2024]
Abstract
Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.
Affiliation(s)
- Emiley A Eloe-Fadrosh
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Christopher J Mungall
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Mark Andrew Miller
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Montana Smith
- Pacific Northwest National Laboratory, Richland, WA, USA
| | - Sujay Sanjeev Patil
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Julia M Kelliher
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
| | - Leah Y D Johnson
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
| | | | - Patrick S G Chain
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
| | - Bin Hu
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
| | - Michael B Thornton
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Lee Ann McCue
- Pacific Northwest National Laboratory, Richland, WA, USA
| | - Alice Carolyn McHardy
- Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Nomi L Harris
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Supratim Mukherjee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Christopher I Hunter
- GigaScience Press, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong
| | | | - Lynn M Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| |
|
18
|
Timme RE, Karsch-Mizrachi I, Waheed Z, Arita M, MacCannell D, Maguire F, Petit III R, Page AJ, Mendes CI, Nasar MI, Oluniyi P, Tyler AD, Raphenya AR, Guthrie JL, Olawoye I, Rinck G, O’Cathail C, Lees J, Cochrane G, Cummins C, Brister JR, Klimke W, Feldgarden M, Griffiths E. Putting everything in its place: using the INSDC compliant Pathogen Data Object Model to better structure genomic data submitted for public health applications. Microb Genom 2023; 9:001145. [PMID: 38085797 PMCID: PMC10763499 DOI: 10.1099/mgen.0.001145] [Received: 08/16/2023] [Accepted: 11/13/2023] [Indexed: 12/18/2023] Open
Abstract
Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need to access not only the whole genomes of multiple pathogens but also make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasites), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.
Affiliation(s)
- Ruth E. Timme
- Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, MD, USA
| | - Ilene Karsch-Mizrachi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Masanori Arita
- DNA Data Bank of Japan, National Institute of Genetics, Mishima, Japan
| | - Duncan MacCannell
- National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Finlay Maguire
- Department of Community Health & Epidemiology, Faculty of Medicine, Dalhousie University, Halifax, Canada
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
| | | | - Andrew J. Page
- Quadram Institute Bioscience, Norwich, Norfolk, UK
- Theiagen Genomics LLC, Highlands Ranch, CO, USA
| | | | - Muhammad Ibtisam Nasar
- Department of Biology, College of Science, United Arab Emirates University- Al Ain, Abu Dhabi, UAE
| | - Paul Oluniyi
- Chan Zuckerberg Biohub Network, San Francisco, CA, USA
| | - Andrea D. Tyler
- Science Technology Cores and Services, National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Canada
| | - Amogelang R. Raphenya
- Department of Biochemistry and Biomedical Sciences and the Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
| | - Jennifer L. Guthrie
- Schulich School of Medicine & Dentistry, University of Western Ontario, London, Ontario, Canada
| | - Idowu Olawoye
- Schulich School of Medicine & Dentistry, University of Western Ontario, London, Ontario, Canada
| | - Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Colman O’Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - John Lees
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - J. Rodney Brister
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - William Klimke
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Michael Feldgarden
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Emma Griffiths
- Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
| |
|
19
|
Meng D, Ai S, Spanos M, Shi X, Li G, Cretoiu D, Zhou Q, Xiao J. Exercise and microbiome: From big data to therapy. Comput Struct Biotechnol J 2023; 21:5434-5445. [PMID: 38022690 PMCID: PMC10665598 DOI: 10.1016/j.csbj.2023.10.034] [Received: 06/13/2023] [Revised: 10/16/2023] [Accepted: 10/17/2023] [Indexed: 12/01/2023] Open
Abstract
Exercise is a vital component in maintaining optimal health and serves as a prospective therapeutic intervention for various diseases. The human microbiome, comprised of trillions of microorganisms, plays a crucial role in overall health. Given the advancements in microbiome research, substantial databases have been created to decipher the functionality and mechanisms of the microbiome in health and disease contexts. This review presents an initial overview of microbiomics development and related databases, followed by an in-depth description of the multi-omics technologies for microbiome. It subsequently synthesizes the research pertaining to exercise-induced modifications of the microbiome and diseases that impact the microbiome. Finally, it highlights the potential therapeutic implications of an exercise-modulated microbiome in intestinal disease, obesity and diabetes, cardiovascular disease, and immune/inflammation-related diseases.
Affiliation(s)
- Danni Meng
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai 200444, China
| | - Songwei Ai
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai 200444, China
| | - Michail Spanos
- Cardiovascular Division of the Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Xiaohui Shi
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai 200444, China
| | - Guoping Li
- Cardiovascular Division of the Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Dragos Cretoiu
- Department of Medical Genetics, Carol Davila University of Medicine and Pharmacy, Bucharest 020031, Romania
- Materno-Fetal Assistance Excellence Unit, Alessandrescu-Rusescu National Institute for Mother and Child Health, Bucharest 011062, Romania
| | - Qiulian Zhou
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai 200444, China
| | - Junjie Xiao
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai 200444, China
| |
|
20
|
Hackmann TJ, Zhang B. The phenotype and genotype of fermentative prokaryotes. Sci Adv 2023; 9:eadg8687. [PMID: 37756392 PMCID: PMC10530074 DOI: 10.1126/sciadv.adg8687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Fermentation is a type of metabolism pervasive in oxygen-deprived environments. Despite its importance, we know little about the range and traits of organisms that carry out this metabolism. Our study addresses this gap with a comprehensive analysis of the phenotype and genotype of fermentative prokaryotes. We assembled a dataset with phenotypic records of 8350 organisms plus 4355 genomes and 13.6 million genes. Our analysis reveals fermentation is both widespread (in ~30% of prokaryotes) and complex (forming ~300 combinations of metabolites). Furthermore, it points to previously uncharacterized proteins involved in this metabolism. Previous studies suggest that metabolic pathways for fermentation are well understood, but metabolic models built in our study show gaps in our knowledge. This study demonstrates the complexity of fermentation while showing that there is still much to learn about this metabolism. All resources in our study can be explored by the scientific community with an online, interactive tool.
Affiliation(s)
| | - Bo Zhang
- Department of Chemical Engineering, University of California, Santa Barbara, CA, USA
| |
|
21
|
Arias PM, Butler J, Randhawa GS, Soltysiak MPM, Hill KA, Kari L. Environment and taxonomy shape the genomic signature of prokaryotic extremophiles. Sci Rep 2023; 13:16105. [PMID: 37752120 PMCID: PMC10522608 DOI: 10.1038/s41598-023-42518-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 09/11/2023] [Indexed: 09/28/2023] Open
Abstract
This study provides comprehensive quantitative evidence suggesting that adaptations to extreme temperatures and pH imprint a discernible environmental component in the genomic signature of microbial extremophiles. Both supervised and unsupervised machine learning algorithms were used to analyze genomic signatures, each computed as the k-mer frequency vector of a 500 kbp DNA fragment arbitrarily selected to represent a genome. Computational experiments classified/clustered genomic signatures extracted from a curated dataset of [Formula: see text] extremophile (temperature, pH) bacteria and archaea genomes, at multiple scales of analysis, [Formula: see text]. The supervised learning resulted in high accuracies for taxonomic classifications at [Formula: see text], and medium to medium-high accuracies for environment category classifications of the same datasets at [Formula: see text]. For [Formula: see text], our findings were largely consistent with amino acid compositional biases and codon usage patterns in coding regions, previously attributed to extreme environment adaptations. The unsupervised learning of unlabelled sequences identified several exemplars of hyperthermophilic organisms with large similarities in their genomic signatures, in spite of belonging to different domains in the Tree of Life.
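The genomic signature mentioned in this abstract, a k-mer frequency vector of a DNA fragment, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `kmer_frequency_vector`, the lexicographic ACGT ordering, the choice to skip k-mers containing other symbols, and the toy input sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(sequence: str, k: int) -> list[float]:
    """Return normalized k-mer frequencies in lexicographic ACGT order.

    Hypothetical helper: k-mers containing symbols other than A, C, G, T
    (for example N) are ignored rather than redistributed.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Fixed ordering of all 4**k possible k-mers, so vectors are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Toy fragment for illustration only; the study used 500 kbp fragments.
signature = kmer_frequency_vector("ACGTACGTAC", k=2)
```

Vectors built this way have a fixed length of 4^k regardless of fragment length, which is what allows them to be passed to standard classifiers or clustering methods.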
Affiliation(s)
- Pablo Millán Arias
- School of Computer Science, University of Waterloo, Waterloo, ON, Canada
| | - Joseph Butler
- Department of Biology, University of Western Ontario, London, ON, Canada
| | - Gurjit S Randhawa
- School of Mathematical and Computational Sciences, University of Prince Edward Island, Charlottetown, PE, Canada
| | | | - Kathleen A Hill
- Department of Biology, University of Western Ontario, London, ON, Canada
| | - Lila Kari
- School of Computer Science, University of Waterloo, Waterloo, ON, Canada
| |
|
22
|
Peng Y, Zhang L, Mok CKP, Ching JYL, Zhao S, Wong MKL, Zhu J, Chen C, Wang S, Yan S, Qin B, Liu Y, Zhang X, Cheung CP, Cheong PK, Ip KL, Fung ACH, Wong KKY, Hui DSC, Chan FKL, Ng SC, Tun HM. Baseline gut microbiota and metabolome predict durable immunogenicity to SARS-CoV-2 vaccines. Signal Transduct Target Ther 2023; 8:373. [PMID: 37743379 PMCID: PMC10518331 DOI: 10.1038/s41392-023-01629-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 08/22/2023] [Accepted: 08/25/2023] [Indexed: 09/26/2023] Open
Abstract
The role of gut microbiota in modulating the durability of COVID-19 vaccine immunity is yet to be characterised. In this cohort study, we collected blood and stool samples of 121 BNT162b2 and 40 CoronaVac vaccinees at baseline, 1 month, and 6 months post vaccination (p.v.). Neutralisation antibody, plasma cytokine and chemokines were measured and associated with the gut microbiota and metabolome composition. A significantly higher level of neutralising antibody (at 6 months p.v.) was found in BNT162b2 vaccinees who had higher relative abundances of Bifidobacterium adolescentis, Bifidobacterium bifidum, and Roseburia faecis as well as higher concentrations of nicotinic acid (Vitamin B) and γ-Aminobutyric acid (P < 0.05) at baseline. CoronaVac vaccinees with high neutralising antibodies at 6 months p.v. had an increased relative abundance of Phocaeicola dorei, a lower relative abundance of Faecalibacterium prausnitzii, and a higher concentration of L-tryptophan (P < 0.05) at baseline. A higher antibody level at 6 months p.v. was also associated with a higher relative abundance of Dorea formicigenerans at 1 month p.v. among CoronaVac vaccinees (Rho = 0.62, p = 0.001, FDR = 0.123). Of the species altered following vaccination, 79.4% and 42.0% in the CoronaVac and BNT162b2 groups, respectively, recovered at 6 months. Specific to CoronaVac vaccinees, both bacteriome and virome diversity depleted following vaccination and did not recover to baseline at 6 months p.v. (FDR < 0.1). In conclusion, this study identified potential microbiota-based adjuvants that may extend the durability of immune responses to SARS-CoV-2 vaccines.
Affiliation(s)
- Ye Peng
- Microbiota I-Center (MagIC), Hong Kong, China
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Lin Zhang
- Microbiota I-Center (MagIC), Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Chris K P Mok
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Jessica Y L Ching
- Microbiota I-Center (MagIC), Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Shilin Zhao
- Microbiota I-Center (MagIC), Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Matthew K L Wong
- Microbiota I-Center (MagIC), Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Jie Zhu
- Microbiota I-Center (MagIC), Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Chunke Chen
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Shilan Wang
- Microbiota I-Center (MagIC), Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Shuai Yan
- Microbiota I-Center (MagIC), Hong Kong, China
| | - Biyan Qin
- Microbiota I-Center (MagIC), Hong Kong, China
| | - Yingzhi Liu
- Microbiota I-Center (MagIC), Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Xi Zhang
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Chun Pun Cheung
- Microbiota I-Center (MagIC), Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Pui Kuan Cheong
- Microbiota I-Center (MagIC), Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Ka Long Ip
- Microbiota I-Center (MagIC), Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Adrian C H Fung
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Kenneth K Y Wong
- Department of Surgery, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - David S C Hui
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
- Stanley Ho Centre for Emerging Infectious Diseases, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Francis K L Chan
- Microbiota I-Center (MagIC), Hong Kong, China
- Centre for Gut Microbiota Research, The Chinese University of Hong Kong, Hong Kong, China
| | - Siew C Ng
- Microbiota I-Center (MagIC), Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
| | - Hein M Tun
- Microbiota I-Center (MagIC), Hong Kong, China
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
- Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| |
|
23
|
Pandey S, Avuthu N, Guda C. StrainIQ: A Novel n-Gram-Based Method for Taxonomic Profiling of Human Microbiota at the Strain Level. Genes (Basel) 2023; 14:1647. [PMID: 37628698 PMCID: PMC10454763 DOI: 10.3390/genes14081647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 08/13/2023] [Accepted: 08/15/2023] [Indexed: 08/27/2023] Open
Abstract
The emergence of next-generation sequencing (NGS) technology has greatly influenced microbiome research and led to the development of novel bioinformatics tools to deeply analyze metagenomics datasets. Identifying strain-level variations in microbial communities is important to understanding the onset and progression of diseases, host-pathogen interrelationships, and drug resistance, in addition to designing new therapeutic regimens. In this study, we developed a novel tool called StrainIQ (strain identification and quantification) based on a new n-gram-based (series of n number of adjacent nucleotides in the DNA sequence) algorithm for predicting and quantifying strain-level taxa from whole-genome metagenomic sequencing data. We thoroughly evaluated our method using simulated and mock metagenomic datasets and compared its performance with existing methods. On average, it showed 85.8% sensitivity and 78.2% specificity on simulated datasets. It also showed higher specificity and sensitivity using n-gram models built from reduced reference genomes and on models with lower coverage sequencing data. It outperforms alternative approaches in genus- and strain-level prediction and strain abundance estimation. Overall, the results show that StrainIQ achieves high accuracy by implementing customized model-building and is an efficient tool for site-specific microbial community profiling.
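For readers unfamiliar with the two metrics quoted in this abstract, the following Python sketch shows the standard definitions of sensitivity and specificity. The abstract does not say how true and false calls were counted, so the function name `sensitivity_specificity` and the example counts are hypothetical and are not StrainIQ results.

```python
def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of truly present strains detected.
    specificity = TN / (TN + FP): fraction of truly absent strains not reported.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=8, fp=1, tn=9, fn=2)
```

Returning NaN when a denominator is zero is a design choice for this sketch; a benchmark with no truly absent strains, for example, has no defined specificity.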
Affiliation(s)
- Sanjit Pandey
- Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
| | - Nagavardhini Avuthu
- Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
| | - Chittibabu Guda
- Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Center for Biomedical Informatics Research and Innovation, University of Nebraska Medical Center, Omaha, NE 68198, USA
| |
|
24
|
Marín-Paredes R, Bolívar-Torres HH, Coronel-Gaytán A, Martínez-Romero E, Servín-Garcidueñas LE. A Metagenome from a Steam Vent in Los Azufres Geothermal Field Shows an Abundance of Thermoplasmatales archaea and Bacteria from the Phyla Actinomycetota and Pseudomonadota. Curr Issues Mol Biol 2023; 45:5849-5864. [PMID: 37504286 PMCID: PMC10378326 DOI: 10.3390/cimb45070370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 07/09/2023] [Accepted: 07/10/2023] [Indexed: 07/29/2023] Open
Abstract
Los Azufres National Park is a geothermal field that has a wide number of thermal manifestations; nevertheless, the microbial communities in many of these environments remain unknown. In this study, a metagenome from a sediment sample from Los Azufres National Park was sequenced. In this metagenome, we found that the microbial diversity corresponds to bacteria (Actinomycetota, Pseudomonadota), archaea (Thermoplasmatales and Candidatus Micrarchaeota and Candidatus Parvarchaeota), eukarya (Cyanidiaceae), and viruses (Fussellovirus and Caudoviricetes). The functional annotation showed genes related to the carbon fixation pathway, sulfur metabolism, genes involved in heat and cold shock, and heavy-metal resistance. From the sediment, it was possible to recover two metagenome-assembled genomes from Ferrimicrobium and Cuniculiplasma. Our results showed that there are a large number of microorganisms in Los Azufres that deserve to be studied.
Affiliation(s)
- Roberto Marín-Paredes
- Laboratorio de Microbiómica, Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Autónoma de México, Morelia 58341, Mexico
| | - Hermes H Bolívar-Torres
- Escuela de Ciencias Biológicas, Universidad Pedagógica y Tecnológica de Colombia, Tunja 150003, Colombia
| | - Alberto Coronel-Gaytán
- Laboratorio de Microbiómica, Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Autónoma de México, Morelia 58341, Mexico
| | | | - Luis E Servín-Garcidueñas
- Laboratorio de Microbiómica, Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Autónoma de México, Morelia 58341, Mexico
- Laboratorio Nacional de Análisis y Síntesis Ecológica, Escuela Nacional de Estudios Superiores Unidad Morelia, Morelia 58341, Mexico
| |
|
25
|
Shen K, Din AU, Sinha B, Zhou Y, Qian F, Shen B. Translational informatics for human microbiota: data resources, models and applications. Brief Bioinform 2023; 24:7152256. [PMID: 37141135 DOI: 10.1093/bib/bbad168] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/26/2022] [Revised: 04/07/2023] [Accepted: 04/11/2023] [Indexed: 05/05/2023] Open
Abstract
With the rapid development of human intestinal microbiology and diverse microbiome-related studies and investigations, a large amount of data have been generated and accumulated. Meanwhile, different computational and bioinformatics models have been developed for pattern recognition and knowledge discovery using these data. Given the heterogeneity of these resources and models, we aimed to provide a landscape of the data resources, a comparison of the computational models and a summary of the translational informatics applied to microbiota data. We first review the existing databases, knowledge bases, knowledge graphs and standardizations of microbiome data. Then, the high-throughput sequencing techniques for the microbiome and the informatics tools for their analyses are compared. Finally, translational informatics for the microbiome, including biomarker discovery, personalized treatment and smart healthcare for complex diseases, are discussed.
Affiliation(s)
- Ke Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Ahmad Ud Din
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Baivab Sinha
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Yi Zhou
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Fuliang Qian
- Center for Systems Biology, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Suzhou 215123, China
| | - Bairong Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| |
|
26
|
Mukherjee S, Ovchinnikova G, Stamatis D, Li CT, Chen IMA, Kyrpides NC, Reddy TBK. Standardized naming of microbiome samples in Genomes OnLine Database. Database (Oxford) 2023; 2023:7042581. [PMID: 36794865 PMCID: PMC9933444 DOI: 10.1093/database/baad001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 01/24/2023] [Indexed: 02/17/2023]
Abstract
The power of next-generation sequencing has resulted in an explosive growth in the number of projects aiming to understand the metagenomic diversity of complex microbial environments. The interdisciplinary nature of this microbiome research community, along with the absence of reporting standards for microbiome data and samples, poses a significant challenge for follow-up studies. Commonly used names of metagenomes and metatranscriptomes in public databases currently lack the essential information necessary to accurately describe and classify the underlying samples, which makes a comparative analysis difficult to conduct and often results in misclassified sequences in data repositories. The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute has been at the forefront of addressing this challenge by developing a standardized nomenclature system for naming microbiome samples. GOLD, currently marking its twenty-fifth anniversary, continues to enrich the research community with hundreds of thousands of metagenomes and metatranscriptomes with well-curated and easy-to-understand names. Through this manuscript, we describe the overall naming process that can be easily adopted by researchers worldwide. Additionally, we propose the use of this naming system as a best practice for the scientific community to facilitate better interoperability and reusability of microbiome data.
Affiliation(s)
- Supratim Mukherjee
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Galina Ovchinnikova
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Dimitri Stamatis
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Cindy Tianqing Li
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - I-Min A Chen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - T B K Reddy
- *Corresponding author: Tel: +1 408 505 8273
| |
|
27
|
Sonke A, Trembath-Reichert E. Expanding the taxonomic and environmental extent of an underexplored carbon metabolism-oxalotrophy. Front Microbiol 2023; 14:1161937. [PMID: 37213515 PMCID: PMC10192776 DOI: 10.3389/fmicb.2023.1161937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 04/11/2023] [Indexed: 05/23/2023] Open
Abstract
Oxalate serves various functions in the biological processes of plants, fungi, bacteria, and animals. It occurs naturally in the minerals weddellite and whewellite (calcium oxalates) or as oxalic acid. The environmental accumulation of oxalate is disproportionately low compared to the prevalence of highly productive oxalogens, namely plants. It is hypothesized that oxalotrophic microbes limit oxalate accumulation by degrading oxalate minerals to carbonates via an under-explored biogeochemical cycle known as the oxalate-carbonate pathway (OCP). Neither the diversity nor the ecology of oxalotrophic bacteria is fully understood. This research investigated the phylogenetic relationships of the bacterial genes oxc, frc, oxdC, and oxlT, which encode key enzymes for oxalotrophy, using bioinformatic approaches and publicly available omics datasets. Phylogenetic trees of oxc and oxdC genes demonstrated grouping by both source environment and taxonomy. All four trees included genes from metagenome-assembled genomes (MAGs) that contained novel lineages and environments for oxalotrophs. In particular, sequences of each gene were recovered from marine environments. These results were supported with marine transcriptome sequences and description of key amino acid residue conservation. Additionally, we investigated the theoretical energy yield from oxalotrophy across marine-relevant pressure and temperature conditions and found similar standard state Gibbs free energy to "low energy" marine sediment metabolisms, such as anaerobic oxidation of methane coupled to sulfate reduction. These findings suggest further need to understand the role of bacterial oxalotrophy in the OCP, particularly in marine environments, and its contribution to global carbon cycling.
|
28
|
Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA. Exploring microbial functional biodiversity at the protein family level-From metagenomic sequence reads to annotated protein clusters. Front Bioinform 2023; 3:1157956. [PMID: 36959975 PMCID: PMC10029925 DOI: 10.3389/fbinf.2023.1157956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed in order to explore biodiversity at the protein family level. We provide details for analysis tools and we comment on their scalability as well as their advantages and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.
Affiliation(s)
- Fotis A. Baltoumas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - David Paez-Espino
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
| | - Nefeli K. Venetsianou
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - Eleni Aplakidou
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - Anastasis Oulas
- The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Robert D. Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States
| | - Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
| | - Nikos C. Kyrpides
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
| | - Georgios A. Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
- Hellenic Army Academy, Vari, Greece
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
| |
|