1
Yadav A, Subramanian S. HiFiBGC: an ensemble approach for improved biosynthetic gene cluster detection in PacBio HiFi-read metagenomes. BMC Genomics 2024; 25:1096. [PMID: 39550535 PMCID: PMC11569603 DOI: 10.1186/s12864-024-10950-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Accepted: 10/24/2024] [Indexed: 11/18/2024] Open
Abstract
BACKGROUND Microbes produce diverse bioactive natural products with applications in fields such as medicine and agriculture. In their genomes, these natural products are encoded by physically clustered genes known as biosynthetic gene clusters (BGCs). Genome and metagenome sequencing advances have enabled high-throughput identification of BGCs as a promising avenue for natural product discovery. BGC mining from (meta)genomes using in silico tools has allowed access to a vast diversity of potentially novel natural products. However, a fundamental limitation has been the ability to assemble complete BGCs, especially from complex metagenomes. With their fragmented assemblies, short-read technologies struggle to recover complete BGCs, such as the long and repetitive nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS). Recent advances in long-read sequencing, such as the High Fidelity (HiFi) technology from PacBio, have reduced this limitation and can help retrieve both accurate and complete BGCs from metagenomes, warranting improvement in the existing BGC identification approach for better utilization of HiFi data. RESULTS Here, we present HiFiBGC, a command-line-based workflow to identify BGCs in PacBio HiFi metagenomes. HiFiBGC leverages an ensemble of assemblies from three HiFi-tailored metagenome assemblers and the reads not represented in these assemblies. Based on our analyses of four HiFi metagenomic datasets from four different environments, we show that HiFiBGC identifies, on average, 78% more BGCs than the top-performing single-assembler-based method. This increase is due to HiFiBGC's ensemble assembly approach, which improves recovery by 25%, and to the inclusion of mostly fragmented BGCs identified in the unmapped reads. CONCLUSIONS HiFiBGC is a computational workflow for identifying BGCs in long-read HiFi metagenomes, implemented mainly in Python and the workflow manager Snakemake.
HiFiBGC is available on GitHub at https://github.com/ay-amityadav/HiFiBGC under the MIT license. The code related to the figures and analyses presented in the manuscript is available at https://github.com/ay-amityadav/HiFiBGC_analyses .
Affiliation(s)
- Amit Yadav
- CSIR-Institute of Microbial Technology (IMTECH), Sector 39-A, Chandigarh, 160036, India.
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India.
- Srikrishna Subramanian
- CSIR-Institute of Microbial Technology (IMTECH), Sector 39-A, Chandigarh, 160036, India.
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India.
2
Lui LM, Nielsen TN. Decomposing a San Francisco estuary microbiome using long-read metagenomics reveals species- and strain-level dominance from picoeukaryotes to viruses. mSystems 2024; 9:e0024224. [PMID: 39158287 PMCID: PMC11406994 DOI: 10.1128/msystems.00242-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 07/11/2024] [Indexed: 08/20/2024] Open
Abstract
Although long-read sequencing has enabled obtaining high-quality and complete genomes from metagenomes, many challenges still remain to completely decompose a metagenome into its constituent prokaryotic and viral genomes. This study focuses on decomposing an estuarine metagenome to obtain a more accurate estimate of microbial diversity. To achieve this, we developed a new bead-based DNA extraction method, a novel bin refinement method, and obtained 150 Gbp of Nanopore sequencing. We estimate that there are ~500 bacterial and archaeal species in our sample and obtained 68 high-quality bins (>90% complete, <5% contamination, ≤5 contigs, contig length of >100 kbp, and all ribosomal and tRNA genes). We also obtained many contigs of picoeukaryotes, environmental DNA of larger eukaryotes such as mammals, and complete mitochondrial and chloroplast genomes and detected ~40,000 viral populations. Our analysis indicates that there are only a few strains that comprise most of the species abundances. IMPORTANCE Ocean and estuarine microbiomes play critical roles in global element cycling and ecosystem function. Despite the importance of these microbial communities, many species still have not been cultured in the lab. Environmental sequencing is the primary way the function and population dynamics of these communities can be studied. Long-read sequencing provides an avenue to overcome limitations of short-read technologies to obtain complete microbial genomes but comes with its own technical challenges, such as needed sequencing depth and obtaining high-quality DNA. We present here new sampling and bioinformatics methods to attempt decomposing an estuarine microbiome into its constituent genomes. Our results suggest there are only a few strains that comprise most of the species abundances from viruses to picoeukaryotes, and to fully decompose a metagenome of this diversity requires 1 Tbp of long-read sequencing. 
We anticipate that as long-read sequencing technologies continue to improve, less sequencing will be needed.
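The high-quality bin criteria quoted in this abstract (>90% complete, <5% contamination, ≤5 contigs, contigs longer than 100 kbp, and all ribosomal and tRNA genes present) can be read as a simple filter. The sketch below is only an illustration under stated assumptions: the `Bin` record, its field names, and the reading of "contig length of >100 kbp" as a per-contig minimum are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """Hypothetical summary record for one metagenomic bin (field names are assumptions)."""
    name: str
    completeness: float        # percent, e.g. from a completeness estimator
    contamination: float       # percent
    contig_lengths_bp: list[int]
    has_all_rrna: bool         # all ribosomal RNA genes detected
    has_all_trna: bool         # all tRNA genes detected


def is_high_quality(b: Bin) -> bool:
    """Apply the thresholds quoted in the abstract to a single bin."""
    return (
        b.completeness > 90.0
        and b.contamination < 5.0
        and 0 < len(b.contig_lengths_bp) <= 5
        # Assumption: every contig must exceed 100 kbp.
        and min(b.contig_lengths_bp) > 100_000
        and b.has_all_rrna
        and b.has_all_trna
    )


# Made-up example bins, for illustration only.
bins = [
    Bin("bin_a", 97.2, 1.1, [2_400_000, 310_000], True, True),
    Bin("bin_b", 93.5, 6.3, [1_900_000], True, True),       # too contaminated
    Bin("bin_c", 95.0, 0.8, [1_500_000, 40_000], True, True),  # one short contig
]
high_quality = [b.name for b in bins if is_high_quality(b)]
```

With these made-up inputs only `bin_a` passes; in practice the completeness and contamination values would come from whatever quality-assessment tool the analysis uses.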
Affiliation(s)
- Lauren M Lui
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Torben N Nielsen
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
3
Funnicelli MIG, de Carvalho LAL, Teheran-Sierra LG, Dibelli SC, Lemos EGDM, Pinheiro DG. Unveiling genomic features linked to traits of plant growth-promoting bacterial communities from sugarcane. The Science of the Total Environment 2024; 947:174577. [PMID: 38981540 DOI: 10.1016/j.scitotenv.2024.174577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 07/04/2024] [Accepted: 07/05/2024] [Indexed: 07/11/2024]
Abstract
Microorganisms are ubiquitous, and those inhabiting plants have been the subject of several studies. Plant-associated bacteria exhibit various biological mechanisms that enable them to colonize host plants and, in some cases, enhance their fitness. In this study, we describe the genomic features predicted to be associated with plant growth-promoting traits in six bacterial communities isolated from sugarcane. The use of highly accurate single-molecule real-time sequencing technology for metagenomic samples from these bacterial communities allowed us to recover 17 genomes. The taxonomic assignments for the binned genomes were performed, revealing taxa distributed across three main phyla: Bacillota, Bacteroidota, and Pseudomonadota, with the latter being the most representative. Subsequently, we functionally annotated the metagenome-assembled genomes (MAGs) to characterize their metabolic pathways related to plant growth-promoting traits. Our study successfully identified the enrichment of important functions related to phosphate and potassium acquisition, modulation of phytohormones, and mechanisms for coping with abiotic stress. These findings could be linked to the robust colonization of these sugarcane endophytes.
Affiliation(s)
- Michelli Inácio Gonçalves Funnicelli
- Laboratory of Bioinformatics, Department of Agricultural, Livestock and Environmental Biotechnology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil; Graduate Program in Agricultural and Livestock Microbiology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil
- Lucas Amoroso Lopes de Carvalho
- Laboratory of Bioinformatics, Department of Agricultural, Livestock and Environmental Biotechnology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil; Graduate Program in Agricultural and Livestock Microbiology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil
- Luis Guillermo Teheran-Sierra
- Agronomy Research Program, Colombian Oil Palm Research Center, Cenipalma, Calle 98 No. 70-91, Piso 14, Bogotá 111121, Colombia
- Sabrina Custodio Dibelli
- Laboratory of Bioinformatics, Department of Agricultural, Livestock and Environmental Biotechnology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil; Graduate Program in Agricultural and Livestock Microbiology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil
- Eliana Gertrudes de Macedo Lemos
- Graduate Program in Agricultural and Livestock Microbiology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil; Molecular Biology Laboratory, Institute for Research in Bioenergy (IPBEN), São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil
- Daniel Guariz Pinheiro
- Laboratory of Bioinformatics, Department of Agricultural, Livestock and Environmental Biotechnology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil; Graduate Program in Agricultural and Livestock Microbiology, São Paulo State University (UNESP), School of Agricultural and Veterinary Sciences, Jaboticabal, SP, Brazil.
4
Agustinho DP, Fu Y, Menon VK, Metcalf GA, Treangen TJ, Sedlazeck FJ. Unveiling microbial diversity: harnessing long-read sequencing technology. Nat Methods 2024; 21:954-966. [PMID: 38689099 DOI: 10.1038/s41592-024-02262-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 03/29/2024] [Indexed: 05/02/2024]
Abstract
Long-read sequencing has recently transformed metagenomics, enhancing strain-level pathogen characterization, enabling accurate and complete metagenome-assembled genomes, and improving microbiome taxonomic classification and profiling. These advancements are due not only to improvements in sequencing accuracy but also to rapidly changing analysis methods. In this Review, we explore long-read sequencing's profound impact on metagenomics, focusing on computational pipelines for genome assembly, taxonomic characterization and variant detection, to summarize recent advancements in the field and provide an overview of available analytical methods to fully leverage long reads. We provide insights into the advantages and disadvantages of long reads over short reads and their evolution from the early days of long-read sequencing to their recent impact on metagenomics and clinical diagnostics. We further point out remaining challenges for the field such as the integration of methylation signals in sub-strain analysis and the lack of benchmarks.
Affiliation(s)
- Daniel P Agustinho
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Yilei Fu
- Department of Computer Science, Rice University, Houston, TX, USA
- Vipin K Menon
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Senior research project manager, Human Genetics, Genentech, South San Francisco, CA, USA
- Ginger A Metcalf
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Todd J Treangen
- Department of Computer Science, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Computer Science, Rice University, Houston, TX, USA.
5
Molina-Pardines C, Haro-Moreno JM, López-Pérez M. Phosphate-related genomic islands as drivers of environmental adaptation in the streamlined marine alphaproteobacterial HIMB59. mSystems 2023; 8:e0089823. [PMID: 38054740 PMCID: PMC10734472 DOI: 10.1128/msystems.00898-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 10/17/2023] [Indexed: 12/07/2023] Open
Abstract
IMPORTANCE These results shed light on the evolutionary strategies of microbes with streamlined genomes to adapt and survive in the oligotrophic conditions that dominate the surface waters of the global ocean. At the individual level, these microbes have been subjected to evolutionary constraints that have led to a more efficient use of nutrients by removing non-essential genes, as described by the "streamlining theory." However, at the population level, they conserve a highly diverse gene pool in flexible genomic islands, resulting in polyclonal populations on the same genomic background as an evolutionary response to environmental pressures. Localization of these islands at equivalent positions in the genome facilitates horizontal transfer between clonal lineages. This high level of environmental genomic heterogeneity could explain their cosmopolitan distribution. In the case of the order HIMB59 within the class Alphaproteobacteria, two factors exert evolutionary pressure and determine this intraspecific diversity: phages and the concentration of P in the environment.
Affiliation(s)
- Carmen Molina-Pardines
- Evolutionary Genomics Group, División de Microbiología, Universidad Miguel Hernández, San Juan, Alicante, Spain
- Jose M. Haro-Moreno
- Evolutionary Genomics Group, División de Microbiología, Universidad Miguel Hernández, San Juan, Alicante, Spain
- Mario López-Pérez
- Evolutionary Genomics Group, División de Microbiología, Universidad Miguel Hernández, San Juan, Alicante, Spain
6
Bohn T, Balbuena E, Ulus H, Iddir M, Wang G, Crook N, Eroglu A. Carotenoids in Health as Studied by Omics-Related Endpoints. Adv Nutr 2023; 14:1538-1578. [PMID: 37678712 PMCID: PMC10721521 DOI: 10.1016/j.advnut.2023.09.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/17/2023] [Revised: 08/25/2023] [Accepted: 09/01/2023] [Indexed: 09/09/2023] Open
Abstract
Carotenoids have been associated with risk reduction for several chronic diseases, including the association of their dietary intake/circulating levels with reduced incidence of obesity, type 2 diabetes, certain types of cancer, and even lower total mortality. In addition to some carotenoids constituting vitamin A precursors, they are implicated in potential antioxidant effects and pathways related to inflammation and oxidative stress, including transcription factors such as nuclear factor κB and nuclear factor erythroid 2-related factor 2. Carotenoids and metabolites may also interact with nuclear receptors, mainly retinoic acid receptor/retinoid X receptor and peroxisome proliferator-activated receptors, which play a role in the immune system and cellular differentiation. Therefore, a large number of downstream targets are likely influenced by carotenoids, including but not limited to genes and proteins implicated in oxidative stress and inflammation, antioxidation, and cellular differentiation processes. Furthermore, recent studies also propose an association between carotenoid intake and gut microbiota. While all these endpoints could be individually assessed, a more complete/integrative way to determine a multitude of health-related aspects of carotenoids includes (multi)omics-related techniques, especially transcriptomics, proteomics, lipidomics, and metabolomics, as well as metagenomics, measured in a variety of biospecimens including plasma, urine, stool, white blood cells, or other tissue cellular extracts. In this review, we highlight the use of omics technologies to assess health-related effects of carotenoids in mammalian organisms and models.
Affiliation(s)
- Torsten Bohn
- Nutrition and Health Research Group, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.
- Emilio Balbuena
- Department of Molecular and Structural Biochemistry, College of Agriculture and Life Sciences, North Carolina State University, Raleigh, NC, United States; Plants for Human Health Institute, North Carolina Research Campus, North Carolina State University, Kannapolis, NC, United States
- Hande Ulus
- Plants for Human Health Institute, North Carolina Research Campus, North Carolina State University, Kannapolis, NC, United States
- Mohammed Iddir
- Nutrition and Health Research Group, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
- Genan Wang
- Department of Chemical and Biomolecular Engineering, College of Engineering, North Carolina State University, Raleigh, NC, United States
- Nathan Crook
- Department of Chemical and Biomolecular Engineering, College of Engineering, North Carolina State University, Raleigh, NC, United States
- Abdulkerim Eroglu
- Department of Molecular and Structural Biochemistry, College of Agriculture and Life Sciences, North Carolina State University, Raleigh, NC, United States; Plants for Human Health Institute, North Carolina Research Campus, North Carolina State University, Kannapolis, NC, United States.
7
Huang R, Wang Y, Liu D, Wang S, Lv H, Yan Z. Long-Read Metagenomics of Marine Microbes Reveals Diversely Expressed Secondary Metabolites. Microbiol Spectr 2023; 11:e0150123. [PMID: 37409950 PMCID: PMC10434046 DOI: 10.1128/spectrum.01501-23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/10/2023] [Accepted: 06/14/2023] [Indexed: 07/07/2023] Open
Abstract
Microbial secondary metabolites play crucial roles in microbial competition, communication, resource acquisition, antibiotic production, and a variety of other biotechnological processes. The retrieval of full-length BGC (biosynthetic gene cluster) sequences from uncultivated bacteria is difficult due to the technical constraints of short-read sequencing, making it impossible to determine BGC diversity. Using long-read sequencing and genome mining, 339 mainly full-length BGCs were recovered in this study, illuminating the wide range of BGCs from uncultivated lineages discovered in seawater from Aoshan Bay, Yellow Sea, China. Many extremely diverse BGCs were discovered in bacterial phyla such as Proteobacteria, Bacteroidota, Acidobacteriota, and Verrucomicrobiota as well as the previously uncultured archaeal phylum "Candidatus Thermoplasmatota." The data from metatranscriptomics showed that 30.1% of secondary metabolic genes were being expressed, and they also revealed the expression pattern of BGC core biosynthetic genes and tailoring enzymes. Taken together, our results demonstrate that long-read metagenomic sequencing combined with metatranscriptomic analysis provides a direct view into the functional expression of BGCs in environmental processes. IMPORTANCE Genome mining of metagenomic data has become the preferred method for the bioprospecting of novel compounds by cataloguing secondary metabolite potential. However, the accurate detection of BGCs requires unfragmented genomic assemblies, which have been technically difficult to obtain from metagenomes until recently with new long-read technologies. We used high-quality metagenome-assembled genomes generated from long-read data to determine the biosynthetic potential of microbes found in the surface water of the Yellow Sea. We recovered 339 highly diverse and mostly full-length BGCs from largely uncultured and underexplored bacterial and archaeal phyla. 
Additionally, we present long-read metagenomic sequencing combined with metatranscriptomic analysis as a potential method for gaining access to the largely underutilized genetic reservoir of specialized metabolite gene clusters in the majority of microbes that are not cultured. The combination of long-read metagenomic and metatranscriptomic analyses is significant because it can more accurately assess the mechanisms of microbial adaptation to the environment through BGC expression based on metatranscriptomic data.
Affiliation(s)
- Ranran Huang
- Institute of Marine Science and Technology, Shandong University, Qingdao, Shandong, China
- Yafei Wang
- Institute of Marine Science and Technology, Shandong University, Qingdao, Shandong, China
- Daixi Liu
- School of Pharmaceutical Sciences, Shandong University, Jinan, Shandong, China
- Shaoyu Wang
- Institute of Marine Science and Technology, Shandong University, Qingdao, Shandong, China
- Haibo Lv
- Institute of Marine Science and Technology, Shandong University, Qingdao, Shandong, China
- Zhen Yan
- Shandong Key Laboratory of Water Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Shandong University, Qingdao, Shandong, China
- Suzhou Research Institute, Shandong University, Suzhou, Jiangsu, China
8
Orellana LH, Krüger K, Sidhu C, Amann R. Comparing genomes recovered from time-series metagenomes using long- and short-read sequencing technologies. Microbiome 2023; 11:105. [PMID: 37179340 PMCID: PMC10182627 DOI: 10.1186/s40168-023-01557-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/08/2022] [Accepted: 04/26/2023] [Indexed: 05/15/2023]
Abstract
BACKGROUND Over the past years, sequencing technologies have expanded our ability to examine novel microbial metabolisms and diversity previously obscured by isolation approaches. Long-read sequencing promises to revolutionize the metagenomic field and recover less fragmented genomes from environmental samples. Nonetheless, how to best benefit from long-read sequencing, and whether it can recover genomes with characteristics similar to those from short-read approaches, remains unclear. RESULTS We recovered metagenome-assembled genomes (MAGs) from the free-living fraction at four time points during a spring bloom in the North Sea. The taxonomic composition of all MAGs recovered was comparable between technologies. However, differences consisted of higher sequencing depth for contigs and higher genome population diversity in short-read compared to long-read metagenomes. When pairing population genomes recovered from both sequencing approaches that shared ≥ 99% average nucleotide identity, long-read MAGs were composed of fewer contigs, a higher N50, and a higher number of predicted genes when compared to short-read MAGs. Moreover, 88% of the total long-read MAGs carried a 16S rRNA gene compared to only 23% of MAGs recovered from short-read metagenomes. Relative abundances for population genomes recovered using both technologies were similar, although disagreements were observed for high and low GC content MAGs. CONCLUSIONS Our results highlight that short-read technologies recovered more MAGs and a higher number of species than long-read due to an overall higher sequencing depth. Long-read samples produced higher quality MAGs and similar species composition compared to short-read sequencing. Differences in the GC content recovered by each sequencing technology resulted in divergences in the diversity recovered and relative abundance of MAGs within the GC content boundaries.
Affiliation(s)
- Luis H Orellana
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstraße 1, Bremen, 28359, Germany.
- Karen Krüger
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstraße 1, Bremen, 28359, Germany
- Chandni Sidhu
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstraße 1, Bremen, 28359, Germany
- Rudolf Amann
- Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Celsiusstraße 1, Bremen, 28359, Germany
9
Tessler M, Cunningham SW, Ingala MR, Warring SD, Brugler MR. An Environmental DNA Primer for Microbial and Restoration Ecology. Microbial Ecology 2023; 85:796-808. [PMID: 36735064 DOI: 10.1007/s00248-022-02168-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/06/2022] [Accepted: 12/28/2022] [Indexed: 05/04/2023]
Abstract
Environmental DNA (eDNA) sequencing (of DNA collected from the environment, whether from living cells or shed DNA) was first developed for working with microbes and has greatly benefitted microbial ecologists for decades since. These tools have only become increasingly powerful with the advent of metabarcoding and metagenomics. Most new studies that examine diverse assemblages of bacteria, archaea, protists, fungi, and viruses lean heavily into eDNA using these newer technologies, as the necessary sequencing technology and bioinformatic tools have become increasingly affordable and user friendly. However, eDNA methods are rapidly evolving, and sometimes it can feel overwhelming to simply keep up with the basics. In this review, we provide a starting point for microbial ecologists who are new to DNA-based methods by detailing the eDNA methods that are most pertinent, including study design, sample collection and storage, selecting the right sequencing technology, lab protocols, equipment, and a few bioinformatic tools. Furthermore, we focus on how eDNA work can benefit restoration and what modifications are needed when working in this subfield.
Affiliation(s)
- Michael Tessler
- Department of Biology, St. Francis College, Brooklyn, NY, USA.
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY, 10024, USA.
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, 10024, USA.
- Seth W Cunningham
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY, 10024, USA
- Department of Biological Sciences, Fordham University, Bronx, NY, 10458, USA
- Melissa R Ingala
- Department of Biological Sciences, Fairleigh Dickinson University, Madison, NJ, 07940, USA
- Mercer R Brugler
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, 10024, USA
- Department of Natural Sciences, University of South Carolina Beaufort, 801 Carteret Street, Beaufort, SC, 29902, USA
10
Haro-Moreno JM, Cabello-Yeves PJ, Garcillán-Barcia MP, Zakharenko A, Zemskaya TI, Rodriguez-Valera F. A novel and diverse group of Candidatus Patescibacteria from bathypelagic Lake Baikal revealed through long-read metagenomics. Environmental Microbiome 2023; 18:12. [PMID: 36823661 PMCID: PMC9948471 DOI: 10.1186/s40793-023-00473-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND Lake Baikal, the world's deepest freshwater lake, contains significant numbers of Candidatus Patescibacteria (formerly CPR) in its deepest reaches. However, previously obtained CPR metagenome-assembled genomes recruited very poorly, indicating that other groups may be present. Here, we have applied for the first time a long-read (PacBio CCS) metagenomic approach to analyze in depth the Ca. Patescibacteria living in the bathypelagic water column of Lake Baikal at 1600 m. RESULTS The retrieval of nearly complete 16S rRNA genes before assembly has allowed us to detect the presence of a novel and likely endemic group of Ca. Patescibacteria inhabiting bathypelagic Lake Baikal. This novel group seems to possess extremely high intra-clade diversity, precluding the assembly of complete genomes. However, read binning and scaffolding indicate that these microbes are similar to other Ca. Patescibacteria (i.e. parasites or symbionts), although they seem to carry more anabolic pathways, likely reflecting the extremely oligotrophic habitat they inhabit. The novel bins have not been found anywhere else, but one of the groups appears in small amounts in the oligotrophic and deep alpine Lake Thun. We propose this novel group be named Baikalibacteria. CONCLUSION The recovery of 16S rRNA genes via long-read metagenomics plus the use of long-read binning to uncover highly diverse "hidden" groups of prokaryotes are key strategies to move forward in ecogenomic microbiology. The novel group possesses enormous intraclade diversity akin to what happens with Ca. Patescibacteria at the interclade level, which is remarkable in an environment that has changed little in the last 25 million years.
Affiliation(s)
- Jose M Haro-Moreno
- Evolutionary Genomics Group, Departamento Producción Vegetal y Microbiología, Universidad Miguel Hernández, Apartado 18, San Juan de Alicante, 03550, Alicante, Spain
- Pedro J Cabello-Yeves
- Cavanilles Institute of Biodiversity and Evolutionary Biology, University of Valencia, 46980, Paterna, Valencia, Spain
- M Pilar Garcillán-Barcia
- Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), Universidad de Cantabria-Consejo Superior de Investigaciones Científicas, Santander, Spain
- Alexandra Zakharenko
- Limnological Institute, Siberian Branch of the Russian Academy of Sciences, Irkutsk, Russia
- Tamara I Zemskaya
- Limnological Institute, Siberian Branch of the Russian Academy of Sciences, Irkutsk, Russia
- Francisco Rodriguez-Valera
- Evolutionary Genomics Group, Departamento Producción Vegetal y Microbiología, Universidad Miguel Hernández, Apartado 18, San Juan de Alicante, 03550, Alicante, Spain.
11
Gattoni G, de la Haba RR, Martín J, Reyes F, Sánchez-Porro C, Feola A, Zuchegna C, Guerrero-Flores S, Varcamonti M, Ricca E, Selem-Mojica N, Ventosa A, Corral P. Genomic study and lipidomic bioassay of Leeuwenhoekiella parthenopeia: A novel rare biosphere marine bacterium that inhibits tumor cell viability. Front Microbiol 2023; 13:1090197. [PMID: 36687661 PMCID: PMC9859067 DOI: 10.3389/fmicb.2022.1090197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Accepted: 12/09/2022] [Indexed: 01/09/2023] Open
Abstract
The fraction of low-abundance microbiota in the marine environment is a promising target for discovering new bioactive molecules with pharmaceutical applications. Phenomena in the ocean such as diel vertical migration (DVM) and seasonal dynamic events influence the pattern of diversity of marine bacteria, conditioning the probability of isolation of uncultured bacteria. In this study, we report a new marine bacterium belonging to the rare biosphere, Leeuwenhoekiella parthenopeia sp. nov. Mr9T, which was isolated employing seasonal and diel sampling approaches. Its complete characterization, ecology, biosynthetic gene profiling of the whole genus Leeuwenhoekiella, and bioactivity of its extract on human cells are reported. The phylogenomic and microbial diversity studies demonstrated that this bacterium is a new and rare species, representing barely 0.0029% of the bacterial community in Mediterranean Sea metagenomes. The biosynthetic profiling of species of the genus Leeuwenhoekiella showed nine functionally related gene cluster families (GCF), none of which was associated with pathways responsible for producing known compounds or registered patents, therefore revealing its potential to synthesize novel bioactive compounds. In vitro screenings of L. parthenopeia Mr9T showed that the total lipid content (lipidome) of the cell membrane reduces prostatic and brain tumor cell viability with a lower effect on normal cells. The lipidome consisted of sulfobacin A, WB 3559A, WB 3559B, docosenamide, topostin B-567, and unknown compounds. Therefore, the bioactivity could be attributed to any of these individual compounds or to their synergistic effect. Beyond the rarity and biosynthetic potential of this bacterium, the importance and novelty of this study lie in the use of sampling strategies based on ecological factors to reach the hidden microbiota, as well as the use of bacterial membrane constituents as potential novel therapeutics.
Our findings open new perspectives on cultivation and the relationship between bacterial biological membrane components and their bioactivity in eukaryotic cells, encouraging similar studies in other members of the rare biosphere.
Affiliation(s)
- Giuliano Gattoni
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Rafael R. de la Haba
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Sevilla, Spain
- Cristina Sánchez-Porro
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Sevilla, Spain
- Antonia Feola
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Candida Zuchegna
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Shaday Guerrero-Flores
  - Centro de Ciencias Matemáticas, Universidad Nacional Autónoma de México (UNAM), Morelia, Mexico
- Mario Varcamonti
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Ezio Ricca
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Nelly Selem-Mojica
  - Centro de Ciencias Matemáticas, Universidad Nacional Autónoma de México (UNAM), Morelia, Mexico
- Antonio Ventosa
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Sevilla, Spain
- Paulina Corral
  - Department of Biology, University of Naples Federico II, Naples, Italy
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Sevilla, Spain
  - *Correspondence: Paulina Corral,
|
12
|
Zhang QY, Ke F, Gui L, Zhao Z. Recent insights into aquatic viruses: Emerging and reemerging pathogens, molecular features, biological effects, and novel investigative approaches. Water Biology and Security 2022; 1:100062. [DOI: 10.1016/j.watbs.2022.100062] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 01/05/2025]
|
13
|
Abstract
The recovery of DNA from viromes is a major obstacle in the use of long-read sequencing to study their genomes. For this reason, the use of cellular metagenomes (>0.2-μm size range) emerges as an interesting complementary tool, since they contain large amounts of naturally amplified viral genomes from prelytic replication. We have applied second-generation (Illumina NextSeq; short reads) and third-generation (PacBio Sequel II; long reads) sequencing to compare the diversity and features of the viral community in a marine sample obtained from offshore waters of the western Mediterranean. We found that a major wedge of the expected marine viral diversity was directly recovered by the raw PacBio circular consensus sequencing (CCS) reads. More than 30,000 sequences were detected only in this data set, with no homologues in the long- and short-read assembly, and ca. 26,000 had no homologues in the large data set of the Global Ocean Virome 2 (GOV2), highlighting the information gap created by the assembly bias. At the level of complete viral genomes, the performance was similar in both approaches. However, the hybrid long- and short-read assembly provided the longest average length of the sequences and improved the host assignment. Although no novel major clades of viruses were found, there was an increase in the intraclade genomic diversity recovered by long reads that produced an enriched assessment of the real diversity and allowed the discovery of novel genes with biotechnological potential (e.g., endolysin genes). IMPORTANCE We explored the vast genetic diversity of environmental viruses by using a combination of cellular metagenome (as opposed to virome) sequencing using high-fidelity long-read sequences (in this case, PacBio CCS). This approach resulted in the recovery of a representative sample of the viral population, and it performed better (more phage contigs, larger average contig size) than Illumina sequencing applied to the same sample. 
With this approach, the many biases of assembly are avoided, as the CCS reads (typically around 5 kb) recover complete genes and even operons, resulting in better discovery of the viral gene diversity based on viral marker proteins. Thus, biotechnologically promising genes, such as endolysin genes, can be searched for very efficiently with this approach. In addition, hybrid assembly produces more complete and longer contigs, which is particularly important for studying little-known viral groups such as the nucleocytoplasmic large DNA viruses (NCLDV).
|
14
|
Mageeney CM, Trubl G, Williams KP. Improved Mobilome Delineation in Fragmented Genomes. Frontiers in Bioinformatics 2022; 2:866850. [PMID: 36304297 PMCID: PMC9580842 DOI: 10.3389/fbinf.2022.866850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Accepted: 03/17/2022] [Indexed: 11/26/2022] Open
Abstract
The mobilome of a microbe, i.e., its set of mobile elements, has major effects on its ecology, and is important to delineate properly in each genome. This becomes more challenging for incomplete genomes, and even more so for metagenome-assembled genomes (MAGs), where misbinning of scaffolds and other losses can occur. Genomic islands (GIs), which integrate into the host chromosome, are a major component of the mobilome. Our GI-detection software TIGER, unique in its precise mapping of GI termini, was applied to 74,561 genomes from 2,473 microbial species, each species containing at least one MAG and one isolate genome. A species-normalized deficit of ∼1.6 GIs/genome was measured for MAGs relative to isolates. To test whether this undercount was due to the higher fragmentation of MAG genomes, TIGER was updated to enable detection of split GIs whose termini are on separate scaffolds or that wrap around the origin of a circular replicon. This doubled GI yields, and the new split GIs matched the quality of single-scaffold GIs, except that highly fragmented GIs may lack central portions. Cross-scaffold search is an important upgrade to GI detection as fragmented genomes increasingly dominate public databases. TIGER2 better captures MAG microdiversity, recovering niche-defining GIs and supporting microbiome research aims such as virus-host linking and ecological assessment.
Affiliation(s)
- Catherine M. Mageeney
  - Systems Biology Department, Sandia National Laboratories, Livermore, CA, United States
- Gareth Trubl
  - Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Kelly P. Williams
  - Systems Biology Department, Sandia National Laboratories, Livermore, CA, United States
  - *Correspondence: Kelly P. Williams,
|
15
|
Long-read metagenomics of soil communities reveals phylum-specific secondary metabolite dynamics. Commun Biol 2021; 4:1302. [PMID: 34795375 PMCID: PMC8602731 DOI: 10.1038/s42003-021-02809-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/29/2021] [Accepted: 10/25/2021] [Indexed: 01/04/2023] Open
Abstract
Microbial biosynthetic gene clusters (BGCs) encoding secondary metabolites are thought to impact a plethora of biologically mediated environmental processes, yet their discovery and functional characterization in natural microbiomes remain challenging. Here we describe deep long-read sequencing and assembly of metagenomes from biological soil crusts, a group of soil communities that are rich in BGCs. Taking advantage of the unusually long assemblies produced by this approach, we recovered nearly 3,000 BGCs for analysis, including 712 full-length BGCs. Functional exploration through metatranscriptome analysis of a 3-day wetting experiment uncovered phylum-specific BGC expression upon activation from dormancy, elucidating distinct roles and complex phylogenetic and temporal dynamics in wetting processes. For example, a pronounced increase in BGC transcription occurs at night primarily in cyanobacteria, implicating BGCs in nutrient scavenging roles and niche competition. Taken together, our results demonstrate that long-read metagenomic sequencing combined with metatranscriptomic analysis provides a direct view into the functional dynamics of BGCs in environmental processes and suggest a central role of secondary metabolites in maintaining phylogenetically conserved niches within biocrusts.
|
16
|
Phylogenomics of SAR116 Clade Reveals Two Subclades with Different Evolutionary Trajectories and an Important Role in the Ocean Sulfur Cycle. mSystems 2021; 6:e0094421. [PMID: 34609172 PMCID: PMC8547437 DOI: 10.1128/msystems.00944-21] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/02/2023] Open
Abstract
The SAR116 clade within the class Alphaproteobacteria represents one of the most abundant groups of heterotrophic bacteria inhabiting the surface of the ocean. The small number of cultured representatives of SAR116 (only two to date) is a major bottleneck that has prevented an in-depth study at the genomic level to understand the relationship between genome diversity and its role in the marine environment. In this study, we use all publicly available genomes to provide a genomic overview of the phylogeny, metabolism, and biogeography within the SAR116 clade. This increased genomic diversity has led to the discovery of two subclades that, despite coexisting in the same environment, display different properties in their genomic makeup. One represents a novel subclade for which no pure cultures have been isolated and is composed mainly of single-amplified genomes (SAGs). Genomes within this subclade showed convergent evolutionary trajectories with more streamlined features, such as low GC content (ca. 30%), short intergenic spacers (<22 bp), and strong purifying selection (low ratio of nonsynonymous to synonymous polymorphisms [dN/dS]). Besides, they were more abundant in metagenomic databases recruiting at the deep chlorophyll maximum. Less abundant and restricted to the upper photic layers of the global ocean, the other subclade of SAR116, enriched in metagenome-assembled genomes (MAGs), included the only two pure cultures. Genomic analysis suggested that both clades have a significant role in the sulfur cycle with differences in the way both clades can metabolize dimethylsulfoniopropionate (DMSP). IMPORTANCE The SAR116 clade of Alphaproteobacteria is a ubiquitous group of heterotrophic bacteria inhabiting the surface of the ocean, but the information about their ecology and population genomic diversity is scarce due to the difficulty of getting pure culture isolates. 
The combination of single-cell genomics and metagenomics has become an alternative approach to study these kinds of microbes. Our results expand the understanding of the genomic diversity, distribution, and lifestyles within this clade and provide evidence of different evolutionary trajectories in the genomic makeup of the two subclades that could serve to illustrate how evolutionary pressure can drive different adaptations to the same environment. Therefore, the SAR116 clade represents an ideal model organism for the study of the evolutionary streamlining of genomes in microbes that have relatively close relatedness to each other.
|