1
|
Holman LE, Zampirolo G, Gyllencreutz R, Scourse J, Frøslev T, Carøe C, Gopalakrishnan S, Pedersen MW, Bohmann K. Navigating Past Oceans: Comparing Metabarcoding and Metagenomics of Marine Ancient Sediment Environmental DNA. Mol Ecol Resour 2025:e14086. [PMID: 39980208 DOI: 10.1111/1755-0998.14086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 01/29/2025] [Accepted: 01/31/2025] [Indexed: 02/22/2025]
Abstract
The condition of ancient marine ecosystems provides context for contemporary biodiversity changes in human-impacted oceans. Sequencing sedimentary ancient DNA (sedaDNA) is an emerging method for generating high-resolution biodiversity time-series data, offering insights into past ecosystems. However, few studies directly compare the two predominant sedaDNA sequencing approaches, metabarcoding and shotgun metagenomics, and it remains unclear whether these methodological differences affect diversity metrics. We compared these methods using sedaDNA from an archived marine sediment record sampled in the Skagerrak, North Sea, spanning almost 8000 years. We performed metabarcoding of a eukaryotic 18S rRNA region (V9) and sequenced 153-229 million metagenomic reads per sample. Our results show limited overlap between metabarcoding and metagenomics, with only three metazoan genera detected by both methods. For overlapping taxa, metabarcoding detections became inconsistent for samples older than 2000 years, while metagenomics detected taxa throughout the time series. We observed divergent patterns of alpha diversity, with metagenomics indicating decreased richness towards the present and metabarcoding showing an increase. However, beta diversity patterns were similar between methods, with discrepancies only in metazoan data comparisons. Our findings demonstrate that the choice of sequencing method significantly impacts detected biodiversity in an ancient marine sediment record. While we stress that studies with limited variation in DNA degradation among samples may not be strongly affected, researchers should rule out methodological explanations for observed biodiversity changes in marine sediment cores, particularly when considering alpha diversity, before making ecological interpretations.
Affiliation(s)
- Luke E Holman
- Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Giulia Zampirolo
- Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Richard Gyllencreutz
- Department of Geological Sciences, Stockholm University, Stockholm, Sweden
- Bolin Centre for Climate Research, Stockholm University, Stockholm, Sweden
| | - James Scourse
- Centre for Geography and Environmental Science, University of Exeter, Exeter, UK
| | - Tobias Frøslev
- Centre for Ancient Environmental Genomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Global Biodiversity Information Facility, Copenhagen, Denmark
| | | | - Shyam Gopalakrishnan
- Centre for Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | | | - Kristine Bohmann
- Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| |
|
2
|
Ribeyre Z, Depardieu C, Prunier J, Pelletier G, Parent GJ, Mackay J, Droit A, Bousquet J, Nolet P, Messier C. De novo transcriptome assembly and discovery of drought-responsive genes in white spruce (Picea glauca). PLoS One 2025; 20:e0316661. [PMID: 39752431 PMCID: PMC11698436 DOI: 10.1371/journal.pone.0316661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Accepted: 12/13/2024] [Indexed: 01/06/2025] Open
Abstract
Forests face an escalating threat from the increasing frequency of extreme drought events driven by climate change. To address this challenge, it is crucial to understand how widely distributed species of economic or ecological importance may respond to drought stress. In this study, we examined the transcriptome of white spruce (Picea glauca (Moench) Voss) to identify key genes and metabolic pathways involved in the species' response to water stress. We assembled a de novo transcriptome, performed differential gene expression analyses at four time points over 22 days during a controlled drought stress experiment involving 2-year-old plants and three genetically distinct clones, and conducted gene enrichment analyses. The transcriptome assembly and gene expression analysis identified a total of 33,287 transcripts corresponding to 18,934 annotated unique genes, including 4,425 genes that are uniquely responsive to drought. Many transcripts that had predicted functions associated with photosynthesis, cell wall organization, and water transport were down-regulated under drought conditions, while transcripts linked to abscisic acid response and defense response were up-regulated. Our study highlights a previously uncharacterized effect of drought stress on lipid metabolism genes in conifers and significant changes in the expression of several transcription factors, suggesting a regulatory response potentially linked to drought response or acclimation. Our research represents a fundamental step in unraveling the molecular mechanisms underlying short-term drought responses in white spruce seedlings. In addition, it provides a valuable source of new genetic data that could contribute to genetic selection strategies aimed at enhancing the drought resistance and resilience of white spruce to changing climates.
Affiliation(s)
- Zoé Ribeyre
- Département des Sciences Naturelles, Institut des Sciences de la Forêt Tempérée (ISFORT), Université du Québec en Outaouais (UQO), Ripon, Canada
- Centre d’étude de la Forêt (CEF), Québec, QC, Canada
| | - Claire Depardieu
- Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Centre for Forest Research, Département des Sciences du Bois et de la Forêt, Université Laval, Québec, QC, Canada
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Center, Québec, QC, Canada
| | - Julien Prunier
- Plateforme de Bioinformatique du Centre Hospitalier Universitaire de Québec Associé à l’Université Laval, Québec, QC, Canada
| | - Gervais Pelletier
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Center, Québec, QC, Canada
| | - Geneviève J. Parent
- Laboratory of Genomics, Maurice-Lamontagne Institute, Fisheries and Oceans Canada, Mont-Joli, QC, Canada
| | - John Mackay
- Department of Plant Sciences, University of Oxford, Oxford, United Kingdom
| | - Arnaud Droit
- Plateforme de Bioinformatique du Centre Hospitalier Universitaire de Québec Associé à l’Université Laval, Québec, QC, Canada
| | - Jean Bousquet
- Canada Research Chair in Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Québec, QC, Canada
- Centre for Forest Research, Département des Sciences du Bois et de la Forêt, Université Laval, Québec, QC, Canada
| | - Philippe Nolet
- Département des Sciences Naturelles, Institut des Sciences de la Forêt Tempérée (ISFORT), Université du Québec en Outaouais (UQO), Ripon, Canada
- Centre d’étude de la Forêt (CEF), Québec, QC, Canada
| | - Christian Messier
- Département des Sciences Naturelles, Institut des Sciences de la Forêt Tempérée (ISFORT), Université du Québec en Outaouais (UQO), Ripon, Canada
- Centre d’étude de la Forêt (CEF), Québec, QC, Canada
- Département des Sciences Biologiques, Université du Québec à Montréal (UQAM), Montréal, QC, Canada
| |
|
3
|
Li H. BWT construction and search at the terabase scale. Bioinformatics 2024; 40:btae717. [PMID: 39607778 PMCID: PMC11646566 DOI: 10.1093/bioinformatics/btae717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2024] [Revised: 11/06/2024] [Accepted: 11/26/2024] [Indexed: 11/30/2024] Open
Abstract
MOTIVATION: The Burrows-Wheeler Transform (BWT) is a common component in full-text indices. Initially developed for data compression, it is particularly powerful for encoding redundant sequences such as pangenome data. However, BWT construction is resource intensive and hard to parallelize, and many methods for querying large full-text indices only report exact matches or their simple extensions. These limitations have hampered the biological applications of full-text indices. RESULTS: We developed ropebwt3 for efficient BWT construction and query. Ropebwt3 indexed 320 assembled human genomes in 65 h and indexed 7.3 terabases of commonly studied bacterial assemblies in 26 days. This was achieved using up to 170 gigabytes of memory at the peak without working disk space. Ropebwt3 can find maximal exact matches and inexact alignments under affine-gap penalties, and can retrieve similar local haplotypes matching a query sequence. It demonstrates the feasibility of full-text indexing at the terabase scale. AVAILABILITY AND IMPLEMENTATION: https://github.com/lh3/ropebwt3.
Affiliation(s)
- Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, United States
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, United States
| |
|
4
|
Zampirolo G, Holman LE, Sawafuji R, Ptáková M, Kovačiková L, Šída P, Pokorný P, Pedersen MW, Walls M. Tracing early pastoralism in Central Europe using sedimentary ancient DNA. Curr Biol 2024; 34:4650-4661.e4. [PMID: 39305897 DOI: 10.1016/j.cub.2024.08.047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 05/22/2024] [Accepted: 08/28/2024] [Indexed: 10/25/2024]
Abstract
Central European forests have been shaped by complex human interactions throughout the Holocene, with significant changes following the introduction of domesticated animals in the Neolithic (∼7.5-6.0 ka before present [BP]). However, understanding early pastoral practices and their impact on forests is limited by methods for detecting animal movement across past landscapes. Here, we examine ancient sedimentary DNA (sedaDNA) preserved at the Velký Mamuťák rock shelter in northern Bohemia (Czech Republic), which has been a forested enclave since the early Holocene. We find that domesticated animals, their associated microbiomes, and plants potentially gathered for fodder have clear representation by the Late Neolithic, around 6.0 ka BP, and persist throughout the Bronze Age into recent times. We identify a change in dominant grazing species from sheep to pigs in the Bronze Age (∼4.1-3.0 ka BP) and interpret the impact this had in the mid-Holocene retrogressions that still define the structure of Central European forests today. This study highlights the ability of ancient metagenomics to bridge archaeological and paleoecological methods and provide an enhanced perspective on the roots of the "Anthropocene."
Affiliation(s)
- Giulia Zampirolo
- Section for Molecular Ecology and Evolution, Faculty of Health and Medical Sciences, Globe Institute, University of Copenhagen, Øster Farimagsgade 5, 1353 Copenhagen, Denmark
| | - Luke E Holman
- Section for Molecular Ecology and Evolution, Faculty of Health and Medical Sciences, Globe Institute, University of Copenhagen, Øster Farimagsgade 5, 1353 Copenhagen, Denmark; School of Ocean and Earth Science, National Oceanography Centre, Southampton, University of Southampton, European Way, Southampton SO14 3ZH, UK
| | - Rikai Sawafuji
- Centre for Ancient Environmental Genomics, Faculty of Health and Medical Sciences, Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark; Research Center for Integrative Evolutionary Science, The Graduate University for Advanced Studies (SOKENDAI), Hayama 240-0193, Kanagawa, Japan
| | - Michaela Ptáková
- Laboratory of Archaeobotany and Palaeoecology, Faculty of Science, University of South Bohemia, Na Zlaté stoce 3, 370 05 České Budějovice, Czech Republic
| | - Lenka Kovačiková
- Laboratory of Archaeobotany and Palaeoecology, Faculty of Science, University of South Bohemia, Na Zlaté stoce 3, 370 05 České Budějovice, Czech Republic
| | - Petr Šída
- Philosophical Faculty, University of Hradec Králové, nám. Svobody 331/2, 500 02 Hradec Králové, Czech Republic
| | - Petr Pokorný
- Center for Theoretical Study, Charles University and Czech Academy of Sciences, Ovocný trh 5, 116 36 Prague, Czech Republic
| | - Mikkel Winther Pedersen
- Centre for Ancient Environmental Genomics, Faculty of Health and Medical Sciences, Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
| | - Matthew Walls
- Center for Theoretical Study, Charles University and Czech Academy of Sciences, Ovocný trh 5, 116 36 Prague, Czech Republic; Department of Anthropology and Archaeology, Faculty of Arts, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 4V8, Canada.
| |
|
5
|
Veselovsky E, Lebedeva A, Kuznetsova O, Kravchuk D, Belova E, Taraskina A, Grigoreva T, Kavun A, Yudina V, Belyaeva L, Nikulin V, Mileyko V, Tryakin A, Fedyanin M, Ivanov M. Evaluation of blood MSI burden dynamics to trace immune checkpoint inhibitor therapy efficacy through the course of treatment. Sci Rep 2024; 14:23454. [PMID: 39379462 PMCID: PMC11461614 DOI: 10.1038/s41598-024-73952-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 09/23/2024] [Indexed: 10/10/2024] Open
Abstract
Analysis of serial liquid biopsy (LB) samples is a promising approach for monitoring tumor dynamics during therapy in patients with colorectal cancer (CRC). Currently, somatic mutations are used to trace tumor dynamics via LB. However, dynamic changes in molecular signatures such as microsatellite instability (MSI) are not currently analyzed. We hypothesized that changes in blood MSI burden (bMSI) could be registered using serial LB sampling during immune checkpoint inhibitor (ICI) therapy, and that these changes could potentially correlate with treatment outcomes. We report the preliminary findings of the observational trial (NCT06414304) launched to study the dynamics of bMSI in 9 MSI-positive CRC patients receiving ICI. NGS-based MSI testing was performed on both pre-treatment FFPE and serial LB samples. For patients who had detectable bMSI burden in any of the LB samples (n = 8, 89%), median bMSI was 1.4% (range, 0.01-40%). Among patients with detectable MSI in available FFPE samples, median MSI burden was 29.3% (range, 10-40%). MSI burden detected in baseline LB and FFPE samples was positively correlated (Pearson's R = 0.47). Maximal variant allele frequencies of driver mutations observed in LB were also positively correlated with bMSI burden (Pearson's R = 0.7). Patients who had clinical benefit had undetectable bMSI burden at follow-up. Our results provide the rationale for further validation of bMSI as a predictive biomarker of ICI efficacy in MSI-positive patients.
Affiliation(s)
- Egor Veselovsky
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Department of Evolutionary Genetics of Development, Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russian Federation
| | - Alexandra Lebedeva
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Sechenov First Moscow State Medical University, Moscow, Russian Federation
| | - Olesya Kuznetsova
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Federal State Budgetary Institution N.N. Blokhin National Medical Research Center of Oncology, Moscow, Russian Federation
| | - Daria Kravchuk
- State Budgetary Institution of Health Care of the City of Moscow "Moscow Multidisciplinary Clinical Center" "Kommunarka" of the Department of Health of the City of Moscow, Moscow, Russian Federation
| | - Ekaterina Belova
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Sechenov First Moscow State Medical University, Moscow, Russian Federation
- Lomonosov Moscow State University, Moscow, Russian Federation
| | | | - Tatiana Grigoreva
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Sechenov First Moscow State Medical University, Moscow, Russian Federation
| | - Alexandra Kavun
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
| | - Victoria Yudina
- Federal State Budgetary Institution N.N. Blokhin National Medical Research Center of Oncology, Moscow, Russian Federation
| | - Laima Belyaeva
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Sechenov First Moscow State Medical University, Moscow, Russian Federation
| | - Vladislav Nikulin
- Federal State Budgetary Institution N.N. Blokhin National Medical Research Center of Oncology, Moscow, Russian Federation
| | - Vladislav Mileyko
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049
- Sechenov First Moscow State Medical University, Moscow, Russian Federation
| | - Alexey Tryakin
- Federal State Budgetary Institution N.N. Blokhin National Medical Research Center of Oncology, Moscow, Russian Federation
| | - Mikhail Fedyanin
- Federal State Budgetary Institution N.N. Blokhin National Medical Research Center of Oncology, Moscow, Russian Federation
- State Budgetary Institution of Health Care of the City of Moscow "Moscow Multidisciplinary Clinical Center" "Kommunarka" of the Department of Health of the City of Moscow, Moscow, Russian Federation
- Federal State Budgetary Institution "National Medical and Surgical Center named after N.I. Pirogov" of the Ministry of Health of the Russian Federation, Moscow, Russian Federation
| | - Maxim Ivanov
- OncoAtlas LLC, 4/1A, Leninskiy Prospect, Moscow, Russian Federation, 119049.
- Sechenov First Moscow State Medical University, Moscow, Russian Federation.
| |
|
6
|
Vogel NA, Rubin JD, Pedersen AG, Sackett PW, Pedersen MW, Renaud G. soibean: High-Resolution Taxonomic Identification of Ancient Environmental DNA Using Mitochondrial Pangenome Graphs. Mol Biol Evol 2024; 41:msae203. [PMID: 39361595 PMCID: PMC11488136 DOI: 10.1093/molbev/msae203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 08/05/2024] [Accepted: 09/27/2024] [Indexed: 10/05/2024] Open
Abstract
Ancient environmental DNA (aeDNA) is becoming a powerful tool for gaining insights into past ecosystems, overcoming the limitations of conventional fossil records. However, several methodological challenges remain, particularly for classifying the DNA to species level and conducting phylogenetic analysis. Current methods, primarily tailored for modern datasets, fail to capture several idiosyncrasies of aeDNA, including mixtures of closely related species and ancestral divergence. We introduce soibean, a novel tool that utilizes mitochondrial pangenomic graphs for identifying species from aeDNA reads. It outperforms existing methods in accurately identifying species from multiple closely related sources within a sample, enhancing phylogenetic analysis for aeDNA. soibean employs a damage-aware likelihood model for precise identification at low coverage with a high damage rate. Additionally, we reconstructed ancestral sequences for soibean's database to handle aeDNA that is highly diverged from modern references. soibean demonstrates effectiveness through simulated data tests and empirical validation. Notably, our method uncovered new empirical results in published datasets, including the use of porpoise whales as food in a Mesolithic community in Sweden, demonstrating its potential to reveal previously unrecognized findings in aeDNA studies.
Affiliation(s)
- Nicola Alexandra Vogel
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Joshua Daniel Rubin
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Anders Gorm Pedersen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Peter Wad Sackett
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Mikkel Winther Pedersen
- Centre For Ancient Environmental Genomics, Globe Institute, University of Copenhagen, Copenhagen K, Denmark
| | - Gabriel Renaud
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark
| |
|
7
|
Moreira TG, Cox LM, Da Silva P, Mangani D, De Oliveira MG, Escobar G, Lanser TB, Murphy L, Lobo ELC, Milstein O, Gauthier CD, Clara Guimarāes A, Schwerdtfeger L, Ekwudo MN, Wasén C, Liu S, Menezes GB, Ferreira E, Gabriely G, Anderson AC, Faria AMC, Rezende RM, Weiner HL. Dietary protein modulates intestinal dendritic cells to establish mucosal homeostasis. Mucosal Immunol 2024; 17:911-922. [PMID: 38925529 DOI: 10.1016/j.mucimm.2024.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 06/11/2024] [Accepted: 06/16/2024] [Indexed: 06/28/2024]
Abstract
Dietary proteins are taken up by intestinal dendritic cells (DCs), cleaved into peptides, loaded onto major histocompatibility complexes, and presented to T cells to generate an immune response. Amino acid (AA) diets do not have the same effects because AAs cannot bind to major histocompatibility complexes to activate T cells. Here, we show that impairment in regulatory T cell generation and loss of tolerance in mice fed a diet lacking whole protein is associated with major transcriptional changes in intestinal DCs, including downregulation of genes related to DC maturation and activation and decreased gene expression of immune checkpoint molecules. Moreover, the AA diet had a profound effect on microbiome composition, including an increase in Akkermansia muciniphila and Oscillibacter and a decrease in Lactococcus lactis and Bifidobacterium. Although microbiome transfer experiments showed that the AA-driven microbiome modulates intestinal DC gene expression, most of the unique transcriptional change in DCs was linked to the absence of whole protein in the diet. Our findings highlight the importance of dietary proteins for intestinal DC function and mucosal tolerance.
Affiliation(s)
- Thais G Moreira
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
| | - Laura M Cox
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Patrick Da Silva
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Davide Mangani
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Marilia G De Oliveira
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Giulia Escobar
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Toby B Lanser
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Liam Murphy
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Eduardo L C Lobo
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Omer Milstein
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Christian D Gauthier
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Ana Clara Guimarāes
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Luke Schwerdtfeger
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Mellicient N Ekwudo
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Caroline Wasén
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Shirong Liu
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Gustavo B Menezes
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Enio Ferreira
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Galina Gabriely
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Ana C Anderson
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Ana Maria C Faria
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Rafael M Rezende
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Howard L Weiner
- Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| |
|
8
|
Garg V, Bohra A, Mascher M, Spannagl M, Xu X, Bevan MW, Bennetzen JL, Varshney RK. Unlocking plant genetics with telomere-to-telomere genome assemblies. Nat Genet 2024; 56:1788-1799. [PMID: 39048791 DOI: 10.1038/s41588-024-01830-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 06/12/2024] [Indexed: 07/27/2024]
Abstract
Contiguous genome sequence assemblies will help us to realize the full potential of crop translational genomics. Recent advances in sequencing technologies, especially long-read sequencing strategies, have made it possible to construct gapless telomere-to-telomere (T2T) assemblies, thus offering novel insights into genome organization and function. Plant genomes pose unique challenges, such as a continuum of ancient to recent polyploidy and abundant highly similar and long repetitive elements. Owing to progress in sequencing approaches, for most crop plants, chromosome-scale reference genome assemblies are available, but T2T assembly construction remains challenging. Here we describe methods for haplotype-resolved, gapless T2T assembly construction in plants, including various crop species. We outline the impact of T2T assemblies in elucidating the roles of repetitive elements in gene regulation, as well as in pangenomics, functional genomics, genome-assisted breeding and targeted genome manipulation. In conjunction with sequence-enriched germplasm repositories, T2T assemblies thus hold great promise for basic and applied plant sciences.
Affiliation(s)
- Vanika Garg
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
| | - Abhishek Bohra
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- ICAR-Indian Institute of Pulses Research, Kanpur, India
| | - Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Seeland, Germany
| | - Manuel Spannagl
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- Plant Genome and Systems Biology, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, Germany
| | - Xun Xu
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- BGI-Shenzhen, Shenzhen, China
| | | | | | - Rajeev K Varshney
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia.
| |
|
9
|
Lebedeva A, Veselovsky E, Kavun A, Belova E, Grigoreva T, Orlov P, Subbotovskaya A, Shipunov M, Mashkov O, Bilalov F, Shatalov P, Kaprin A, Shegai P, Diuzhev Z, Migiaev O, Vytnova N, Mileyko V, Ivanov M. Untapped Potential of Poly(ADP-Ribose) Polymerase Inhibitors: Lessons Learned From the Real-World Clinical Homologous Recombination Repair Mutation Testing. World J Oncol 2024; 15:562-578. [PMID: 38993246 PMCID: PMC11236374 DOI: 10.14740/wjon1820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 04/29/2024] [Indexed: 07/13/2024] Open
Abstract
Background: Testing for homologous recombination deficiency (HRD) mutations is pivotal to assess individual risk, to initiate preventive measures in healthy carriers and to tailor treatments for cancer patients. The increasing prominence of poly(ADP-ribose) polymerase (PARP) inhibitors, with their remarkable impact on the survival of molecularly selected patients across diverse nosologies, has ingrained testing for BRCA genes and beyond in clinical practice. Nevertheless, testing strategies remain a matter of debate. While several pathogenic BRCA1/2 gene variants have been described as founder pathogenic mutations frequently found in patients from Russia, other homologous recombination repair (HRR) genes have not been sufficiently explored. In this study, we present real-world data of routine HRR gene testing in Russia. Methods: We evaluated clinical and sequencing data from cancer patients who had germline/somatic next-generation sequencing (NGS) HRR gene testing in Russia (BRCA1/2/ATM/CHEK2, or 15 HRR genes). The primary objectives of this study were to evaluate the frequency of BRCA1/2 and non-BRCA gene mutations in real-world unselected patients from Russia, and to determine whether testing beyond BRCA1/2 is feasible. Results: Data of 2,032 patients were collected from February 2021 to February 2023. Most had breast (n = 715, 35.2%), ovarian (n = 259, 12.7%), pancreatic (n = 85, 4.2%), or prostate cancer (n = 58, 2.9%). We observed 586 variants of uncertain significance (VUS) and 372 deleterious variants (DVs) across 487 patients, with 17.6% HRR-mutation positivity. HRR testing identified 120 (11.8%) BRCA1/2-positive, and 172 (16.9%) HRR-positive patients. With 51 DVs identified in 242 formalin-fixed paraffin-embedded (FFPE) samples, testing for variant origin clarification was required in one case (0.4%). Most BRCA1/2 germline variants were DVs (121 DVs, 26 VUS); in non-BRCA1/2 genes, VUS were ubiquitous (53 DVs, 132 VUS). In silico prediction identified an additional 4.9% HRR and 1.2% BRCA1/2/ATM/CHEK2 mutation patients. Conclusions: Our study represents one of the first reports on the incidence of DVs and VUS in HRR genes, including genes beyond BRCA1/2, identified by NGS in cancer patients from Russia. In silico predictions of the observed HRR gene variants suggest that non-BRCA gene testing is likely to result in a higher frequency of patients who are candidates for PARP inhibitor therapy. Continuing sequencing efforts should clarify the interpretation of frequently observed non-BRCA VUS.
Affiliation(s)
- Alexandra Lebedeva
  - OncoAtlas LLC, Moscow, Russia
  - Sechenov First Moscow State Medical University, Moscow, Russia
- Egor Veselovsky
  - OncoAtlas LLC, Moscow, Russia
  - Department of Evolutionary Genetics of Development, Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
- Ekaterina Belova
  - OncoAtlas LLC, Moscow, Russia
  - Sechenov First Moscow State Medical University, Moscow, Russia
  - Lomonosov Moscow State University, Moscow, Russia
- Tatiana Grigoreva
  - OncoAtlas LLC, Moscow, Russia
  - Sechenov First Moscow State Medical University, Moscow, Russia
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
- Pavel Orlov
  - The Federal Research Center for Fundamental and Translational Medicine (NIIECM FRC FTM), Novosibirsk, Russia
- Anna Subbotovskaya
  - The Federal Research Center for Fundamental and Translational Medicine (NIIECM FRC FTM), Novosibirsk, Russia
- Maksim Shipunov
  - The Federal Research Center for Fundamental and Translational Medicine (NIIECM FRC FTM), Novosibirsk, Russia
- Oleg Mashkov
  - State Budgetary Institution of Healthcare Republican Medical Genetic Center, Ufa, Russia
- Fanil Bilalov
  - State Budgetary Institution of Healthcare Republican Medical Genetic Center, Ufa, Russia
- Peter Shatalov
  - National Medical Research Radiological Centre of the Ministry of Health of the Russian Federation, Obninsk, Russia
- Andrey Kaprin
  - National Medical Research Radiological Centre of the Ministry of Health of the Russian Federation, Obninsk, Russia
- Peter Shegai
  - National Medical Research Radiological Centre of the Ministry of Health of the Russian Federation, Obninsk, Russia
- Vladislav Mileyko
  - OncoAtlas LLC, Moscow, Russia
  - Sechenov First Moscow State Medical University, Moscow, Russia
- Maxim Ivanov
  - OncoAtlas LLC, Moscow, Russia
  - Sechenov First Moscow State Medical University, Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
10
Teterina AA, Willis JH, Baer CF, Phillips PC. Pervasive conservation of intron number and other genetic elements revealed by a chromosome-level genomic assembly of the hyper-polymorphic nematode Caenorhabditis brenneri. bioRxiv 2024:2024.06.25.600681. [PMID: 38979286 PMCID: PMC11230420 DOI: 10.1101/2024.06.25.600681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
With within-species genetic diversity estimates that span the gamut of that seen across the entirety of animals, the Caenorhabditis genus of nematodes holds unique potential to provide insights into how population size and reproductive strategies influence gene and genome organization and evolution. Our study focuses on Caenorhabditis brenneri, currently known as one of the most genetically diverse nematodes within its genus and across metazoan phyla. Here, we present a high-quality gapless genome assembly and annotation for C. brenneri, revealing a common nematode chromosome arrangement characterized by gene-dense central regions and repeat-rich peripheral parts. Comparison of C. brenneri with other nematodes from the 'Elegans' group revealed conserved macrosynteny but a lack of microsynteny, characterized by frequent rearrangements and low correlation of orthogroup sizes, indicative of high rates of gene turnover. We also assessed genome organization within corresponding syntenic blocks in selfing and outcrossing species, affirming that selfing species predominantly experience loss of both genes and intergenic DNA. Comparison of gene structures revealed a strikingly small number of shared introns across species, yet consistent distributions of intron number and length, regardless of population size or reproductive mode, suggesting that their evolutionary dynamics primarily reflect functional constraints. Our study provides valuable insights into genome evolution and expands the nematode genome resources with the highly genetically diverse C. brenneri, facilitating research into various aspects of nematode biology and evolutionary processes.
Affiliation(s)
- Anastasia A Teterina
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
  - Center of Parasitology, Severtsov Institute of Ecology and Evolution RAS, Moscow, Russia
- John H Willis
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- Charles F Baer
  - Department of Biology, University of Florida, Gainesville, USA
- Patrick C Phillips
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
11
Mahajan S, Chakraborty A, Bisht MS, Sil T, Sharma VK. Genome sequencing and functional analysis of a multipurpose medicinal herb Tinospora cordifolia (Giloy). Sci Rep 2024; 14:2799. [PMID: 38307917 PMCID: PMC10837142 DOI: 10.1038/s41598-024-53176-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 01/29/2024] [Indexed: 02/04/2024] Open
Abstract
Tinospora cordifolia (Willd.) Hook.f. & Thomson, also known as Giloy, is among the most important medicinal plants, with numerous therapeutic applications in human health owing to its production of a diverse array of secondary metabolites. To gain genomic insights into the medicinal properties of T. cordifolia, genome sequencing was carried out using 10× Genomics linked-read and Nanopore long-read technologies. The draft genome assembly of T. cordifolia comprised 1.01 Gbp and represents a sequenced genome from the plant family Menispermaceae. We also estimated the genome size of T. cordifolia, which was found to be 1.13 Gbp. Deep sequencing of the transcriptome from leaf tissue was also performed. The genome and transcriptome assemblies were used to construct the gene set, resulting in 17,245 coding gene sequences. Further, T. cordifolia was placed as a basal eudicot by constructing a genome-wide phylogenetic tree using multiple species. A comprehensive comparative evolutionary analysis of gene family contraction/expansion and multiple signatures of adaptive evolution was also performed. The genes involved in benzylisoquinoline alkaloid, terpenoid, lignin and flavonoid biosynthesis pathways were found with signatures of adaptive evolution. These evolutionary adaptations in genes provide genomic insights into the diverse medicinal properties of this plant. The genes involved in the common symbiosis signalling pathway associated with endosymbiosis (Arbuscular Mycorrhiza) were found to be adaptively evolved. The genes involved in adventitious root formation, peroxisome biogenesis, biosynthesis of phytohormones, and tolerance against abiotic and biotic stresses were also found to be adaptively evolved in T. cordifolia.
Affiliation(s)
- Shruti Mahajan
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, 462066, India
- Abhisek Chakraborty
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, 462066, India
- Manohar S Bisht
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, 462066, India
- Titas Sil
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, 462066, India
- Vineet K Sharma
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, 462066, India
12
Cui R, Wu J, Yan K, Luo S, Hu Y, Feng W, Lu B, Wang J. Phased genome assemblies reveal haplotype-specific genetic load in the critically endangered Chinese Bahaba (Teleostei, Sciaenidae). Mol Ecol 2024; 33:e17250. [PMID: 38179694 DOI: 10.1111/mec.17250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 12/06/2023] [Accepted: 12/11/2023] [Indexed: 01/06/2024]
Abstract
While haplotype-specific genetic load shapes the evolutionary trajectory of natural and captive populations, mixed-haplotype assembly and genotyping hindered its characterization in diploids. Herein, we produced two phased genome assemblies of the critically endangered fish Chinese Bahaba (Bahaba taipingensis, Sciaenidae, Teleostei) and resequenced 20 whole genomes to quantify population genetic load at a haplotype level. We identified frame-shifting variants as the most deleterious type, followed by mutations in the 5'-UTR, 3'-UTR and missense mutations at conserved amino acids. Phased haplotypes revealed gene deletions and high-impact deleterious variants. We estimated ~1.12% of genes missing or interrupted per haplotype, with a significant overlap of disrupted genes (30.35%) between haplotype sets. Relative proportions of deleterious variant categories differed significantly between haplotypes. Simulations suggested that purifying selection struggled to purge slightly deleterious genetic load in captive breeding compared to genotyping interventions, and that higher inter-haplotypic variance of genetic load predicted more efficient purging by artificial selection. Combining the knowledge of haplotype-resolved genetic load with predictive modelling will be immensely useful for understanding the evolution of deleterious variants and guiding conservation planning.
Affiliation(s)
- Rongfeng Cui
  - School of Ecology & State Key Laboratory of Biocontrol, Sun Yat-sen University, Shenzhen, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
- Jinxian Wu
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, Guangdong-Macao Joint Laboratory for Aquaculture Breeding Development and Innovation, School of Life Sciences, South China Normal University, Guangzhou, China
- Kuoqiu Yan
  - Huangjing Marine Biotechnology Co. Ltd., Huizhou, China
- Sujun Luo
  - Dongguan Forestry Affairs Center, Dongguan, China
- Yuting Hu
  - Dongguan Forestry Affairs Center, Dongguan, China
- Wei Feng
  - Dongguan Forestry Affairs Center, Dongguan, China
- Bingqian Lu
  - Dongguan Forestry Affairs Center, Dongguan, China
- Junjie Wang
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, Guangdong-Macao Joint Laboratory for Aquaculture Breeding Development and Innovation, School of Life Sciences, South China Normal University, Guangzhou, China
13
Khurana S, Katiyar A, Puraswani M, Sharma D, Walia K, Malhotra R, Mathur P. Molecular mechanisms of colistin- and multidrug-resistance in bacteria among patients with hospital-acquired infections. Future Sci OA 2023; 9:FSO896. [PMID: 37753358 PMCID: PMC10518808 DOI: 10.2144/fsoa-2022-0055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 08/08/2023] [Indexed: 09/28/2023] Open
Abstract
Aim: The increasing burden of resistance in Gram-negative bacteria (GNB) is becoming a major issue for hospital-acquired infections. Therefore, understanding the molecular mechanisms is important.
Methodology: Resistance genes of phenotypically colistin-resistant GNB (n = 60) were determined using whole genome sequencing. Antimicrobial susceptibility patterns were determined by Vitek®2 and broth microdilution.
Results: Of these phenotypically colistin-resistant isolates, 78% were also genetically resistant to colistin. Activation of efflux pumps and point mutations in the pmrB and mgrB genes conferred colistin resistance among GNB. Eight different strains of K. pneumoniae were identified, and ST43 was the most prominent strain, with capsular type-specific (cps) gene KL30.
Discussion: These results, in combination with rapid diagnostic methods, will help us better advise appropriate antimicrobial regimens.
Affiliation(s)
- Surbhi Khurana
  - Department of Laboratory Medicine, Jai Prakash Narayan Apex Trauma Centre, All India Institute of Medical Sciences, New Delhi
- Amit Katiyar
  - Centralized Core Research Facility, Bioinformatics Facility, All India Institute of Medical Sciences, New Delhi, India
- Mamta Puraswani
  - Department of Laboratory Medicine, Jai Prakash Narayan Apex Trauma Centre, All India Institute of Medical Sciences, New Delhi
- Divya Sharma
  - Department of Laboratory Medicine, Jai Prakash Narayan Apex Trauma Centre, All India Institute of Medical Sciences, New Delhi
- Kamini Walia
  - Epidemiology & Communicable Diseases, Indian Council of Medical Research
- Rajesh Malhotra
  - Department of Orthopedics, Chief, JPNA Trauma Centre, All India Institute of Medical Sciences, New Delhi
- Purva Mathur
  - Department of Laboratory Medicine, Jai Prakash Narayan Apex Trauma Centre, All India Institute of Medical Sciences, New Delhi
14
Martins IM, Seribelli AA, Machado Ribeiro TR, da Silva P, Lustri BC, Hernandes RT, Falcão JP, Moreira CG. Invasive non-typhoidal Salmonella (iNTS) aminoglycoside-resistant ST313 isolates feature unique pathogenic mechanisms to reach the bloodstream. Infect Genet Evol 2023; 116:105519. [PMID: 37890808 DOI: 10.1016/j.meegid.2023.105519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 10/19/2023] [Accepted: 10/24/2023] [Indexed: 10/29/2023]
Abstract
Invasive non-typhoidal Salmonella (iNTS) from the clonal type ST313 (S. Typhimurium ST313) is the primary cause of invasive salmonellosis in Africa. Recently, in Brazil, iNTS ST313 strains have been isolated from different sources, but the mechanisms by which these gut bacteria breach the gut barrier and reach the patient's bloodstream are poorly understood. Here, we compare 13 previously unreported strains of S. Typhimurium ST313, isolated from human blood cultures, investigating aspects of virulence and mechanisms of resistance. Initially, RNAseq analyses between ST13-blood isolate and SL1344 (ST19) prototype revealed 15 upregulated genes directly related to cellular invasion and replication, such as sopD2, sifB, and pipB. Limited information is available about S. Typhimurium ST313 pathogenesis and epidemiology, especially related to the global distribution of strains. Herein, strains isolated from different sources in Brazil were correlated to compare clinical and non-clinical isolates; a total of 22 genomes were studied by single nucleotide polymorphism (SNP) analysis. The epidemiological analysis of the 22 genomes of S. Typhimurium ST313 strains grouped them into three distinct clusters (A, B, and C), where cluster A comprised five, cluster B six, and cluster C 11 genomes. The 13 clinical blood isolates were all resistant to streptomycin, 92.3% of strains were resistant to ampicillin and 15.39% were resistant to kanamycin. The resistance genes acrA, acrB, mdtK, emrB, emrR, mdsA, and mdsB, related to the production of efflux pumps, were detected in all (100%) strains studied, similar to the pathogenic traits investigated. In conclusion, we show that S. Typhimurium ST313 strains isolated in Brazil have unique epidemiology. The elevated frequencies of virulence genes such as sseJ, sopD2, and pipB are a major concern in these Brazilian isolates, indicating a higher pathogenic potential.
Affiliation(s)
- Isabela Mancini Martins
  - Faculdade de Ciências Farmacêuticas de Araraquara, Universidade Estadual Paulista- UNESP- Departamento de Ciências Biológicas, Araraquara, SP, Brazil
- Amanda Aparecida Seribelli
  - Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo- USP, Ribeirão Preto, SP, Brazil
- Tamara R Machado Ribeiro
  - Faculdade de Ciências Farmacêuticas de Araraquara, Universidade Estadual Paulista- UNESP- Departamento de Ciências Biológicas, Araraquara, SP, Brazil
- Patrick da Silva
  - Faculdade de Ciências Farmacêuticas de Araraquara, Universidade Estadual Paulista- UNESP- Departamento de Ciências Biológicas, Araraquara, SP, Brazil
- Bruna Cardinali Lustri
  - Faculdade de Ciências Farmacêuticas de Araraquara, Universidade Estadual Paulista- UNESP- Departamento de Ciências Biológicas, Araraquara, SP, Brazil
- Rodrigo T Hernandes
  - Instituto de Biociências, Universidade Estadual Paulista- UNESP, Botucatu, SP, Brazil
- Juliana Pfrimer Falcão
  - Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo- USP, Ribeirão Preto, SP, Brazil
- Cristiano Gallina Moreira
  - Faculdade de Ciências Farmacêuticas de Araraquara, Universidade Estadual Paulista- UNESP- Departamento de Ciências Biológicas, Araraquara, SP, Brazil
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
15
Crosby WB, Karisch BB, Hiott LM, Pinnell LJ, Pittman A, Frye JG, Jackson CR, Loy JD, Epperson WB, Blanton J, Capik SF, Morley PS, Woolums AR. Tulathromycin metaphylaxis increases nasopharyngeal isolation of multidrug resistant Mannheimia haemolytica in stocker heifers. Front Vet Sci 2023; 10:1256997. [PMID: 38053814 PMCID: PMC10694364 DOI: 10.3389/fvets.2023.1256997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 10/25/2023] [Indexed: 12/07/2023] Open
Abstract
Bovine respiratory disease (BRD) is a leading cause of disease in feedlot and stocker calves with Mannheimia haemolytica (MH) as one of the most common etiologies. One of the most effective means of controlling BRD is through metaphylaxis, which involves administering antimicrobials to all animals at high risk of developing BRD. However, increasing prevalence of multidrug resistant (MDR) MH may reduce efficacy of metaphylaxis due to decreased susceptibility to drugs used for metaphylaxis. Primarily, this study aimed to determine the effect of tulathromycin metaphylaxis and subsequent BRD treatment on antimicrobial resistance (AMR) in MH isolated from stocker calves. Secondary objectives included evaluating the effect of metaphylaxis and treatment for BRD on animal health and comparing the genetic relationship of MH isolated. Crossbred beef heifers (n = 331, mean weight = 232, SD = 17.8 kg) at high risk for BRD were randomly assigned to receive tulathromycin metaphylaxis (META, n = 167) or not (NO META, n = 164). Nasopharyngeal swabs were collected for MH isolation, antimicrobial susceptibility testing and whole genome sequencing at arrival and 3 (WK3) and 10 (WK10) weeks later. Mixed-effects logistic regression was used to identify risk factors for isolation of MH and MDR MH (resistant to ≥3 antimicrobial drug classes) at 3 and 10 weeks, BRD morbidity, and crude mortality. Animals in the META group had higher odds of isolation of MDR MH at 3 weeks [OR (95% CI) = 13.08 (5-30.9), p < 0.0001] and 10 weeks [OR (95% CI) = 5.92 (1.34-26.14), p = 0.019] after arrival. There was no difference in risk of isolation of any MH (resistant or susceptible) between META and NO META groups at all timepoints. Animals in the NO META group had 3 times higher odds of being treated for BRD [WK3: OR (95% CI) = 3.07 (1.70-5.52), p = 0.0002; WK10: OR (95% CI) = 2.76 (1.59-4.80), p = 0.0002]. 
Antimicrobial resistance genes found within isolates were associated with integrative conjugative element (ICE) genes. Tulathromycin metaphylaxis increased risk of isolation of MDR MH and in this population, the increase in MDR MH appeared to be associated with ICE containing antimicrobial resistance genes for multiple antimicrobial classes. This may have important implications for future efficacy of antimicrobials for control and treatment of BRD.
Affiliation(s)
- William B. Crosby
  - Department of Pathobiology and Population Medicine, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States
- Brandi B. Karisch
  - Department of Animal and Dairy Sciences, College of Agriculture and Life Sciences, Mississippi State University, Mississippi State, MS, United States
- Lari M. Hiott
  - Poultry Microbiological Safety and Processing Research Unit, U.S. National Poultry Research Center, United States Department of Agriculture-Agricultural Research Service, Athens, GA, United States
- Lee J. Pinnell
  - VERO Program, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, Canyon, TX, United States
- Alexandra Pittman
  - Department of Animal and Dairy Sciences, College of Agriculture and Life Sciences, Mississippi State University, Mississippi State, MS, United States
- Jonathan G. Frye
  - Poultry Microbiological Safety and Processing Research Unit, U.S. National Poultry Research Center, United States Department of Agriculture-Agricultural Research Service, Athens, GA, United States
- Charlene R. Jackson
  - Poultry Microbiological Safety and Processing Research Unit, U.S. National Poultry Research Center, United States Department of Agriculture-Agricultural Research Service, Athens, GA, United States
- John Dustin Loy
  - Nebraska Veterinary Diagnostic Center, School of Veterinary Medicine and Biomedical Sciences, University of Nebraska-Lincoln, Lincoln, NE, United States
- William B. Epperson
  - Department of Pathobiology and Population Medicine, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States
- John Blanton
  - Department of Animal Sciences, College of Agriculture, Purdue University, West Lafayette, IN, United States
- Sarah F. Capik
  - Tumbleweed Veterinary Services, PLLC, Amarillo, TX, United States
- Paul S. Morley
  - VERO Program, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, Canyon, TX, United States
- Amelia R. Woolums
  - Department of Pathobiology and Population Medicine, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States
16
Samantray D, Tanwar AS, Murali TS, Brand A, Satyamoorthy K, Paul B. A Comprehensive Bioinformatics Resource Guide for Genome-Based Antimicrobial Resistance Studies. OMICS 2023; 27:445-460. [PMID: 37861712 DOI: 10.1089/omi.2023.0140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
The use of high-throughput sequencing technologies and bioinformatic tools has greatly transformed microbial genome research. With the help of sophisticated computational tools, it has become easier to perform whole genome assembly, identify and compare different species based on their genomes, and predict the presence of genes responsible for proteins, antimicrobial resistance, and toxins. These bioinformatics resources are likely to continuously improve in quality, become more user-friendly to analyze the multiple genomic data, efficient in generating information and translating it into meaningful knowledge, and enhance our understanding of the genetic mechanism of AMR. In this manuscript, we provide an essential guide for selecting the popular resources for microbial research, such as genome assembly and annotation, antibiotic resistance gene profiling, identification of virulence factors, and drug interaction studies. In addition, we discuss the best practices in computer-oriented microbial genome research, emerging trends in microbial genomic data analysis, integration of multi-omics data, the appropriate use of machine-learning algorithms, and open-source bioinformatics resources for genome data analytics.
Affiliation(s)
- Debyani Samantray
  - Department of Bioinformatics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, India
- Ankit Singh Tanwar
  - United Nations University-Maastricht Economic and Social Research Institute on Innovation and Technology (UNU-MERIT), Maastricht, The Netherlands
  - Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
- Thokur Sreepathy Murali
  - Department of Biotechnology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, India
- Angela Brand
  - United Nations University-Maastricht Economic and Social Research Institute on Innovation and Technology (UNU-MERIT), Maastricht, The Netherlands
  - Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
  - Department of Health Information, Prasanna School of Public Health (PSPH), Manipal Academy of Higher Education, Manipal, India
- Kapaettu Satyamoorthy
  - SDM College of Medical Sciences and Hospital, Shri Dharmasthala Manjunatheshwara (SDM) University, Dharwad, India
- Bobby Paul
  - Department of Bioinformatics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, India
17
Reyes-Umana VM, Coates JD. A description of the genus Denitromonas nom. rev.: Denitromonas iodatirespirans sp. nov., a novel iodate-reducing bacterium, and two novel perchlorate-reducing bacteria, Denitromonas halophila and Denitromonas ohlonensis, isolated from San Francisco Bay intertidal mudflats. Microbiol Spectr 2023; 11:e0091523. [PMID: 37772843 PMCID: PMC10581121 DOI: 10.1128/spectrum.00915-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 08/04/2023] [Indexed: 09/30/2023] Open
Abstract
The genus Denitromonas is currently a non-validated taxon that has been identified in several recent publications as a member of microbial communities arising from marine environments. Very little is known about the biology of Denitromonas spp., and no pure cultures are presently found in any culture collections. The current epithet of Denitromonas was given to the organism under the assumption that all members of this genus are denitrifying bacteria. This study performs phenotypic and genomic analyses on three new Denitromonas spp. isolated from tidal mudflats in the San Francisco Bay. We demonstrate that Denitromonas spp. are indeed all facultative denitrifying bacteria that utilize a variety of carbon sources such as acetate, lactate, and succinate. In addition, individual strains also use the esoteric electron acceptors perchlorate, chlorate, and iodate. Both 16S and Rps/Rpl phylogenetic analyses place Denitromonas spp. as a deep-branching clade in the family Zoogloeaceae, separate from Thauera spp., Azoarcus spp., and Aromatoleum spp. Genome sequencing reveals a G + C content ranging from 63.72% to 66.54% and genome sizes between 4.39 and 5.18 Mb. Genes for salt tolerance and denitrification are distinguishing features that separate Denitromonas spp. from the closely related Azoarcus and Aromatoleum genera.
IMPORTANCE The genus Denitromonas is currently a non-validated taxon that has been identified in several recent publications as a member of microbial communities arising from marine environments. Very little is known about the biology of Denitromonas spp., and no pure cultures are presently found in any culture collections. The current epithet of Denitromonas was given to the organism under the assumption that all members of this genus are denitrifying bacteria. This study performs phenotypic and genomic analyses on three Denitromonas spp.: Denitromonas iodatirespirans sp. nov., a novel iodate-reducing bacterium, and two novel perchlorate-reducing bacteria, Denitromonas halophila and Denitromonas ohlonensis, isolated from San Francisco Bay intertidal mudflats.
Affiliation(s)
- Victor M. Reyes-Umana
  - Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
- John D. Coates
  - Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
18
Moura FT, Helene LCF, Klepa MS, Ribeiro RA, Nogueira MA, Hungria M. Genomes of two type strains of the Rhizobium tropici group: R. calliandrae CCGE524 T and R. mayense CCGE526 T. Microbiol Resour Announc 2023; 12:e0047223. [PMID: 37540013 PMCID: PMC10508132 DOI: 10.1128/mra.00472-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 06/26/2023] [Indexed: 08/05/2023] Open
Abstract
The genome sequences of two nitrogen-fixing type strains of the Rhizobium tropici group were obtained: Rhizobium calliandrae CCGE524T and R. mayense CCGE526T. Genomic analyses confirmed their taxonomic position and identified three complete sequences of the repABC genes, indicative of three plasmids, one of them carrying symbiotic genes.
Affiliation(s)
- Fernanda Terezinha Moura
  - Department of Biochemistry and Biotechnology, Universidade Estadual de Londrina, Londrina, Paraná, Brazil
  - Embrapa Soja, Soil Biotechnology Laboratory, Londrina, Paraná, Brazil
  - CAPES, SBN, Brasília, Distrito Federal, Brazil
- Milena Serenato Klepa
  - Embrapa Soja, Soil Biotechnology Laboratory, Londrina, Paraná, Brazil
  - CNPq, Brasília, Distrito Federal, Brazil
- Marco Antonio Nogueira
  - Embrapa Soja, Soil Biotechnology Laboratory, Londrina, Paraná, Brazil
  - CNPq, Brasília, Distrito Federal, Brazil
- Mariangela Hungria
  - Department of Biochemistry and Biotechnology, Universidade Estadual de Londrina, Londrina, Paraná, Brazil
  - Embrapa Soja, Soil Biotechnology Laboratory, Londrina, Paraná, Brazil
  - CNPq, Brasília, Distrito Federal, Brazil
19
Moura FT, Helene LCF, Ribeiro RA, Nogueira MA, Hungria M. The outstanding diversity of rhizobia microsymbionts of common bean (Phaseolus vulgaris L.) in Mato Grosso do Sul, central-western Brazil, revealing new Rhizobium species. Arch Microbiol 2023; 205:325. [PMID: 37659972 DOI: 10.1007/s00203-023-03667-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Revised: 08/17/2023] [Accepted: 08/20/2023] [Indexed: 09/04/2023]
Abstract
Common bean is considered a legume of great socioeconomic importance, capable of establishing symbioses with a wide variety of rhizobial species. However, the legume has also been recognized for its low efficiency in fixing atmospheric nitrogen. Brazil is a hotspot of biodiversity, and in a previous study, we identified 13 strains isolated from common bean (Phaseolus vulgaris) nodules in three biomes of Mato Grosso do Sul state, central-western Brazil, that might represent new phylogenetic groups, deserving further polyphasic characterization. The phylogenetic tree of the 16S rRNA gene split the 13 strains into two large clades, seven in the R. etli and six in the R. tropici clade. The MLSA with four housekeeping genes (glnII, gyrB, recA, and rpoA) confirmed the phylogenetic allocation. Genomic comparisons indicated eight strains in five putative new species and the remaining five as R. phaseoli. The average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) comparing the putative new species and the closest neighbors ranged from 81.84 to 92.50% and 24.0 to 50.7%, respectively. Other phenotypic, genotypic, and symbiotic features were evaluated. Interestingly, some strains of both R. etli and R. tropici clades lost their nodulation capacity. The data support the description of the new species Rhizobium cerradonense sp. nov. (CNPSo 3464T), Rhizobium atlanticum sp. nov. (CNPSo 3490T), Rhizobium aureum sp. nov. (CNPSo 3968T), Rhizobium pantanalense sp. nov. (CNPSo 4039T), and Rhizobium centroccidentale sp. nov. (CNPSo 4062T).
Affiliation(s)
- Fernanda Terezinha Moura
  - Department of Biochemistry and Biotechnology, Universidade Estadual de Londrina, PR-445, Km 380, Cx. Postal 6001, Londrina, Paraná, CP 86.051-970, Brazil
  - Soil Biotechnology Laboratory, Embrapa Soja, Cx. Postal 4006, Londrina, Paraná, 86.085-981, Brazil
  - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), SBN, Quadra 2, Bloco L, Lote 06, Edifício Capes, Brasília, Distrito Federal, 70.040-020, Brazil
- Luisa Caroline Ferraz Helene
  - Soil Biotechnology Laboratory, Embrapa Soja, Cx. Postal 4006, Londrina, Paraná, 86.085-981, Brazil
  - Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), SHIS QI 1 Conjunto B, Blocos A, B, C e D, Lago Sul, Brasília, Distrito Federal, 71605-001, Brazil
  - Vittia Fertilizantes e Biológicos, São Joaquim da Barra, São Paulo, Brazil
- Renan Augusto Ribeiro
  - Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), SHIS QI 1 Conjunto B, Blocos A, B, C e D, Lago Sul, Brasília, Distrito Federal, 71605-001, Brazil
- Marco Antonio Nogueira
  - Soil Biotechnology Laboratory, Embrapa Soja, Cx. Postal 4006, Londrina, Paraná, 86.085-981, Brazil
  - Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), SHIS QI 1 Conjunto B, Blocos A, B, C e D, Lago Sul, Brasília, Distrito Federal, 71605-001, Brazil
- Mariangela Hungria
  - Department of Biochemistry and Biotechnology, Universidade Estadual de Londrina, PR-445, Km 380, Cx. Postal 6001, Londrina, Paraná, CP 86.051-970, Brazil
  - Soil Biotechnology Laboratory, Embrapa Soja, Cx. Postal 4006, Londrina, Paraná, 86.085-981, Brazil
  - Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), SHIS QI 1 Conjunto B, Blocos A, B, C e D, Lago Sul, Brasília, Distrito Federal, 71605-001, Brazil
|
20
|
Mahajan S, Bisht MS, Chakraborty A, Sharma VK. Genome of Phyllanthus emblica: the medicinal plant Amla with super antioxidant properties. Front Plant Sci 2023; 14:1210078. [PMID: 37727852 PMCID: PMC10505619 DOI: 10.3389/fpls.2023.1210078] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/21/2023] [Accepted: 08/15/2023] [Indexed: 09/21/2023]
Abstract
Phyllanthus emblica or Indian gooseberry, commonly known as amla, is an important medicinal horticultural plant used in traditional and modern medicines. It bears stone fruits with immense antioxidant properties due to being one of the richest natural sources of vitamin C and numerous flavonoids. This study presents the first genome sequencing of this species performed using 10x Genomics and Oxford Nanopore Technology. The draft genome assembly was 519 Mbp in size and consisted of 4,384 contigs, N50 of 597 Kbp, 98.4% BUSCO score, and 37,858 coding sequences. This study also reports the genome-wide phylogeny of this species with 26 other plant species that resolved the phylogenetic position of P. emblica. The presence of three ascorbate biosynthesis pathways including L-galactose, galacturonate, and myo-inositol pathways was confirmed in this genome. A comprehensive comparative evolutionary genomic analysis including gene family expansion/contraction and identification of multiple signatures of adaptive evolution provided evolutionary insights into ascorbate and flavonoid biosynthesis pathways and stone fruit formation through lignin biosynthesis. The availability of this genome will be beneficial for its horticultural, medicinal, dietary, and cosmetic applications and will also help in comparative genomics analysis studies.
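The contiguity figure quoted above (N50 of 597 Kbp) follows the standard N50 definition: the contig length at which contigs of that length or longer cover at least half of the assembly. A minimal Python sketch of that calculation is given here; the function name and the toy contig lengths are illustrative assumptions, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (total 300), not taken from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why the abstract reports it alongside a BUSCO completeness score.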
Affiliation(s)
- Vineet K. Sharma
  - MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, India
|
21
|
Espinosa E, Bautista R, Fernandez I, Larrosa R, Zapata EL, Plata O. Comparing assembly strategies for third-generation sequencing technologies across different genomes. Genomics 2023; 115:110700. [PMID: 37598732 DOI: 10.1016/j.ygeno.2023.110700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 08/07/2023] [Accepted: 08/16/2023] [Indexed: 08/22/2023]
Abstract
The recent advent of long-read sequencing technologies, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technology (ONT), has led to substantial improvements in accuracy and computational cost. However, de novo whole-genome assembly still presents significant challenges related to the computational cost and the quality of the results. Meanwhile, sequencing accuracy and throughput continue to improve, and new tools are constantly emerging. Therefore, selecting the correct sequencing platform, sequencing depth, and assembly tools is necessary for high-quality assembly. This paper evaluates the primary assembly reconstruction from recent hybrid and non-hybrid pipelines on different genomes. We find that PacBio high-fidelity (HiFi) long reads play an essential role in haplotype construction compared with ONT reads. However, we observe a substantial improvement in assembly correctness from high-fidelity ONT datasets and from combining them with HiFi or short reads.
Affiliation(s)
- Elena Espinosa
  - Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain
- Rocio Bautista
  - Supercomputing and Bioinnovation Center, University of Malaga, C. Severo Ochoa, 34, Malaga 29590, Spain
- Ivan Fernandez
  - Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain; Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, C. Jordi Girona, 1-3, Barcelona 08034, Spain
- Rafael Larrosa
  - Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain; Supercomputing and Bioinnovation Center, University of Malaga, C. Severo Ochoa, 34, Malaga 29590, Spain
- Emilio L Zapata
  - Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain; Supercomputing and Bioinnovation Center, University of Malaga, C. Severo Ochoa, 34, Malaga 29590, Spain
- Oscar Plata
  - Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain
|
22
|
Cavattoni M, Comin M. ClassGraph: Improving Metagenomic Read Classification with Overlap Graphs. J Comput Biol 2023. [PMID: 37023405 DOI: 10.1089/cmb.2022.0208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Abstract
Current technologies allow the sequencing of microbial communities directly from the environment without prior culturing. One of the major problems when analyzing a microbial sample is to taxonomically annotate its reads to identify the species it contains. Most methods that are currently available focus on the classification of reads using a set of reference genomes and their k-mers. While in terms of precision these methods have reached percentages of correctness close to perfection, in terms of sensitivity (the actual number of classified reads), the performance is often poor. One reason is that the reads in a sample can be very different from the corresponding reference genomes; for example, viral genomes are usually highly mutated. To address this issue, in this article, we propose ClassGraph, a new taxonomic classification method that makes use of the read overlap graph and applies a label propagation algorithm to refine the results of existing tools. We evaluated its performance on simulated and real datasets with several taxonomic classification tools, and the results showed an improved sensitivity and F-measure, while maintaining high precision. ClassGraph is capable of improving the classification accuracy, especially in difficult cases such as virus and real datasets, where traditional tools can classify <40% of reads.
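The abstract does not spell out the propagation rule, so the following is only a generic majority-vote label propagation over a read overlap graph, not ClassGraph's actual implementation. The function name, the dictionary-based graph encoding, the round limit, and the toy reads are all assumptions made for illustration.

```python
from collections import Counter


def propagate_labels(overlaps, labels, rounds=10):
    """Spread taxon labels from classified reads to unclassified neighbours.

    overlaps: dict mapping each read ID to the set of reads it overlaps.
    labels:   dict mapping already-classified read IDs to a taxon.
    Reads labelled by the upstream classifier are never relabelled.
    """
    current = dict(labels)
    for _ in range(rounds):
        updates = {}
        for read, neighbours in overlaps.items():
            if read in current:
                continue  # keep labels that are already assigned
            votes = Counter(current[n] for n in neighbours if n in current)
            if votes:
                # Majority vote; ties go to the alphabetically first taxon.
                updates[read] = max(sorted(votes), key=lambda t: votes[t])
        if not updates:
            break  # nothing changed, so propagation has converged
        current.update(updates)
    return current


# Hypothetical chain of overlapping reads; only r1 was classified upstream.
toy_overlaps = {
    "r1": {"r2"},
    "r2": {"r1", "r3"},
    "r3": {"r2", "r4"},
    "r4": {"r3"},
    "r5": set(),  # isolated read, stays unclassified
}
print(propagate_labels(toy_overlaps, {"r1": "virusA"}))
```

In this sketch a read gains a label only through its overlaps, so sensitivity can rise without touching the labels the original classifier produced.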
Affiliation(s)
- Matteo Comin
  - Department of Information Engineering, University of Padova, Padova, Italy
|
23
|
Rosa IF, Peçanha APB, Carvalho TRB, Alexandre LS, Ferreira VG, Doretto LB, Souza BM, Nakajima RT, da Silva P, Barbosa AP, Gomes-de-Pontes L, Bomfim CG, Machado-Santelli GM, Condino-Neto A, Guzzo CR, Peron JPS, Andrade-Silva M, Câmara NOS, Garnique AMB, Medeiros RJ, Ferraris FK, Barcellos LJG, Correia-Junior JD, Galindo-Villegas J, Machado MFR, Castoldi A, Oliveira SL, Costa CC, Belo MAA, Galdino G, Sgro GG, Bueno NF, Eto SF, Veras FP, Fernandes BHV, Sanches PRS, Cilli EM, Malafaia G, Nóbrega RH, Garcez AS, Carrilho E, Charlie-Silva I. Photobiomodulation Reduces the Cytokine Storm Syndrome Associated with COVID-19 in the Zebrafish Model. Int J Mol Sci 2023; 24:ijms24076104. [PMID: 37047078 PMCID: PMC10094635 DOI: 10.3390/ijms24076104] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/15/2023] [Revised: 03/11/2023] [Accepted: 03/14/2023] [Indexed: 04/14/2023]
Abstract
Although the exact mechanism of the pathogenesis of coronavirus SARS-CoV-2 (COVID-19) is not fully understood, oxidative stress and the release of pro-inflammatory cytokines have been highlighted as playing a vital role in the pathogenesis of the disease. In this sense, alternative treatments are needed to reduce the level of inflammation caused by COVID-19. Therefore, this study aimed to investigate the potential effect of red photobiomodulation (PBM) as an attractive therapy to downregulate the cytokine storm caused by COVID-19 in a zebrafish model. RT-qPCR analyses and protein-protein interaction prediction among SARS-CoV-2 and Danio rerio proteins showed that recombinant Spike protein (rSpike) was responsible for generating systemic inflammatory processes with significantly increased levels of pro-inflammatory (il1b, il6, tnfa, and nfkbiab), oxidative stress (romo1) and energy metabolism (slc2a1a and coa1) mRNA markers, with a pattern similar to those observed in COVID-19 cases in humans. On the other hand, PBM treatment was able to decrease the mRNA levels of these pro-inflammatory and oxidative stress markers compared with rSpike in various tissues, promoting an anti-inflammatory response. Conversely, PBM promotes cellular and tissue repair of injured tissues and significantly increases the survival rate of rSpike-inoculated individuals. Additionally, metabolomics analysis showed that the most-impacted metabolic pathways between PBM and the rSpike treated groups were related to steroid metabolism, immune system, and lipid metabolism. Together, our findings suggest that the inflammatory process is an incisive feature of COVID-19 and red PBM can be used as a novel therapeutic agent for COVID-19 by regulating the inflammatory response. Nevertheless, the need for more clinical trials remains, and there is a significant gap to overcome before clinical trials can commence.
Affiliation(s)
- Ivana F Rosa
  - Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University (UNESP), Botucatu 01049-010, Brazil
- Ana P B Peçanha
  - Department of Orthodontics, São Leopoldo Mandic College, Campinas 13045-755, Brazil
- Tábata R B Carvalho
  - Department of Orthodontics, São Leopoldo Mandic College, Campinas 13045-755, Brazil
- Leonardo S Alexandre
  - Instituto de Química de São Carlos, Universidade de São Paulo, São Carlos 13566-590, Brazil
  - The National Institute of Science and Technology in Bioanalyses, INCTBio, Campinas 13083-970, Brazil
- Vinícius G Ferreira
  - Instituto de Química de São Carlos, Universidade de São Paulo, São Carlos 13566-590, Brazil
  - The National Institute of Science and Technology in Bioanalyses, INCTBio, Campinas 13083-970, Brazil
- Lucas B Doretto
  - Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University (UNESP), Botucatu 01049-010, Brazil
- Beatriz M Souza
  - Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University (UNESP), Botucatu 01049-010, Brazil
- Rafael T Nakajima
  - Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University (UNESP), Botucatu 01049-010, Brazil
- Patrick da Silva
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Ana P Barbosa
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Leticia Gomes-de-Pontes
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Camila G Bomfim
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Antonio Condino-Neto
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Cristiane R Guzzo
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Jean P S Peron
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Magaiver Andrade-Silva
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Niels O S Câmara
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Anali M B Garnique
  - Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo 05508-220, Brazil
- Leonardo J G Barcellos
  - Laboratório de Fisiologia de Peixes, Programa de Pós-Graduação em Bioexperimentação, Escola de Ciências Agrárias, Inovação e Negócios, Universidade de Passo Fundo, Passo Fundo 99052-900, Brazil
- Jose D Correia-Junior
  - Institute of Biomedical Sciences, Federal University Minas Gerais, Belo Horizonte 31270-901, Brazil
- Jorge Galindo-Villegas
  - Department of Genomics, Faculty of Biosciences and Aquaculture, Nord University, 8026 Bodø, Norway
- Mônica F R Machado
  - Biological Sciences Special Academic Unit, Federal University of Jatai, Jatai 75804-020, Brazil
- Angela Castoldi
  - Keizo Asami Institute, Federal University of Pernambuco, Recife 50670-901, Brazil
- Susana L Oliveira
  - School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, Brazil
- Camila C Costa
  - School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, Brazil
- Marco A A Belo
  - School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, Brazil
- Giovane Galdino
  - Institute of Motricity Sciences, Department of Physical Therapy, Federal University of Alfenas, Alfenas 37133-840, Brazil
- Germán G Sgro
  - Departamento de Ciências Biomoleculares, Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo, São Paulo 14040-900, Brazil
- Natalia F Bueno
  - Integrated Structural Biology Platform, Carlos Chagas Institute, FIOCRUZ Paraná, Curitiba 81310-020, Brazil
- Silas F Eto
  - Center of Innovation and Development, Laboratory of Development and Innovation Butantan Institute, São Paulo 69310-000, Brazil
- Flávio P Veras
  - Faculty of Medicine, University of São Paulo (USP), Ribeirão Preto 14040-900, Brazil
- Bianca H V Fernandes
  - Laboratory of Genetic and Sanitary Control, Technical Board of Support for Teaching and Research, Faculty of Medicine, University of Sao Paulo, São Paulo 01246-903, Brazil
- Paulo R S Sanches
  - Department of Biochemistry and Organic Chemistry, Institute of Chemistry, São Paulo State University (UNESP), Araraquara 14800-060, Brazil
- Eduardo M Cilli
  - Department of Biochemistry and Organic Chemistry, Institute of Chemistry, São Paulo State University (UNESP), Araraquara 14800-060, Brazil
- Guilherme Malafaia
  - Laboratory of Toxicology Applied to the Environment, Goiano Federal Institute, Urutaí Campus, Urutaí 75790-000, Brazil
- Rafael H Nóbrega
  - Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University (UNESP), Botucatu 01049-010, Brazil
- Aguinaldo S Garcez
  - Department of Orthodontics, São Leopoldo Mandic College, Campinas 13045-755, Brazil
- Emanuel Carrilho
  - Instituto de Química de São Carlos, Universidade de São Paulo, São Carlos 13566-590, Brazil
  - The National Institute of Science and Technology in Bioanalyses, INCTBio, Campinas 13083-970, Brazil
- Ives Charlie-Silva
  - Department of Biochemistry and Organic Chemistry, Institute of Chemistry, São Paulo State University (UNESP), Araraquara 14800-060, Brazil
|
24
|
Krinos AI, Cohen NR, Follows MJ, Alexander H. Reverse engineering environmental metatranscriptomes clarifies best practices for eukaryotic assembly. BMC Bioinformatics 2023; 24:74. [PMID: 36869298 PMCID: PMC9983209 DOI: 10.1186/s12859-022-05121-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 12/21/2022] [Indexed: 03/05/2023]
Abstract
BACKGROUND: Diverse communities of microbial eukaryotes in the global ocean provide a variety of essential ecosystem services, from primary production and carbon flow through trophic transfer to cooperation via symbioses. Increasingly, these communities are being understood through the lens of omics tools, which enable high-throughput processing of diverse communities. Metatranscriptomics offers an understanding of near real-time gene expression in microbial eukaryotic communities, providing a window into community metabolic activity. RESULTS: Here we present a workflow for eukaryotic metatranscriptome assembly, and validate the ability of the pipeline to recapitulate real and manufactured eukaryotic community-level expression data. We also include an open-source tool for simulating environmental metatranscriptomes for testing and validation purposes. We reanalyze previously published metatranscriptomic datasets using our metatranscriptome analysis approach. CONCLUSION: We determined that a multi-assembler approach improves eukaryotic metatranscriptome assembly based on recapitulated taxonomic and functional annotations from an in-silico mock community. The systematic validation of metatranscriptome assembly and annotation methods provided here is a necessary step to assess the fidelity of our community composition measurements and functional content assignments from eukaryotic metatranscriptomes.
Affiliation(s)
- Arianna I Krinos
  - MIT-WHOI Joint Program in Oceanography and Applied Ocean Science and Engineering, Cambridge and Woods Hole, MA, USA
  - Department of Biology, Woods Hole Oceanographic Institution, Woods Hole, MA, USA
  - Department of Earth, Atmospheric, and Planetary Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Natalie R Cohen
  - Skidaway Institute of Oceanography, University of Georgia, Savannah, GA, USA
- Michael J Follows
  - Department of Earth, Atmospheric, and Planetary Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Harriet Alexander
  - Department of Biology, Woods Hole Oceanographic Institution, Woods Hole, MA, USA
|
25
|
Thippabhotla S, Liu B, Podgorny A, Yooseph S, Yang Y, Zhang J, Zhong C. Integrated de novo gene prediction and peptide assembly of metagenomic sequencing data. NAR Genom Bioinform 2023; 5:lqad023. [PMID: 36915411 PMCID: PMC10006731 DOI: 10.1093/nargab/lqad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Revised: 12/03/2022] [Accepted: 02/18/2023] [Indexed: 03/16/2023]
Abstract
Metagenomics is the study of all genomic content contained in given microbial communities. Metagenomic functional analysis aims to quantify protein families and reconstruct metabolic pathways from the metagenome. It plays a central role in understanding the interaction between the microbial community and its host or environment. De novo functional analysis, which allows the discovery of novel protein families, remains challenging for high-complexity communities. There are currently three main approaches for recovering novel genes or proteins: de novo nucleotide assembly, gene calling and peptide assembly. Unfortunately, their information dependency has been overlooked, and each has been formulated as an independent problem. In this work, we develop a sophisticated workflow called integrated Metagenomic Protein Predictor (iMPP), which leverages the information dependencies for better de novo functional analysis. iMPP contains three novel modules: a hybrid assembly graph generation module, a graph-based gene calling module, and a peptide assembly-based refinement module. iMPP significantly improved the existing gene calling sensitivity on unassembled metagenomic reads, achieving a 92-97% recall rate at a high precision level (>85%). iMPP further allowed for more sensitive and accurate peptide assembly, recovering more reference proteins and delivering more hypothetical protein sequences. The high performance of iMPP can provide a more comprehensive and unbiased view of the microbial communities under investigation. iMPP is freely available from https://github.com/Sirisha-t/iMPP.
Affiliation(s)
- Sirisha Thippabhotla
  - Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS 66045, USA
- Ben Liu
  - Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS 66045, USA
- Adam Podgorny
  - Center for Computational Biology, The University of Kansas, Lawrence, KS 66045, USA
- Shibu Yooseph
  - Department of Computer Science, Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA
- Youngik Yang
  - National Marine Biodiversity Institute of Korea, 101-75, Jangsan-ro, Janghang-eup, Seochun-gun, Chungchungnam-do, 33662, South Korea
- Jun Zhang
  - Division of Medical Oncology, Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS 66160, USA; Department of Cancer Biology, University of Kansas Cancer Center, Kansas City, KS 66160, USA
- Cuncong Zhong
  - Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS 66045, USA
|
26
|
The genome of a Far Eastern isolate of Diaporthe caulivora, a soybean fungal pathogen. Appl Microbiol Biotechnol 2023; 107:1311-1327. [PMID: 36650392 DOI: 10.1007/s00253-023-12370-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 12/26/2022] [Accepted: 12/31/2022] [Indexed: 01/19/2023]
Abstract
Diaporthe caulivora is an economically important fungal pathogen and a causal agent of soybean stem canker and seed decay. Here, the genome of a Russian Far Eastern isolate of D. caulivora was sequenced, assembled, and announced. Assembly quality was sufficient for advanced annotation, including prediction of potential disease-related genes encoding virulence factors and molecular determinants contributing to pathogen-host selection, interactions, and adaptation. Comparative analysis of 15 Diaporthe species was conducted regarding general genome properties, collinearity, and proteomes, and included detailed investigation of interspersed repeats. A notable feature of this analysis is a high recombinant variability of Diaporthe genomes, determined by the number and distribution of interspersed repeats, which also proved to be responsible for the diversity of GC content and genome size. This variability is assumed to be the main determinant of the divergence of Diaporthe genomes. A Bayesian multi-gene phylogeny was inferred for the 15 Diaporthe species on the basis of twenty thousand polymorphic sites of >100 orthologous genes using independently adjusted evolutionary models. This allowed for the most accurate determination of evolutionary relationships and species boundaries for effective reporting about these plant pathogens. The evidence, obtained by different genome analysis techniques, implies the host-independent evolution of Diaporthe species. KEY POINTS: • The genome of a Far Eastern isolate of D. caulivora was announced. • A high degree of recombinant variability determines genomic divergence in the Diaporthe genus. • The multi-gene phylogeny implies host-independent evolution of Diaporthe species.
|
27
|
Weist P, Jentoft S, Tørresen OK, Schade FM, Pampoulie C, Krumme U, Hanel R. The role of genomic signatures of directional selection and demographic history in the population structure of a marine teleost with high gene flow. Ecol Evol 2022; 12:e9602. [PMID: 36514551 PMCID: PMC9731920 DOI: 10.1002/ece3.9602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Revised: 11/14/2022] [Accepted: 11/21/2022] [Indexed: 12/13/2022]
Abstract
Recent studies have uncovered patterns of genomic divergence in marine teleosts where panmixia due to high gene flow has been the general paradigm. These signatures of divergent selection are often impacted by structural variants, acting as "supergenes" facilitating local adaptation. The highly dispersing European plaice (Pleuronectes platessa), in which putative structural variants (i.e., inversions) have been identified, has successfully colonized the brackish water ecosystem of the Baltic Sea. Thus, the species represents an ideal opportunity to investigate how the interplay of gene flow, structural variants, natural selection, and past demographic history impacts population (sub)structuring in marine systems. Here, we report on the generation of an annotated draft plaice genome assembly in combination with population sequencing data (following the salinity gradient from the Baltic Sea into the North Sea, together with samples from Icelandic waters) to illuminate genome-wide patterns of divergence. Neutral markers pointed at large-scale panmixia across the European continental shelf associated with high gene flow and a common postglacial colonization history of shelf populations. However, based on genome-wide outlier loci, we uncovered signatures of population substructuring among the European continental shelf populations, suggesting signs of ongoing selection. Genome-wide selection analyses (xp-EHH) and the identification of genes within genomic regions of recent selective sweeps, overlapping with the outlier loci, suggest that these represent the signs of divergent selection. Our findings provide support for genomic divergence driven by local adaptation in the face of high gene flow and elucidate the relative importance of demographic history versus adaptive divergence in shaping the contemporary population genetic structure of a marine teleost. The role of the putative inversion(s) in the substructuring, and potentially ongoing adaptation, was seemingly not substantial.
Affiliation(s)
- Peggy Weist
  - Thünen Institute of Fisheries Ecology, Bremerhaven, Germany
- Sissel Jentoft
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Ole K. Tørresen
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Uwe Krumme
  - Thünen Institute of Baltic Sea Fisheries, Rostock, Germany
|
28
|
Rayamajhi N, Cheng CHC, Catchen JM. Evaluating Illumina-, Nanopore-, and PacBio-based genome assembly strategies with the bald notothen, Trematomus borchgrevinki. G3 (Bethesda) 2022; 12:jkac192. [PMID: 35904764 PMCID: PMC9635638 DOI: 10.1093/g3journal/jkac192] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/21/2022] [Accepted: 07/18/2022] [Indexed: 11/16/2022]
Abstract
For any genome-based research, a robust genome assembly is required. De novo assembly strategies have evolved with changes in DNA sequencing technologies and have been through at least 3 phases: (1) short-read only, (2) short- and long-read hybrid, and (3) long-read only assemblies. Each of the phases has its own error model. We hypothesized that hidden short-read scaffolding errors and erroneous long-read contigs degrade the quality of short- and long-read hybrid assemblies. We assembled the genome of Trematomus borchgrevinki from data generated during each of the 3 phases and assessed the quality problems we encountered. We developed strategies such as k-mer-assembled region replacement, parameter optimization, and long-read sampling to address the error models. We demonstrated that a k-mer-based strategy improved short-read assemblies as measured by Benchmarking Universal Single-Copy Ortholog while mate-pair libraries introduced hidden scaffolding errors and perturbed Benchmarking Universal Single-Copy Ortholog scores. Furthermore, we found that although hybrid assemblies can generate higher contiguity they tend to suffer from lower quality. In addition, we found long-read-only assemblies can be optimized for contiguity by subsampling length-restricted raw reads. Our results indicate that long-read contig assembly is the current best choice and that assemblies from phase I and phase II were of lower quality.
Affiliation(s)
- Niraj Rayamajhi
  - Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
- Chi-Hing Christina Cheng
  - Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
- Julian M Catchen
  - Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
|
29
|
Midekso FD, Yi G. RFfiller: a robust and fast statistical algorithm for gap filling in draft genomes. PeerJ 2022; 10:e14186. [PMID: 36262414 PMCID: PMC9575681 DOI: 10.7717/peerj.14186] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/25/2022] [Accepted: 09/14/2022] [Indexed: 01/24/2023]
Abstract
Numerous published genomes contain gaps or unknown sequences. Gap filling is a critical final step in de novo genome assembly, particularly for large genomes. While certain computational approaches partially address the problem, others have shortcomings regarding the draft genome's dependability and correctness (high rates of mis-assembly at gap-closing sites and high error rates). While it is well established that genomic repeats result in gaps, many sequence reads originating from repeat-related gaps are typically missed by existing approaches. A fast and reliable statistical algorithm for closing gaps in a draft genome is presented in this paper. It utilizes the alignment statistics between scaffolds, contigs, and paired-end reads to generate a Markov chain that appropriately assigns contigs or long reads to scaffold gap regions (only corrects candidate regions), resulting in accurate and efficient gap closure. To reconstruct the missing component between the two ends of the same insert, the RFfiller meticulously searches for valid overlaps (in repeat regions) and generates transition tables for similar reads, allowing it to make a statistical guess at the missing sequence. Finally, in our experiments, we show that the RFfiller's gap-closing accuracy is better than that of other publicly available tools when sequence data from various organisms are used. Assembly benchmarks were used to validate RFfiller. Our findings show that RFfiller efficiently fills gaps and that it is especially effective when the gap length is longer. We also show that the RFfiller outperforms other gap closing tools currently on the market.
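As a rough illustration of how transition tables built from reads can yield a statistical guess at a missing sequence, here is a toy order-k Markov sketch in Python. It is not RFfiller's algorithm: the function names, the fixed context length k, and the greedy most-frequent-base choice are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_transitions(reads, k=3):
    """Count which base follows each k-mer context across all reads."""
    table = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k):
            table[read[i:i + k]][read[i + k]] += 1
    return table


def guess_gap(left_flank, gap_len, table, k=3):
    """Greedily extend the left flank by up to gap_len bases.

    At each step the most frequent next base for the current k-mer
    context is appended; extension stops early on an unseen context.
    """
    seq = left_flank
    for _ in range(gap_len):
        counts = table.get(seq[-k:])
        if not counts:
            break  # no read supports this context
        # Most frequent base; ties go to the alphabetically first base.
        seq += max(sorted(counts), key=lambda b: counts[b])
    return seq[len(left_flank):]


# Hypothetical reads spanning a gap that follows the flank "ACG".
toy_reads = ["ACGTAC", "CGTACG", "GTACGT"]
table = build_transitions(toy_reads)
print(guess_gap("ACG", 4, table))
```

A greedy walk like this cannot resolve repeats on its own, which is consistent with the abstract's emphasis on checking for valid overlaps in repeat regions before accepting a fill.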
Affiliation(s)
- Firaol Dida Midekso
  - Department of Multimedia Engineering, Dongguk University, Seoul, South Korea
- Gangman Yi
  - Department of Multimedia Engineering, Dongguk University, Seoul, South Korea
|
30
|
Development of Genomic Resources in Mexican Bursera (Section: Bullockia: Burseraceae): Genome Assembly, Annotation, and Marker Discovery for Three Copal Species. Genes (Basel) 2022; 13:genes13101741. [PMID: 36292626 PMCID: PMC9601875 DOI: 10.3390/genes13101741] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/31/2022] [Revised: 09/21/2022] [Accepted: 09/23/2022] [Indexed: 11/17/2022] Open
Abstract
Bursera comprises ~100 tropical shrub and tree species, with the center of the species diversification in Mexico. The genomic resources developed for the genus are scarce, and this has limited the study of the gene flow, local adaptation, and hybridization dynamics. In this study, based on ~155 million Illumina paired-end reads per species, we performed a de novo genome assembly and annotation of three Bursera species of the Bullockia section: Bursera bipinnata, Bursera cuneata, and Bursera palmeri. The total lengths of the genome assemblies were 253, 237, and 229 Mb for B. cuneata, B. palmeri, and B. bipinnata, respectively. The assembly of B. palmeri retrieved the most complete and single-copy BUSCOs (87.3%) relative to B. cuneata (86.5%) and B. bipinnata (76.6%). The ab initio gene prediction recognized between 21,000 and 32,000 protein-coding genes. Other genomic features, such as simple sequence repeats (SSRs), were also detected. Using the de novo genome assemblies as a reference, we identified single-nucleotide polymorphisms (SNPs) for a set of 43 Bursera individuals. Moreover, we mapped the filtered reads of each Bursera species against the chloroplast genomes of five Burseraceae species, obtaining consensus sequences ranging from 156 to 160 kb in length. Our work contributes to the generation of genomic resources for an important but understudied genus of tropical-dry-forest species.
|
31
|
Yu Y, Zhang Z, Dong X, Yang R, Duan Z, Xiang Z, Li J, Li G, Yan F, Xue H, Jiao D, Lu J, Lu H, Zhang W, Wei Y, Fan S, Li J, Jia J, Zhang J, Ji J, Liu P, Lu H, Zhao H, Chen S, Wei C, Chen H, Zhu Z. Pangenomic analysis of Chinese gastric cancer. Nat Commun 2022; 13:5412. [PMID: 36109518 PMCID: PMC9477819 DOI: 10.1038/s41467-022-33073-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/06/2022] [Accepted: 08/31/2022] [Indexed: 11/25/2022] Open
Abstract
Pangenomic studies might improve the completeness of the human reference genome (GRCh38) and promote precision medicine. Here, we use an automated pipeline of human pangenomic analysis to build a gastric cancer pan-genome from 185 paired deep sequencing datasets (370 samples), and characterize the gene presence-absence variations (PAVs) at the whole-genome level. The genes ACOT1, GSTM1, SIGLEC14 and UGT2B17 are identified as highly absent genes in the gastric cancer population. A set of genes is predicted from sequences that do not align to GRCh38. We successfully locate one of the predicted genes, GC0643, on chromosome 9q34.2. Overexpression of GC0643 significantly inhibits cell growth, cell migration and invasion, and cell cycle progression, and induces cell apoptosis in cancer cells. The tumor suppressor functions can be reversed by shGC0643 knockdown. GC0643 has been accepted by the NCBI database (GenBank: MW194843.1). Collectively, the robust pan-genome strategy provides a deeper understanding of the gene PAVs in the human cancer genome. Human pan-genomics is increasing our knowledge of genomic diversity and genetic factors in disease. Here, the authors built a gastric cancer pan-genome that included the sequences of Chinese Han patients, and predicted putative and previously unaligned genes associated with gastric cancer.
|
32
|
Polonio CM, da Silva P, Russo FB, Hyppolito BRN, Zanluqui NG, Benazzato C, Beltrão-Braga PCB, Muxel SM, Peron JPS. microRNAs Control Antiviral Immune Response, Cell Death and Chemotaxis Pathways in Human Neuronal Precursor Cells (NPCs) during Zika Virus Infection. Int J Mol Sci 2022; 23:ijms231810282. [PMID: 36142200 PMCID: PMC9499039 DOI: 10.3390/ijms231810282] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/07/2022] [Revised: 08/23/2022] [Accepted: 08/24/2022] [Indexed: 11/16/2022] Open
Abstract
Viral infections have always been a serious burden to public health, increasing morbidity and mortality rates worldwide. Zika virus (ZIKV) is a flavivirus transmitted by the Aedes aegypti vector and the causative agent of severe fetal neuropathogenesis and microcephaly. The virus crosses the placenta and reaches the fetal brain, mainly causing the death of neuronal precursor cells (NPCs), glial inflammation, and subsequent tissue damage. Genetic differences, mainly related to the antiviral immune response and cell death pathways greatly influence the susceptibility to infection. These components are modulated by many factors, including microRNAs (miRNAs). MiRNAs are small noncoding RNAs that regulate post-transcriptionally the overall gene expression, including genes for the neurodevelopment and the formation of neural circuits. In this context, we investigated the pathways and target genes of miRNAs modulated in NPCs infected with ZIKV. We observed downregulation of miR-302b, miR-302c and miR-194, whereas miR-30c was upregulated in ZIKV infected human NPCs in vitro. The analysis of a public dataset of ZIKV-infected human NPCs evidenced 262 upregulated and 3 downregulated genes, of which 142 were the target of the aforementioned miRNAs. Further, we confirmed a correlation between miRNA and target genes affecting pathways related to antiviral immune response, cell death and immune cells chemotaxis, all of which could contribute to the establishment of microcephaly and brain lesions. Here, we suggest that miRNAs target gene expression in infected NPCs, directly contributing to the pathogenesis of fetal microcephaly.
Affiliation(s)
- Carolina M. Polonio
  - Neuroimmune Interactions Laboratory, Department of Immunology, University of São Paulo, São Paulo 05508-000, Brazil
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
- Patrick da Silva
  - Neuroimmune Interactions Laboratory, Department of Immunology, University of São Paulo, São Paulo 05508-000, Brazil
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
- Fabiele B. Russo
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
  - Disease Modeling Laboratory at Department of Microbiology, Institute of Biomedical Sciences, São Paulo 05508-000, Brazil
- Brendo R. N. Hyppolito
  - Neuroimmune Interactions Laboratory, Department of Immunology, University of São Paulo, São Paulo 05508-000, Brazil
  - Immunopathology and Allergy Post Graduate Program, School of Medicine, University of São Paulo, São Paulo 05508-000, Brazil
- Nagela G. Zanluqui
  - Neuroimmune Interactions Laboratory, Department of Immunology, University of São Paulo, São Paulo 05508-000, Brazil
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
  - Immunopathology and Allergy Post Graduate Program, School of Medicine, University of São Paulo, São Paulo 05508-000, Brazil
- Cecília Benazzato
  - Disease Modeling Laboratory at Department of Microbiology, Institute of Biomedical Sciences, São Paulo 05508-000, Brazil
- Patrícia C. B. Beltrão-Braga
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
  - Disease Modeling Laboratory at Department of Microbiology, Institute of Biomedical Sciences, São Paulo 05508-000, Brazil
- Sandra M. Muxel
  - Neuroimmune Interactions Laboratory, Department of Immunology, University of São Paulo, São Paulo 05508-000, Brazil
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
  - Correspondence: (S.M.M.); (J.P.S.P.)
- Jean Pierre S. Peron
  - Neuroimmune Interactions Laboratory, Department of Immunology, University of São Paulo, São Paulo 05508-000, Brazil
  - Scientific Platform Pasteur-USP (SPPU), University of São Paulo, São Paulo 05508-000, Brazil
  - Immunopathology and Allergy Post Graduate Program, School of Medicine, University of São Paulo, São Paulo 05508-000, Brazil
  - Correspondence: (S.M.M.); (J.P.S.P.)
|
33
|
Metagenomic methylation patterns resolve bacterial genomes of unusual size and structural complexity. The ISME Journal 2022; 16:1921-1931. [PMID: 35459792 PMCID: PMC9296519 DOI: 10.1038/s41396-022-01242-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/18/2021] [Revised: 04/05/2022] [Accepted: 04/08/2022] [Indexed: 01/01/2023]
Abstract
The plasticity of bacterial and archaeal genomes makes examining their ecological and evolutionary dynamics both exciting and challenging. The same mechanisms that enable rapid genomic change and adaptation confound current approaches for recovering complete genomes from metagenomes. Here, we use strain-specific patterns of DNA methylation to resolve complex bacterial genomes from long-read metagenomic data of a marine microbial consortium, the “pink berries” of the Sippewissett Marsh (USA). Unique combinations of restriction-modification (RM) systems encoded by the bacteria produced distinctive methylation profiles that were used to accurately bin and classify metagenomic sequences. Using this approach, we finished the largest and most complex circularized bacterial genome ever recovered from a metagenome (7.9 Mb with >600 transposons), the finished genome of Thiohalocapsa sp. PB-PSB1 the dominant bacteria in the consortia. From genomes binned by methylation patterns, we identified instances of horizontal gene transfer between sulfur-cycling symbionts (Thiohalocapsa sp. PB-PSB1 and Desulfofustis sp. PB-SRB1), phage infection, and strain-level structural variation. We also linked the methylation patterns of each metagenome-assembled genome with encoded DNA methyltransferases and discovered new RM defense systems, including novel associations of RM systems with RNase toxins.
|
34
|
Wang Q, Purrafee Dizaj L, Huang J, Kumar Sarker K, Kevrekidis C, Reichenbacher B, Reza Esmaeili H, Straube N, Moritz T, Li C. Molecular phylogenetics of the Clupeiformes based on exon-capture data and a new classification of the order. Mol Phylogenet Evol 2022; 175:107590. [PMID: 35850406 DOI: 10.1016/j.ympev.2022.107590] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/24/2022] [Revised: 06/21/2022] [Accepted: 07/12/2022] [Indexed: 10/17/2022]
Abstract
The Clupeiformes, including among others herrings, anchovies, shads and menhadens, are ecologically and commercially important, yet their phylogenetic relationships are still controversial. Previous classifications of Clupeiformes were based on morphological characters or lack of synapomorphic characters. More recent studies based on molecular data as well as new morphological evidence keep challenging their phylogenetic relations, and there is still no consensus on many interrelationships within the Clupeiformes. In this study, we collected nuclear sequence data from 4,434 single-copy protein coding loci using a gene-capture method. We obtained a robust phylogeny based on 1,165 filtered loci with less than 30% missing data. Our major findings include: 1) reconfirmation of monophyly of the Clupeiformes, that is, Denticipitidae is sister to all other clupeiforms; 2) the polyphyletic nature of dussumieriids and early branching of Spratelloididae from all other clupeoids were confirmed using datasets curated for less missing data and more balanced base composition in the respective taxa. The next branching clade is the monophyletic Engraulidae. Pristigasteridae also is monophyletic, but it was nested in the previously defined "Clupeidae". Within Pristigasteridae there is no support for monophyletic Pelloninae. Chirocentrus is close to Dussumieria and not to engraulids. The miniaturized Sundasalanx is placed close to the ehiravine Clupeonella, however, with a relatively deep split. The genus Clupea is not part of the diverse "Clupeidae", but part of a clade containing additionally Sprattus and Etrumeus. Within the crown group clades, Alosidae and Dorosomatidae are retrieved as sister clades. Based on new fossil calibration points, we found that major lineages of the clupeiforms diverged in the late Cretaceous and early Paleogene.
The extinction event at the end of the Cretaceous may have created ecological niches, which could have fueled the diversification of clupeiform fishes. Based on the strong evidence of the present study, we propose an updated classification of Clupeiformes consisting of ten families: Denticipitidae; Spratelloididae; Engraulidae (Engraulinae + Coiliinae); Clupeidae; Chirocentridae; Dussumieriidae; Pristigasteridae; Ehiravidae; Alosidae, Dorosomatidae.
Affiliation(s)
- Qian Wang
  - East China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Shanghai 200090, China; Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai 201306, China.
- Leyli Purrafee Dizaj
  - Ichthyology and Molecular Systematics Research Laboratory, Zoology Section, Department of Biology, School of Science, Shiraz University, Shiraz, Iran.
- Junman Huang
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai 201306, China; Engineering Research Center of Environmental DNA and Ecological Water Health Assessment, Shanghai Ocean University, Shanghai 201306, China.
- Kishor Kumar Sarker
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai 201306, China; Engineering Research Center of Environmental DNA and Ecological Water Health Assessment, Shanghai Ocean University, Shanghai 201306, China.
- Charalampos Kevrekidis
  - Ludwig-Maximilians-Universität München, Department für Geo- und Umweltwissenschaften, Paläontologie & Geobiologie, Richard-Wagner-Str. 10, 80333 Munich, Germany.
- Bettina Reichenbacher
  - Ludwig-Maximilians-Universität München, Department für Geo- und Umweltwissenschaften, Paläontologie & Geobiologie, Richard-Wagner-Str. 10, 80333 Munich, Germany; GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany.
- Hamid Reza Esmaeili
  - Ichthyology and Molecular Systematics Research Laboratory, Zoology Section, Department of Biology, School of Science, Shiraz University, Shiraz, Iran.
- Nicolas Straube
  - University Museum, Department of Natural History, University of Bergen, Norway.
- Timo Moritz
  - Deutsches Meeresmuseum, Katharinenberg 14-20, 18439 Stralsund, Germany; Institute of Biological Sciences, University of Rostock, Albert-Einstein-Straße 3, 18059 Rostock, Germany.
- Chenhong Li
  - Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution, Shanghai Ocean University, Shanghai 201306, China; Engineering Research Center of Environmental DNA and Ecological Water Health Assessment, Shanghai Ocean University, Shanghai 201306, China.
|
35
|
Kallenborn F, Cascitti J, Schmidt B. CARE 2.0: reducing false-positive sequencing error corrections using machine learning. BMC Bioinformatics 2022; 23:227. [PMID: 35698033 PMCID: PMC9195321 DOI: 10.1186/s12859-022-04754-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/03/2022] [Accepted: 05/30/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Next-generation sequencing pipelines often perform error correction as a preprocessing step to obtain cleaned input data. State-of-the-art error correction programs are able to reliably detect and correct the majority of sequencing errors. However, they also introduce new errors by making false-positive corrections. These correction mistakes can have negative impact on downstream analysis, such as k-mer statistics, de-novo assembly, and variant calling. This motivates the need for more precise error correction tools. RESULTS We present CARE 2.0, a context-aware read error correction tool based on multiple sequence alignment targeting Illumina datasets. In addition to a number of newly introduced optimizations its most significant change is the replacement of CARE 1.0's hand-crafted correction conditions with a novel classifier based on random decision forests trained on Illumina data. This results in up to two orders-of-magnitude fewer false-positive corrections compared to other state-of-the-art error correction software. At the same time, CARE 2.0 is able to achieve high numbers of true-positive corrections comparable to its competitors. On a simulated full human dataset with 914M reads CARE 2.0 generates only 1.2M false positives (FPs) (and 801.4M true positives (TPs)) at a highly competitive runtime while the best corrections achieved by other state-of-the-art tools contain at least 3.9M FPs and at most 814.5M TPs. Better de-novo assembly and improved k-mer analysis show the applicability of CARE 2.0 to real-world data. CONCLUSION False-positive corrections can negatively influence down-stream analysis. The precision of CARE 2.0 greatly reduces the number of those corrections compared to other state-of-the-art programs including BFC, Karect, Musket, Bcool, SGA, and Lighter. 
Thus, higher-quality datasets are produced which improve k-mer analysis and de-novo assembly in real-world datasets which demonstrates the applicability of machine learning techniques in the context of sequencing read error correction. CARE 2.0 is written in C++/CUDA for Linux systems and can be run on the CPU as well as on CUDA-enabled GPUs. It is available at https://github.com/fkallen/CARE .
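To make the trade-off between true-positive and false-positive corrections concrete, here is a toy sketch of alignment-based read correction. It is not CARE's code: the function name `correct_read`, the assumption that all reads are gap-free and already aligned to the anchor, and the fixed `min_support` threshold (a crude stand-in for a hand-crafted condition or a trained classifier) are all illustrative assumptions.

```python
from collections import Counter


def correct_read(anchor, aligned_reads, min_support=0.75):
    """Correct `anchor` by column-wise consensus over pre-aligned reads.

    Assumes every read has the same length and start as the anchor,
    which real correction tools do not require.
    """
    corrected = []
    for pos, base in enumerate(anchor):
        column = [base] + [read[pos] for read in aligned_reads]
        consensus, count = Counter(column).most_common(1)[0]
        # Overwrite only when the consensus is strongly supported; a weak
        # majority is left alone to avoid a false-positive correction.
        if consensus != base and count / len(column) >= min_support:
            corrected.append(consensus)
        else:
            corrected.append(base)
    return "".join(corrected)


# Made-up data: the anchor has one likely error at position 4 (T instead of A).
anchor = "ACGTTCGA"
neighbours = ["ACGTACGA", "ACGTACGA", "ACGTACGA", "ACCTACGA"]
fixed = correct_read(anchor, neighbours)
strict = correct_read(anchor, neighbours, min_support=0.9)
print(fixed, strict)
```

Raising `min_support` makes the sketch more conservative: fewer false positives, but also fewer true corrections. Replacing a fixed rule like this with a learned decision is the kind of change the abstract describes.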
Affiliation(s)
- Felix Kallenborn
  - Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany.
- Julian Cascitti
  - Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany
- Bertil Schmidt
  - Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany
|
36
|
Lee SD, Wu M, Lo KW, Yip KY. Accurate reconstruction of viral genomes in human cells from short reads using iterative refinement. BMC Genomics 2022; 23:422. [PMID: 35668367 PMCID: PMC9169298 DOI: 10.1186/s12864-022-08649-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 05/24/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND After an infection, human cells may contain viral genomes in the form of episomes or integrated DNA. Comparing the genomic sequences of different strains of a virus in human cells can often provide useful insights into its behaviour, activity and pathology, and may help develop methods for disease prevention and treatment. To support such comparative analyses, the viral genomes need to be accurately reconstructed from a large number of samples. Previous efforts either rely on customized experimental protocols or require high similarity between the sequenced genomes and a reference, both of which limit the general applicability of these approaches. In this study, we propose a pipeline, named ASPIRE, for reconstructing viral genomes accurately from short reads data of human samples, which are increasingly available from genome projects and personal genomics. ASPIRE contains a basic part that involves de novo assembly, tiling and gap filling, and additional components for iterative refinement, sequence corrections and wrapping. RESULTS Evaluated by the alignment quality of sequencing reads to the reconstructed genomes, these additional components improve the assembly quality in general, and in some particular samples quite substantially, especially when the sequenced genome is significantly different from the reference. We use ASPIRE to reconstruct the genomes of Epstein Barr Virus (EBV) from the whole-genome sequencing data of 61 nasopharyngeal carcinoma (NPC) samples and provide these sequences as a resource for EBV research. CONCLUSIONS ASPIRE improves the quality of the reconstructed EBV genomes in published studies and outperforms TRACESPipe in some samples considered.
Affiliation(s)
- Sau-Dan Lee
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
- Man Wu
  - Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, Hong Kong
- Kwok-Wai Lo
  - Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, Hong Kong
- Kevin Y. Yip
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
  - Current address: Sanford Burnham Prebys Medical Discovery Institute, La Jolla, 92037 CA USA
|
37
|
Du Y, Sun F. HiFine: integrating Hi-C-based and shotgun-based methods to reFine binning of metagenomic contigs. Bioinformatics 2022; 38:2973-2979. [PMID: 35482530 PMCID: PMC9154269 DOI: 10.1093/bioinformatics/btac295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 03/28/2022] [Accepted: 04/21/2022] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Metagenomic binning aims to retrieve microbial genomes directly from ecosystems by clustering metagenomic contigs assembled from short reads into draft genomic bins. Traditional shotgun-based binning methods depend on the contigs' composition and abundance profiles and are impaired by the paucity of enough samples to construct reliable co-abundance profiles. When applied to a single sample, shotgun-based binning methods struggle to distinguish closely related species only using composition information. As an alternative binning approach, Hi-C-based binning employs metagenomic Hi-C technique to measure the proximity contacts between metagenomic fragments. However, spurious inter-species Hi-C contacts inevitably generated by incorrect ligations of DNA fragments between species link the contigs from varying genomes, weakening the purity of final draft genomic bins. Therefore, it is imperative to develop a binning pipeline to overcome the shortcomings of both types of binning methods on a single sample. RESULTS We develop HiFine, a novel binning pipeline to refine the binning results of metagenomic contigs by integrating both Hi-C-based and shotgun-based binning tools. HiFine designs a strategy of fragmentation for the original bin sets derived from the Hi-C-based and shotgun-based binning methods, which considerably increases the purity of initial bins, followed by merging fragmented bins and recruiting unbinned contigs. We demonstrate that HiFine significantly improves the existing binning results of both types of binning methods and achieves better performance in constructing species genomes on publicly available datasets. To the best of our knowledge, HiFine is the first pipeline to integrate different types of tools for the binning of metagenomic contigs. AVAILABILITY HiFine is available at https://github.com/dyxstat/HiFine. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuxuan Du
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Fengzhu Sun
  - Department of Quantitative and Computational Biology, University of Southern California, USA
|
38
|
Qi W, Lim YW, Patrignani A, Schläpfer P, Bratus-Neuenschwander A, Grüter S, Chanez C, Rodde N, Prat E, Vautrin S, Fustier MA, Pratas D, Schlapbach R, Gruissem W. The haplotype-resolved chromosome pairs of a heterozygous diploid African cassava cultivar reveal novel pan-genome and allele-specific transcriptome features. Gigascience 2022; 11:giac028. [PMID: 35333302 PMCID: PMC8952263 DOI: 10.1093/gigascience/giac028] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 10/16/2021] [Revised: 01/11/2022] [Accepted: 02/22/2022] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and subtropical regions worldwide. Genetic gain by molecular breeding has been limited, partially because cassava is a highly heterozygous crop with a repetitive and difficult-to-assemble genome. FINDINGS Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler hifiasm, produced genome assemblies at near complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present 2 chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. With consensus accuracy >QV46, contig N50 >18 Mb, BUSCO completeness of 99%, and 35k phased gene loci, it is the most accurate, continuous, complete, and haplotype-resolved cassava genome assembly so far. Ab initio gene prediction with RNA-seq data and Iso-Seq transcripts identified abundant novel gene loci, with enriched functionality related to chromatin organization, meristem development, and cell responses. During tissue development, differentially expressed transcripts of different haplotype origins were enriched for different functionality. In each tissue, 20-30% of transcripts showed allele-specific expression (ASE) differences. ASE bias was often tissue specific and inconsistent across different tissues. Direction-shifting was observed in <2% of the ASE transcripts. Despite high gene synteny, the HiFi genome assembly revealed extensive chromosome rearrangements and abundant intra-genomic and inter-genomic divergent sequences, with large structural variations mostly related to LTR retrotransposons. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding. 
CONCLUSIONS The phased and annotated chromosome pairs allow a systematic view of the heterozygous diploid genome organization in cassava with improved accuracy, completeness, and haplotype resolution. They will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy, and continuity.
Affiliation(s)
- Weihong Qi
  - Functional Genomics Center Zurich, ETH Zurich and University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
  - Department of Biology, Institute of Molecular Plant Biology, ETH Zurich, Universitätstrasse 2, 8092, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1202, Geneva, Switzerland
- Yi-Wen Lim
  - Department of Biology, Institute of Molecular Plant Biology, ETH Zurich, Universitätstrasse 2, 8092, Zurich, Switzerland
- Andrea Patrignani
  - Functional Genomics Center Zurich, ETH Zurich and University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
- Pascal Schläpfer
  - Department of Biology, Institute of Molecular Plant Biology, ETH Zurich, Universitätstrasse 2, 8092, Zurich, Switzerland
- Anna Bratus-Neuenschwander
  - Functional Genomics Center Zurich, ETH Zurich and University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
- Simon Grüter
  - Functional Genomics Center Zurich, ETH Zurich and University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
- Christelle Chanez
  - Department of Biology, Institute of Molecular Plant Biology, ETH Zurich, Universitätstrasse 2, 8092, Zurich, Switzerland
- Nathalie Rodde
  - INRAE, CNRGV French Plant Genomic Resource Center, F-31320, Castanet Tolosan, France
- Elisa Prat
  - INRAE, CNRGV French Plant Genomic Resource Center, F-31320, Castanet Tolosan, France
- Sonia Vautrin
  - INRAE, CNRGV French Plant Genomic Resource Center, F-31320, Castanet Tolosan, France
- Diogo Pratas
  - Department of Electronics, Telecommunications and Informatics and Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
  - Department of Virology, University of Helsinki, Haartmaninkatu 3, 00014 Helsinki, Finland
- Ralph Schlapbach
  - Functional Genomics Center Zurich, ETH Zurich and University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
- Wilhelm Gruissem
  - Department of Biology, Institute of Molecular Plant Biology, ETH Zurich, Universitätstrasse 2, 8092, Zurich, Switzerland
  - Biotechnology Center, National Chung Hsing University, 145 Xingda Road, Taichung 40227, Taiwan
|
39
|
Rajczewski AT, Jagtap PD, Griffin TJ. An overview of technologies for MS-based proteomics-centric multi-omics. Expert Rev Proteomics 2022; 19:165-181. [PMID: 35466851 PMCID: PMC9613604 DOI: 10.1080/14789450.2022.2070476] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/07/2023]
Abstract
INTRODUCTION Mass spectrometry-based proteomics reveals dynamic molecular signatures underlying phenotypes reflecting normal and perturbed conditions in living systems. Although valuable on its own, the proteome is only one level of molecular information, with the genome, epigenome, transcriptome, and metabolome all providing complementary information. Multi-omic analysis integrating information from one or more of these other domains with proteomic information provides a more complete picture of molecular contributors to dynamic biological systems. AREAS COVERED Here, we discuss the improvements to mass spectrometry-based technologies, focused on peptide-based, bottom-up approaches that have enabled deep, quantitative characterization of complex proteomes. These advances are facilitating the integration of proteomics data with other 'omic information, providing a more complete picture of living systems. We also describe the current state of bioinformatics software and approaches for integrating proteomics and other 'omics data, critical for enabling new discoveries driven by multi-omics. EXPERT COMMENTARY Multi-omics, centered on the integration of proteomics information with other 'omic information, has tremendous promise for biological and biomedical studies. Continued advances in approaches for generating deep, reliable proteomic data and bioinformatics tools aimed at integrating data across 'omic domains will ensure the discoveries offered by these multi-omic studies continue to increase.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
| | - Pratik D. Jagtap
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
| | - Timothy J. Griffin
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
|
40
|
Grealey J, Lannelongue L, Saw WY, Marten J, Méric G, Ruiz-Carmona S, Inouye M. The Carbon Footprint of Bioinformatics. Mol Biol Evol 2022; 39:6526403. [PMID: 35143670 PMCID: PMC8892942 DOI: 10.1093/molbev/msac034] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 11/14/2022] Open
Abstract
Bioinformatic research relies on large-scale computational infrastructures which have a nonzero carbon footprint but so far, no study has quantified the environmental costs of bioinformatic tools and commonly run analyses. In this work, we estimate the carbon footprint of bioinformatics (in kilograms of CO2 equivalent units, kgCO2e) using the freely available Green Algorithms calculator (www.green-algorithms.org, last accessed 2022). We assessed 1) bioinformatic approaches in genome-wide association studies (GWAS), RNA sequencing, genome assembly, metagenomics, phylogenetics, and molecular simulations, as well as 2) computation strategies, such as parallelization, CPU (central processing unit) versus GPU (graphics processing unit), cloud versus local computing infrastructure, and geography. In particular, we found that biobank-scale GWAS emitted substantial kgCO2e and simple software upgrades could make it greener, for example, upgrading from BOLT-LMM v1 to v2.3 reduced carbon footprint by 73%. Moreover, switching from the average data center to a more efficient one can reduce carbon footprint by approximately 34%. Memory over-allocation can also be a substantial contributor to an algorithm’s greenhouse gas emissions. The use of faster processors or greater parallelization reduces running time but can lead to greater carbon footprint. Finally, we provide guidance on how researchers can reduce power consumption and minimize kgCO2e. Overall, this work elucidates the carbon footprint of common analyses in bioinformatics and provides solutions which empower a move toward greener research.
Affiliation(s)
- Jason Grealey
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Department of Mathematics and Statistics, La Trobe University, Melbourne, Australia
| | - Loïc Lannelongue
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Woei-Yuh Saw
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Jonathan Marten
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Guillaume Méric
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Australia
| | - Sergio Ruiz-Carmona
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK; British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK; The Alan Turing Institute, London, UK
|
41
|
Tang X, Shang J, Sun Y. RdRp-based sensitive taxonomic classification of RNA viruses for metagenomic data. Brief Bioinform 2022; 23:6523411. [PMID: 35136930 PMCID: PMC8921650 DOI: 10.1093/bib/bbac011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Revised: 12/24/2021] [Accepted: 01/10/2022] [Indexed: 11/30/2022] Open
Abstract
With advances in library construction protocols and next-generation sequencing technologies, viral metagenomic sequencing has become the major source for novel virus discovery. Conducting taxonomic classification for metagenomic data is an important means to characterize the viral composition in the underlying samples. However, RNA viruses are abundant and highly diverse, jeopardizing the sensitivity of comparison-based classification methods. To improve the sensitivity of read-level taxonomic classification, we developed an RNA-dependent RNA polymerase (RdRp) gene-based read classification tool RdRpBin. It combines alignment-based strategy with machine learning models in order to fully exploit the sequence properties of RdRp. We tested our method and compared its performance with the state-of-the-art tools on the simulated and real sequencing data. RdRpBin competes favorably with all. In particular, when the query RNA viruses share low sequence similarity with the known viruses (~0.4), our tool can still maintain a higher F-score than the state-of-the-art tools. The experimental results on real data also showed that RdRpBin can classify more RNA viral reads with a relatively low false-positive rate. Thus, RdRpBin can be utilized to classify novel and diverged RNA viruses.
Affiliation(s)
- Xubo Tang
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China SAR
| | - Jiayu Shang
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China SAR
| | - Yanni Sun
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China SAR
|
42
|
Genome assembly and annotation. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00013-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
|
43
|
Roscito JG, Sameith K, Kirilenko BM, Hecker N, Winkler S, Dahl A, Rodrigues MT, Hiller M. Convergent and lineage-specific genomic differences in limb regulatory elements in limbless reptile lineages. Cell Rep 2022; 38:110280. [DOI: 10.1016/j.celrep.2021.110280] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/01/2021] [Revised: 11/24/2021] [Accepted: 12/27/2021] [Indexed: 01/02/2023] Open
|
44
|
Smith AR, Mueller R, Fisk MR, Colwell FS. Ancient Metabolisms of a Thermophilic Subseafloor Bacterium. Front Microbiol 2021; 12:764631. [PMID: 34925271 PMCID: PMC8671834 DOI: 10.3389/fmicb.2021.764631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2021] [Accepted: 10/22/2021] [Indexed: 11/13/2022] Open
Abstract
The ancient origins of metabolism may be rooted deep in oceanic crust, and these early metabolisms may have persisted in the habitable thermal anoxic aquifer where conditions remain similar to those when they first appeared. The Wood–Ljungdahl pathway for acetogenesis is a key early biosynthetic pathway with the potential to influence ocean chemistry and productivity, but its contemporary role in oceanic crust is not well established. Here, we describe the genome of a novel acetogen from a thermal suboceanic aquifer olivine biofilm in the basaltic crust of the Juan de Fuca Ridge (JdFR) whose genome suggests it may utilize an ancient chemosynthetic lifestyle. This organism encodes the genes for the complete canonical Wood–Ljungdahl pathway, but is potentially unable to use sulfate and certain organic carbon sources such as lipids and carbohydrates to supplement its energy requirements, unlike other known acetogens. Instead, this organism may use peptides and amino acids for energy or as organic carbon sources. Additionally, genes involved in surface adhesion, the import of metallic cations found in Fe-bearing minerals, and use of molecular hydrogen, a product of serpentinization reactions between water and olivine, are prevalent within the genome. These adaptations are likely a reflection of local environmental micro-niches, where cells are adapted to life in biofilms using ancient chemosynthetic metabolisms dependent on H2 and iron minerals. Since this organism is phylogenetically distinct from a related acetogenic group of Clostridiales, we propose it as a new species, Candidatus Acetocimmeria pyornia.
Affiliation(s)
- Amy R Smith
- Department of Science, Mathematics, and Computing, Bard College at Simon's Rock, Great Barrington, MA, United States; Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, MA, United States; College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, OR, United States
| | - Ryan Mueller
- College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, OR, United States
| | - Martin R Fisk
- College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, OR, United States
| | - Frederick S Colwell
- College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, OR, United States
|
45
|
Prunier J, Carrier A, Gilbert I, Poisson W, Albert V, Taillon J, Bourret V, Côté SD, Droit A, Robert C. CNVs with adaptive potential in Rangifer tarandus: genome architecture and new annotated assembly. Life Sci Alliance 2021; 5:5/3/e202101207. [PMID: 34911809 PMCID: PMC8711850 DOI: 10.26508/lsa.202101207] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/23/2021] [Revised: 11/29/2021] [Accepted: 11/29/2021] [Indexed: 01/13/2023] Open
Abstract
Rangifer tarandus has experienced recent drastic population size reductions throughout its circumpolar distribution, and preserving the species implies genetic diversity conservation. To facilitate genomic studies of the species' populations, we improved the genome assembly by combining long reads and linked reads and obtained a new highly accurate and contiguous genome assembly made of 13,994 scaffolds (L90 = 131 scaffolds). Using de novo transcriptome assembly of RNA-sequencing reads and similarity with annotated human gene sequences, 17,394 robust gene models were identified. As copy number variations (CNVs) likely play a role in adaptation, we additionally investigated these variations among 20 genomes representing three caribou ecotypes (migratory, boreal and mountain). A total of 1,698 large CNVs (length > 1 kb) showing a genome distribution including hotspots were identified. Forty-three large CNVs were particularly distinctive of the migratory and sedentary ecotypes and included genes annotated for functions likely related to the expected adaptations. This work includes the first publicly available annotation of the caribou genome and the first assembly allowing genome architecture analyses, including the likely adaptive CNVs reported here.
Affiliation(s)
- Julien Prunier
- Département de Médecine Moléculaire, Faculté de Médecine, Université Laval, Quebec City, Canada
| | - Alexandra Carrier
- Département des sciences animales, Faculté des Sciences de l'Agriculture et de l'Alimentation, Université Laval, Quebec City, Canada
| | - Isabelle Gilbert
- Département des sciences animales, Faculté des Sciences de l'Agriculture et de l'Alimentation, Université Laval, Quebec City, Canada
| | - William Poisson
- Département des sciences animales, Faculté des Sciences de l'Agriculture et de l'Alimentation, Université Laval, Quebec City, Canada
| | - Vicky Albert
- Ministère des Forêts, de la Faune et des Parcs du Québec, Quebec City, Canada
| | - Joëlle Taillon
- Ministère des Forêts, de la Faune et des Parcs du Québec, Quebec City, Canada
| | - Vincent Bourret
- Ministère des Forêts, de la Faune et des Parcs du Québec, Quebec City, Canada
| | - Steeve D Côté
- Caribou Ungava, département de biologie, Faculté des Sciences et de Génie, Université Laval, Quebec City, Canada
| | - Arnaud Droit
- Département de Médecine Moléculaire, Faculté de Médecine, Université Laval, Quebec City, Canada
| | - Claude Robert
- Département des sciences animales, Faculté des Sciences de l'Agriculture et de l'Alimentation, Université Laval, Quebec City, Canada
|
46
|
Nürnberger B, Baird SJE, Čížková D, Bryjová A, Mudd AB, Blaxter ML, Szymura JM. A dense linkage map for a large repetitive genome: discovery of the sex-determining region in hybridizing fire-bellied toads (Bombina bombina and Bombina variegata). G3 (Bethesda, Md.) 2021; 11:6353606. [PMID: 34849761 PMCID: PMC8664441 DOI: 10.1093/g3journal/jkab286] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/16/2021] [Accepted: 08/16/2021] [Indexed: 12/20/2022]
Abstract
Genomic analysis of hybrid zones offers unique insights into emerging reproductive isolation and the dynamics of introgression. Because hybrid genomes consist of blocks inherited from one or the other parental taxon, linkage information is essential. In most cases, the spectrum of local ancestry tracts can be efficiently uncovered from dense linkage maps. Here, we report the development of such a map for the hybridizing toads, Bombina bombina and Bombina variegata (Anura: Bombinatoridae). Faced with the challenge of a large (7–10 Gb), repetitive genome, we set out to identify a large number of Mendelian markers in the nonrepetitive portion of the genome that report B. bombina vs B. variegata ancestry with appropriately quantified statistical support. Bait sequences for targeted enrichment were selected from a draft genome assembly, after filtering highly repetitive sequences. We developed a novel approach to infer the most likely diplotype per sample and locus from the raw read mapping data, which is robust to over-merging and obviates arbitrary filtering thresholds. Validation of the resulting map with 4755 markers underscored the large-scale synteny between Bombina and Xenopus tropicalis. By assessing the sex of late-stage F2 tadpoles from histological sections, we identified the sex-determining region in the Bombina genome to 7 cM on LG5, which is homologous to X. tropicalis chromosome 5, and inferred male heterogamety. Interestingly, chromosome 5 has been repeatedly recruited as a sex chromosome in anurans with XY sex determination.
Affiliation(s)
- Beate Nürnberger
- Research Facility Studenec, Institute of Vertebrate Biology, Czech Academy of Sciences, 603 65 Brno, Czech Republic
| | - Stuart J E Baird
- Research Facility Studenec, Institute of Vertebrate Biology, Czech Academy of Sciences, 603 65 Brno, Czech Republic
| | - Dagmar Čížková
- Research Facility Studenec, Institute of Vertebrate Biology, Czech Academy of Sciences, 603 65 Brno, Czech Republic
| | - Anna Bryjová
- Research Facility Studenec, Institute of Vertebrate Biology, Czech Academy of Sciences, 603 65 Brno, Czech Republic
| | - Austin B Mudd
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, 94720 CA, USA
| | - Mark L Blaxter
- Tree of Life Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
| | - Jacek M Szymura
- Department of Comparative Anatomy, Jagiellonian University, 30-387 Kraków, Poland
|
47
|
Rahman A, Pachter L. SWALO: scaffolding with assembly likelihood optimization. Nucleic Acids Res 2021; 49:e117. [PMID: 34417615 PMCID: PMC8599790 DOI: 10.1093/nar/gkab717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/22/2020] [Revised: 06/16/2021] [Accepted: 08/16/2021] [Indexed: 01/01/2023] Open
Abstract
Scaffolding, i.e. ordering and orienting contigs, is an important step in genome assembly. We present a method for scaffolding using second-generation sequencing reads based on likelihoods of genome assemblies. A generative model for sequencing is used to obtain maximum likelihood estimates of gaps between contigs and to estimate whether linking contigs into scaffolds would lead to an increase in the likelihood of the assembly. We then link contigs if they can be unambiguously joined or if the corresponding increase in likelihood is substantially greater than that of other possible joins of those contigs. The method is implemented in a tool called Swalo with approximations to make it efficient and applicable to large datasets. Analysis on real and simulated datasets reveals that it consistently makes at least as many correct joins as other scaffolders while linking very few contigs incorrectly, thus outperforming other scaffolders and demonstrating that substantial improvement in genome assembly may be achieved through the use of statistical models. Swalo is freely available for download at https://atifrahman.github.io/SWALO/.
Affiliation(s)
- Atif Rahman
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA; Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh
| | - Lior Pachter
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA; Departments of Mathematics and Molecular & Cell Biology, University of California, Berkeley, CA 94720, USA; Departments of Biology and Computing & Mathematical Sciences, California Institute of Technology, Pasadena, CA 91103, USA
|
48
|
Neubert K, Zuchantke E, Leidenfrost RM, Wünschiers R, Grützke J, Malorny B, Brendebach H, Al Dahouk S, Homeier T, Hotzel H, Reinert K, Tomaso H, Busch A. Testing assembly strategies of Francisella tularensis genomes to infer an evolutionary conservation analysis of genomic structures. BMC Genomics 2021; 22:822. [PMID: 34773979 PMCID: PMC8590783 DOI: 10.1186/s12864-021-08115-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/04/2021] [Accepted: 10/12/2021] [Indexed: 02/08/2023] Open
Abstract
Background: We benchmarked sequencing technology and assembly strategies for short-read, long-read, and hybrid assemblers with respect to correctness, contiguity, and completeness of assemblies in genomes of Francisella tularensis. Benchmarking allowed in-depth analyses of genomic structures of the Francisella pathogenicity islands and insertion sequences. Five major high-throughput sequencing technologies were applied, including next-generation “short-read” and third-generation “long-read” sequencing methods. Results: We focused on short-read assemblers, hybrid assemblers, and analysis of the genomic structure with particular emphasis on insertion sequences and the Francisella pathogenicity island. Of eight short-read assembly methods, the A5-miseq pipeline performed best for MiSeq data, Mira for Ion Torrent data, and ABySS for HiSeq data. Two approaches were applied to benchmark long-read and hybrid assembly strategies: long-read-first assembly followed by correction with short reads (Canu/Pilon, Flye/Pilon) and short-read-first assembly along with scaffolding based on long reads (Unicycler, SPAdes). Hybrid assembly can resolve large repetitive regions best with a “long-read first” approach. Conclusions: Genomic structures of the Francisella pathogenicity islands frequently showed misassembly. Insertion sequences (IS) could be used to perform an evolutionary conservation analysis. A phylogenetic structure of insertion sequences and the evolution within the clades elucidated the clade structure of the highly conservative F. tularensis. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08115-x.
Affiliation(s)
- Kerstin Neubert
- Department of Mathematics and Computer Science, Algorithmic Bioinformatics, Freie Universität Berlin, Institute of Computer Science, Takustr. 9, 14195, Berlin, Germany; German Federal Institute for Risk Assessment, Diedersdorfer Weg 1, 12277, Berlin, Germany
| | - Eric Zuchantke
- Friedrich-Loeffler-Institut, Institute of Bacterial Infections and Zoonoses, Naumburger Str. 96a, 07749, Jena, Germany
| | - Robert Maximilian Leidenfrost
- Department of Biotechnology and Chemistry, Mittweida University of Applied Sciences, Technikumplatz 17a, 09648, Mittweida, Germany
| | - Röbbe Wünschiers
- Department of Biotechnology and Chemistry, Mittweida University of Applied Sciences, Technikumplatz 17a, 09648, Mittweida, Germany
| | - Josephine Grützke
- German Federal Institute for Risk Assessment, Diedersdorfer Weg 1, 12277, Berlin, Germany
| | - Burkhard Malorny
- German Federal Institute for Risk Assessment, Diedersdorfer Weg 1, 12277, Berlin, Germany
| | - Holger Brendebach
- German Federal Institute for Risk Assessment, Diedersdorfer Weg 1, 12277, Berlin, Germany
| | - Sascha Al Dahouk
- German Federal Institute for Risk Assessment, Diedersdorfer Weg 1, 12277, Berlin, Germany
| | - Timo Homeier
- Friedrich-Loeffler-Institut, Institute of Epidemiology, Südufer 10, 17493, Greifswald, Insel Riems, Germany
| | - Helmut Hotzel
- Friedrich-Loeffler-Institut, Institute of Bacterial Infections and Zoonoses, Naumburger Str. 96a, 07749, Jena, Germany
| | - Knut Reinert
- Department of Mathematics and Computer Science, Algorithmic Bioinformatics, Freie Universität Berlin, Institute of Computer Science, Takustr. 9, 14195, Berlin, Germany
| | - Herbert Tomaso
- Friedrich-Loeffler-Institut, Institute of Bacterial Infections and Zoonoses, Naumburger Str. 96a, 07749, Jena, Germany
| | - Anne Busch
- Friedrich-Loeffler-Institut, Institute of Bacterial Infections and Zoonoses, Naumburger Str. 96a, 07749, Jena, Germany; Department of Anaesthesiology and Intensive Care Medicine, University Hospital Jena, Jena, Germany
|
49
|
Music of metagenomics-a review of its applications, analysis pipeline, and associated tools. Funct Integr Genomics 2021; 22:3-26. [PMID: 34657989 DOI: 10.1007/s10142-021-00810-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2021] [Revised: 09/25/2021] [Accepted: 10/03/2021] [Indexed: 10/20/2022]
Abstract
This humble effort highlights the intricate details of metagenomics in a simple, poetic, and rhythmic way. The paper enforces the significance of the research area, provides details about major analytical methods, examines the taxonomy and assembly of genomes, emphasizes some tools, and concludes by celebrating the richness of the ecosystem populated by the "metagenome."
|
50
|
Chakraborty A, Mahajan S, Jaiswal SK, Sharma VK. Genome sequencing of turmeric provides evolutionary insights into its medicinal properties. Commun Biol 2021; 4:1193. [PMID: 34654884 PMCID: PMC8521574 DOI: 10.1038/s42003-021-02720-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/29/2020] [Accepted: 08/13/2021] [Indexed: 12/28/2022] Open
Abstract
Curcuma longa, or turmeric, is traditionally known for its immense medicinal properties and has diverse therapeutic applications. However, the absence of a reference genome sequence is a limiting factor in understanding the genomic basis of the origin of its medicinal properties. In this study, we present the draft genome sequence of C. longa, belonging to the Zingiberaceae plant family, constructed using 10x Genomics linked reads and Oxford Nanopore long reads. For comprehensive gene set prediction and for insights into its gene expression, transcriptome sequencing of leaf tissue was also performed. The draft genome assembly had a size of 1.02 Gbp with ~70% repetitive sequences, and contained 50,401 coding gene sequences. The phylogenetic position of C. longa was resolved through a comprehensive genome-wide analysis including 16 other plant species. Using 5,388 orthogroups, the comparative evolutionary analysis performed across 17 species including C. longa revealed evolution in genes associated with secondary metabolism, phytohormone signaling, and various biotic and abiotic stress tolerance responses. These mechanisms are crucial for perennial and rhizomatous plants such as C. longa for defense and environmental stress tolerance via production of secondary metabolites, which are associated with the wide range of medicinal properties in C. longa.
Affiliation(s)
- Abhisek Chakraborty
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
| | - Shruti Mahajan
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
| | - Shubham K Jaiswal
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India
| | - Vineet K Sharma
- MetaBioSys Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Bhopal, India.
|