1
Islam S, Peart C, Kehlmaier C, Sun YH, Lei F, Dahl A, Klemroth S, Alexopoulou D, Del Mar Delgado M, Laiolo P, Carlos Illera J, Dirren S, Hille S, Lkhagvasuren D, Töpfer T, Kaiser M, Gebauer A, Martens J, Paetzold C, Päckert M. Museomics help resolving the phylogeny of snowfinches (Aves, Passeridae, Montifringilla and allies). Mol Phylogenet Evol 2024; 198:108135. [PMID: 38925425 DOI: 10.1016/j.ympev.2024.108135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 03/25/2024] [Accepted: 06/16/2024] [Indexed: 06/28/2024]
Abstract
Historical specimens from museum collections provide a valuable source of material, including from remote areas or regions of conflict that are not easily accessible to scientists today. With this study, we provide a taxon-complete phylogeny of snowfinches using historical DNA from whole skins of an endemic species from Afghanistan, the Afghan snowfinch, Pyrgilauda theresae. To resolve the strong conflict between previous phylogenetic hypotheses, we generated novel mitogenome sequences for selected taxa and genome-wide SNP data using ddRAD sequencing for all extant snowfinch species endemic to the Qinghai-Tibet Plateau (QTP) and for an extended intraspecific sampling of the sole Central and Western Palearctic snowfinch species (Montifringilla nivalis). Our phylogenetic reconstructions unanimously refuted the previously suggested paraphyly of genus Pyrgilauda. Misplacement of one species-level taxon (Onychostruthus taczanowskii) in previous snowfinch phylogenies was undoubtedly inferred from chimeric mitogenomes that included heterospecific sequence information. Furthermore, comparison of novel and previously generated sequence data showed that the presumed sister-group relationship between M. nivalis and the QTP endemic M. henrici was suggested based on flawed taxonomy. Our phylogenetic reconstructions based on genome-wide SNP data and on mitogenomes were largely congruent and supported reciprocal monophyly of genera Montifringilla and Pyrgilauda with monotypic Onychostruthus being sister to the latter. The Afghan endemic P. theresae likely originated from a rather ancient Pliocene out-of-Tibet dispersal, probably from a common ancestor with P. ruficollis. Our extended trans-Palearctic sampling for the white-winged snowfinch, M. nivalis, confirmed strong lineage divergence between an Asian and a European clade dated to 1.5-2.7 million years ago (mya). Genome-wide SNP data suggested subtle divergence among European samples from the Alps and from the Cantabrian mountains.
Affiliation(s)
- Safiqul Islam
  - Senckenberg Natural History Collections, Museum of Zoology, Königsbrücker Landstraße 159, 01109 Dresden, Germany; Max Planck-Genome-Centre Cologne, Max Planck Institute for Plant Breeding Research, Carl-von-Linne-Weg 10, 50829 Köln, Germany; Division of Systematic Zoology, Faculty of Biology, LMU Munich, Biocenter, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
- Claire Peart
  - Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Biocenter, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
- Christian Kehlmaier
  - Senckenberg Natural History Collections, Museum of Zoology, Königsbrücker Landstraße 159, 01109 Dresden, Germany
- Yue-Hua Sun
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Fumin Lei
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Andreas Dahl
  - Dresden-Concept Genome Center, c/o Center for Molecular and Cellular Bioengineering (CMCB), Technische Universität Dresden, Fetscherstraße 105, 01307 Dresden, Germany
- Sylvia Klemroth
  - Dresden-Concept Genome Center, c/o Center for Molecular and Cellular Bioengineering (CMCB), Technische Universität Dresden, Fetscherstraße 105, 01307 Dresden, Germany
- Dimitra Alexopoulou
  - Dresden-Concept Genome Center, c/o Center for Molecular and Cellular Bioengineering (CMCB), Technische Universität Dresden, Fetscherstraße 105, 01307 Dresden, Germany
- Maria Del Mar Delgado
  - Biodiversity Research Institute (IMIB, Universidad de Oviedo, CSIC, Principality of Asturias) - Campus de Mieres, Edificio de Investigación - 5ª planta, C. Gonzalo Gutiérrez Quirós s/n, 33600 Mieres, Spain
- Paola Laiolo
  - Biodiversity Research Institute (IMIB, Universidad de Oviedo, CSIC, Principality of Asturias) - Campus de Mieres, Edificio de Investigación - 5ª planta, C. Gonzalo Gutiérrez Quirós s/n, 33600 Mieres, Spain
- Juan Carlos Illera
  - Biodiversity Research Institute (IMIB, Universidad de Oviedo, CSIC, Principality of Asturias) - Campus de Mieres, Edificio de Investigación - 5ª planta, C. Gonzalo Gutiérrez Quirós s/n, 33600 Mieres, Spain
- Sabine Hille
  - University of Natural Resources and Life Sciences, Vienna, Gregor Mendel-Strasse 33, 1180 Vienna, Austria
- Davaa Lkhagvasuren
  - Department of Biology, School of Arts and Sciences, National University of Mongolia, P.O.Box 46A-546, Ulaanbaatar 210646, Mongolia
- Till Töpfer
  - Leibniz Institute for the Analysis of Biodiversity Change, Zoologisches Forschungsmuseum Alexander Koenig, Adenauerallee, Bonn, Germany
- Jochen Martens
  - Institute of Organismic and Molecular Evolution (iomE), Johannes Gutenberg University, 55099 Mainz, Germany
- Claudia Paetzold
  - Senckenberg Natural History Collections, Museum of Zoology, Königsbrücker Landstraße 159, 01109 Dresden, Germany
- Martin Päckert
  - Senckenberg Natural History Collections, Museum of Zoology, Königsbrücker Landstraße 159, 01109 Dresden, Germany.
2
Tavakoli N, Gibney D, Aluru S. GraphSlimmer: Preserving Read Mappability with the Minimum Number of Variants. J Comput Biol 2024; 31:616-637. [PMID: 38990757 DOI: 10.1089/cmb.2024.0601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024] Open
Abstract
Modern genomic datasets, like those generated under the 1000 Genomes Project, contain millions of variants belonging to known haplotypes. Although these datasets are more representative than a single reference sequence and can alleviate issues like reference bias, they are significantly more computationally burdensome to work with, often involving large indexed genome graph data structures for tasks such as read mapping. The construction, preprocessing, and mapping algorithms can require substantial computational resources depending on the size of these variant sets. Moreover, the accuracy of mapping algorithms has been shown to decrease when working with complete variant sets. Therefore, a drastically reduced set of variants that preserves important properties of the original set is desirable. This work provides a technique for finding a minimal subset of variants S such that for given parameters α and δ, all substrings up to length α in the haplotypes are guaranteed to be still alignable to the appropriate locations with either Hamming or edit distance at most δ, using only S. Our contributions include showing the NP-hardness and inapproximability of these optimization problems and providing Integer Linear Programming (ILP) formulations. Our edit distance ILP formulation carefully decomposes the problem according to variant locations, which allows it to scale to support all of chromosome 22's variants from the 1000 Genomes Project. Our experiments also demonstrate a significant reduction in the number of variants. For example, for moderately long reads, e.g., α = 1000, over 75% of the variants can be removed while preserving read mappability with edit distance at most one.
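The guarantee described in this abstract can be illustrated for its simplest case. The following is a minimal sketch, not the authors' ILP: it assumes a single haplotype, substitution-only variants (SNPs) and Hamming distance, so that every variant the haplotype carries but the reduced set S omits shows up as exactly one mismatch. The function name `preserves_mappability` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set


def preserves_mappability(carried: Iterable[int], kept: Set[int],
                          alpha: int, delta: int) -> bool:
    """Check the Hamming-distance condition for one haplotype (SNPs only).

    carried: positions of the variants this haplotype carries (assumed input)
    kept:    positions of the variants retained in the reduced set S
    alpha:   maximum substring (read) length to protect
    delta:   maximum number of mismatches allowed per substring
    """
    # Variants the haplotype carries but S omits; under the stated
    # assumptions each one becomes a single mismatch for any read covering it.
    dropped = sorted(p for p in set(carried) if p not in kept)
    # If delta + 1 dropped variants fit inside a window of alpha positions,
    # some substring of length at most alpha has more than delta mismatches.
    for i in range(len(dropped) - delta):
        if dropped[i + delta] - dropped[i] < alpha:
            return False
    return True


if __name__ == "__main__":
    carried = [10, 15, 500]
    kept = {15}
    print(preserves_mappability(carried, kept, alpha=100, delta=1))
    print(preserves_mappability(carried, kept, alpha=1000, delta=1))
```

This only verifies a candidate subset; finding the smallest such subset is the optimization problem that the abstract reports to be NP-hard.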
Affiliation(s)
- Neda Tavakoli
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
- Daniel Gibney
  - Department of Computer Science, University of Texas at Dallas, Richardson, Texas, USA
- Srinivas Aluru
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
3
Hemstrom W, Grummer JA, Luikart G, Christie MR. Next-generation data filtering in the genomics era. Nat Rev Genet 2024:10.1038/s41576-024-00738-6. [PMID: 38877133 DOI: 10.1038/s41576-024-00738-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/25/2024] [Indexed: 06/16/2024]
Abstract
Genomic data are ubiquitous across disciplines, from agriculture to biodiversity, ecology, evolution and human health. However, these datasets often contain noise or errors and are missing information that can affect the accuracy and reliability of subsequent computational analyses and conclusions. A key step in genomic data analysis is filtering - removing sequencing bases, reads, genetic variants and/or individuals from a dataset - to improve data quality for downstream analyses. Researchers are confronted with a multitude of choices when filtering genomic data; they must choose which filters to apply and select appropriate thresholds. To help usher in the next generation of genomic data filtering, we review and suggest best practices to improve the implementation, reproducibility and reporting standards for filter types and thresholds commonly applied to genomic datasets. We focus mainly on filters for minor allele frequency, missing data per individual or per locus, linkage disequilibrium and Hardy-Weinberg deviations. Using simulated and empirical datasets, we illustrate the large effects of different filtering thresholds on common population genetics statistics, such as Tajima's D value, population differentiation (FST), nucleotide diversity (π) and effective population size (Ne).
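Two of the filters named in this abstract, minor allele frequency and missing data per locus, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the review's procedure: genotypes are assumed to be diploid and coded as alternate-allele counts (0, 1 or 2, with `None` for a missing call), and the function names and default thresholds are hypothetical placeholders rather than recommended values.

```python
from typing import List, Optional

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def minor_allele_frequency(locus: List[Genotype]) -> float:
    """Frequency of the rarer allele among called diploid genotypes."""
    called = [g for g in locus if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def missing_rate(locus: List[Genotype]) -> float:
    """Fraction of individuals with no genotype call at this locus."""
    if not locus:
        return 1.0
    return sum(g is None for g in locus) / len(locus)


def filter_loci(loci: List[List[Genotype]],
                min_maf: float = 0.05,
                max_missing: float = 0.2) -> List[int]:
    """Return indices of loci passing both (hypothetical) thresholds."""
    return [i for i, locus in enumerate(loci)
            if minor_allele_frequency(locus) >= min_maf
            and missing_rate(locus) <= max_missing]


if __name__ == "__main__":
    loci = [[0, 0, 1, None], [2, 2, 2, 2], [0, 1, 2, 1]]
    print(filter_loci(loci, min_maf=0.05, max_missing=0.3))
```

Changing either threshold changes which loci survive, which is the sensitivity the abstract says propagates into downstream statistics.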
Affiliation(s)
- William Hemstrom
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
- Jared A Grummer
  - Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Gordon Luikart
  - Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Mark R Christie
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
  - Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, USA.
4
Hernandez SI, Berezin CT, Miller KM, Peccoud SJ, Peccoud J. Sequencing Strategy to Ensure Accurate Plasmid Assembly. bioRxiv 2024:2024.03.25.586694. [PMID: 38585828 PMCID: PMC10996661 DOI: 10.1101/2024.03.25.586694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Despite the wide use of plasmids in research and clinical production, the need to verify plasmid sequences is a bottleneck that is too often underestimated in the manufacturing process. Although sequencing platforms continue to improve, the method and assembly pipeline chosen still influence the final plasmid assembly sequence. Furthermore, few dedicated tools exist for plasmid assembly, especially for de novo assembly. Here, we evaluated short-read, long-read, and hybrid (both short and long reads) de novo assembly pipelines across three replicates of a 24-plasmid library. Consistent with previous characterizations of each sequencing technology, short-read assemblies had issues resolving GC-rich regions, and long-read assemblies commonly had small insertions and deletions, especially in repetitive regions. The hybrid approach facilitated the most accurate, consistent assembly generation and identified mutations relative to the reference sequence. Although Sanger sequencing can be used to verify specific regions, some GC-rich and repetitive regions were difficult to resolve using any method, suggesting that easily sequenced genetic parts should be prioritized in the design of new genetic constructs.
Affiliation(s)
- Sarah I. Hernandez
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Casey-Tyler Berezin
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Katie M. Miller
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Samuel J. Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Jean Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
5
Choi SS, Mc Cartney A, Park D, Roberts H, Brav-Cubitt T, Mitchell C, Buckley TR. Multiple hybridization events and repeated evolution of homoeologue expression bias in parthenogenetic, polyploid New Zealand stick insects. Mol Ecol 2024:e17422. [PMID: 38842022 DOI: 10.1111/mec.17422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 03/03/2024] [Accepted: 04/17/2024] [Indexed: 06/07/2024]
Abstract
During hybrid speciation, homoeologues combine in a single genome. Homoeologue expression bias (HEB) occurs when one homoeologue has higher gene expression than another. HEB has been well characterized in plants but rarely investigated in animals, especially invertebrates. Consequently, we have little idea as to the role that HEB plays in allopolyploid invertebrate genomes. If HEB is constrained by features of the parental genomes, then we predict repeated evolution of similar HEB patterns among hybrid genomes formed from the same parental lineages. To address this, we reconstructed the history of hybridization between the New Zealand stick insect genera Acanthoxyla and Clitarchus using a high-quality genome assembly from Clitarchus hookeri to call variants and phase alleles. These analyses revealed the formation of three independent diploid and triploid hybrid lineages between these genera. RNA sequencing revealed a similar magnitude and direction of HEB among these hybrid lineages, and we observed that many enriched functions and pathways were also shared among lineages, consistent with repeated evolution due to parental genome constraints. In most hybrid lineages, a slight majority of the genes involved in mitochondrial function showed HEB towards the maternal homoeologues, consistent with only weak effects of mitonuclear incompatibility. We also observed a proteasome functional enrichment in most lineages and hypothesize this may result from the need to maintain proteostasis in hybrid genomes. Reference bias was a pervasive problem, and we caution against relying on HEB estimates from a single parental reference genome.
Affiliation(s)
- Seung-Sub Choi
  - Manaaki Whenua - Landcare Research, Auckland, New Zealand
  - School of Biological Sciences, The University of Auckland, Auckland, New Zealand
- Ann Mc Cartney
  - Manaaki Whenua - Landcare Research, Auckland, New Zealand
- Duckchul Park
  - Manaaki Whenua - Landcare Research, Auckland, New Zealand
- Hester Roberts
  - Manaaki Whenua - Landcare Research, Auckland, New Zealand
6
Peccoud S, Berezin CT, Hernandez SI, Peccoud J. PlasCAT: Plasmid Cloud Assembly Tool. Bioinformatics 2024; 40:btae299. [PMID: 38696761 PMCID: PMC11101281 DOI: 10.1093/bioinformatics/btae299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 04/04/2024] [Accepted: 04/30/2024] [Indexed: 05/04/2024] Open
Abstract
SUMMARY: PlasCAT (Plasmid Cloud Assembly Tool) is an easy-to-use cloud-based bioinformatics tool that enables de novo plasmid sequence assembly from raw sequencing data. Nontechnical users can now assemble sequences from long reads and short reads without ever touching a line of code. PlasCAT uses high-performance computing servers to reduce run times on assemblies and deliver results faster. AVAILABILITY AND IMPLEMENTATION: PlasCAT is freely available on the web at https://sequencing.genofab.com. The assembly pipeline source code and server code are available for download at https://bitbucket.org/genofabinc/workspace/projects/PLASCAT. Click the Cancel button to access the source code without authenticating. Web servers are implemented in React.js and Python, with all major browsers supported.
Affiliation(s)
- Casey-Tyler Berezin
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523, United States
- Sarah I Hernandez
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523, United States
- Jean Peccoud
  - GenoFAB, Inc., Fort Collins, CO 80528, United States
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523, United States
7
Duchen D, Clipman SJ, Vergara C, Thio CL, Thomas DL, Duggal P, Wojcik GL. A hepatitis B virus (HBV) sequence variation graph improves alignment and sample-specific consensus sequence construction. PLoS One 2024; 19:e0301069. [PMID: 38669259 PMCID: PMC11051683 DOI: 10.1371/journal.pone.0301069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 03/09/2024] [Indexed: 04/28/2024] Open
Abstract
Nearly 300 million individuals live with chronic hepatitis B virus (HBV) infection (CHB), for which no curative therapy is available. As viral diversity is associated with pathogenesis and immunological control of infection, improved methods to characterize this diversity could aid drug development efforts. Conventionally, viral sequencing data are mapped/aligned to a reference genome, and only the aligned sequences are retained for analysis. Thus, reference selection is critical, yet selecting the most representative reference a priori remains difficult. We investigate an alternative pangenome approach which can combine multiple reference sequences into a graph which can be used during alignment. Using simulated short-read sequencing data generated from publicly available HBV genomes and real sequencing data from an individual living with CHB, we demonstrate alignment to a phylogenetically representative 'genome graph' can improve alignment, avoid issues of reference ambiguity, and facilitate the construction of sample-specific consensus sequences more genetically similar to the individual's infection. Graph-based methods can, therefore, improve efforts to characterize the genetics of viral pathogens, including HBV, and have broader implications in host-pathogen research.
Affiliation(s)
- Dylan Duchen
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States of America
  - Center for Biomedical Data Science, Yale School of Medicine, New Haven, CT, United States of America
- Steven J. Clipman
  - Division of Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, MD, United States of America
- Candelaria Vergara
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States of America
- Chloe L. Thio
  - Division of Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, MD, United States of America
- David L. Thomas
  - Division of Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, MD, United States of America
- Priya Duggal
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States of America
- Genevieve L. Wojcik
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States of America
8
Majander K, Pla-Díaz M, du Plessis L, Arora N, Filippini J, Pezo-Lanfranco L, Eggers S, González-Candelas F, Schuenemann VJ. Redefining the treponemal history through pre-Columbian genomes from Brazil. Nature 2024; 627:182-188. [PMID: 38267579 PMCID: PMC10917687 DOI: 10.1038/s41586-023-06965-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/25/2023] [Accepted: 12/12/2023] [Indexed: 01/26/2024]
Abstract
The origins of treponemal diseases have long remained unknown, especially considering the sudden onset of the first syphilis epidemic in the late 15th century in Europe and its hypothesized arrival from the Americas with Columbus' expeditions [1,2]. Recently, ancient DNA evidence has revealed various treponemal infections circulating in early modern Europe and colonial-era Mexico [3-6]. However, there has been to our knowledge no genomic evidence of treponematosis recovered from either the Americas or the Old World that can be reliably dated to the time before the first trans-Atlantic contacts. Here, we present treponemal genomes from nearly 2,000-year-old human remains from Brazil. We reconstruct four ancient genomes of a prehistoric treponemal pathogen, most closely related to the bejel-causing agent Treponema pallidum endemicum. Contradicting the modern-day geographical niche of bejel in the arid regions of the world, the results call into question the previous palaeopathological characterization of treponeme subspecies and showcase their adaptive potential. A high-coverage genome is used to improve molecular clock date estimations, placing the divergence of modern T. pallidum subspecies firmly in pre-Columbian times. Overall, our study demonstrates the opportunities within archaeogenetics to uncover key events in pathogen evolution and emergence, paving the way to new hypotheses on the origin and spread of treponematoses.
Affiliation(s)
- Kerttu Majander
  - Institute of Evolutionary Medicine, University of Zurich, Zurich, Switzerland.
  - Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria.
  - Department of Environmental Sciences, University of Basel, Basel, Switzerland.
- Marta Pla-Díaz
  - Unidad Mixta Infección y Salud Pública, FISABIO/Universidad de Valencia-I2SysBio, Valencia, Spain
  - CIBER in Epidemiology and Public Health, Instituto de Salud Carlos III, Madrid, Spain
- Louis du Plessis
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne, Switzerland
- Natasha Arora
  - Zurich Institute of Forensic Medicine, University of Zurich, Zurich, Switzerland
- Jose Filippini
  - Department of Genetic and Evolutionary Biology, University of São Paulo, São Paulo, Brazil
- Luis Pezo-Lanfranco
  - Department of Genetic and Evolutionary Biology, University of São Paulo, São Paulo, Brazil
  - Institute of Environmental Science and Technology (ICTA) and Prehistory Department, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Sabine Eggers
  - Department of Genetic and Evolutionary Biology, University of São Paulo, São Paulo, Brazil
  - Department of Anthropology, Natural History Museum Vienna, Vienna, Austria
- Fernando González-Candelas
  - Unidad Mixta Infección y Salud Pública, FISABIO/Universidad de Valencia-I2SysBio, Valencia, Spain.
  - CIBER in Epidemiology and Public Health, Instituto de Salud Carlos III, Madrid, Spain.
- Verena J Schuenemann
  - Institute of Evolutionary Medicine, University of Zurich, Zurich, Switzerland.
  - Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria.
  - Department of Environmental Sciences, University of Basel, Basel, Switzerland.
  - Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria.
9
Furuta T, Yamamoto T. MCPtaggR: R package for accurate genotype calling in reduced representation sequencing data by eliminating error-prone markers based on genome comparison. DNA Res 2024; 31:dsad027. [PMID: 38134958 PMCID: PMC10799318 DOI: 10.1093/dnares/dsad027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 12/11/2023] [Accepted: 12/18/2023] [Indexed: 12/24/2023] Open
Abstract
Reduced representation sequencing (RRS) offers cost-effective, high-throughput genotyping platforms such as genotyping-by-sequencing (GBS). RRS reads are typically mapped onto a reference genome. However, mapping reads harbouring mismatches against the reference can potentially result in mismapping and biased mapping, leading to the detection of error-prone markers that provide incorrect genotype information. We established a genotype-calling pipeline named mappable collinear polymorphic tag genotyping (MCPtagg) to achieve accurate genotyping by eliminating error-prone markers. MCPtagg was designed for the RRS-based genotyping of a population derived from a biparental cross. The MCPtagg pipeline filters out error-prone markers prior to genotype calling based on marker collinearity information obtained by comparing the genome sequences of the parents of a population to be genotyped. A performance evaluation on real GBS data from a rice F2 population confirmed its effectiveness. Furthermore, our performance test using a genome assembly that was obtained by genome sequence polishing on an available genome assembly suggests that our pipeline performs well with converted genomes, rather than necessitating de novo assembly. This demonstrates its flexibility and scalability. The R package, MCPtaggR, was developed to provide functions for the pipeline and is available at https://github.com/tomoyukif/MCPtaggR.
Affiliation(s)
- Tomoyuki Furuta
  - Institute of Plant Science and Resources, Okayama University, Kurashiki, Okayama, Japan
- Toshio Yamamoto
  - Institute of Plant Science and Resources, Okayama University, Kurashiki, Okayama, Japan
10
Reding C, Satapoomin N, Avison MB. Hound: a novel tool for automated mapping of genotype to phenotype in bacterial genomes assembled de novo. Brief Bioinform 2024; 25:bbae057. [PMID: 38385882 PMCID: PMC10883467 DOI: 10.1093/bib/bbae057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 01/11/2024] [Accepted: 01/26/2024] [Indexed: 02/23/2024] Open
Abstract
Increasing evidence suggests that microbial species have strong within-species genetic heterogeneity. This can be problematic for the analysis of prokaryote genomes, which commonly relies on a reference genome to guide the assembly process. Differences between reference and sample genomes will therefore introduce errors in the final assembly, jeopardizing the detection of variants ranging from structural variations to point mutations, which is critical for genomic surveillance of antibiotic resistance. Here we present Hound, a pipeline that integrates publicly available tools to assemble prokaryote genomes de novo and to detect user-given genes by similarity, reporting mutations found in the coding sequence and promoter as well as relative gene copy number within the assembly. Importantly, Hound can use the query sequence as a guide to merge contigs and reconstruct genes that were fragmented by the assembler. To showcase Hound, we screened 5032 bacterial whole-genome sequences isolated from farmed animals and human infections, using the amino acid sequence encoded by blaTEM-1, to detect and predict resistance to amoxicillin/clavulanate, which is driven by over-expression of this gene. We believe this tool can facilitate the analysis of prokaryote species that currently lack a reference genome, and can be scaled either up to build automated systems for genomic surveillance or down to integrate into antibiotic susceptibility point-of-care diagnostics.
Affiliation(s)
- Carlos Reding
  - University of Bristol School of Cellular and Molecular Medicine, University Walk, Bristol, BS8 1TD Bristol, UK
- Naphat Satapoomin
  - University of Bristol School of Cellular and Molecular Medicine, University Walk, Bristol, BS8 1TD Bristol, UK
- Matthew B Avison
  - University of Bristol School of Cellular and Molecular Medicine, University Walk, Bristol, BS8 1TD Bristol, UK
11
Ferguson S, Jones A, Murray K, Andrew RL, Schwessinger B, Bothwell H, Borevitz J. Exploring the role of polymorphic interspecies structural variants in reproductive isolation and adaptive divergence in Eucalyptus. Gigascience 2024; 13:giae029. [PMID: 38869149 PMCID: PMC11170218 DOI: 10.1093/gigascience/giae029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Revised: 03/11/2024] [Accepted: 05/14/2024] [Indexed: 06/14/2024] Open
Abstract
Structural variations (SVs) play a significant role in speciation and adaptation in many species, yet few studies have explored the prevalence and impact of different categories of SVs. We conducted a comparative analysis of long-read assembled reference genomes of closely related Eucalyptus species to identify candidate SVs potentially influencing speciation and adaptation. Interspecies SVs can be either fixed differences or polymorphic in one or both species. To describe SV patterns, we employed short-read whole-genome sequencing on over 600 individuals of Eucalyptus melliodora and Eucalyptus sideroxylon, along with recent high-quality genome assemblies. We aligned reads and genotyped interspecies SVs predicted between species reference genomes. Our results revealed that 49,756 of 58,025 and 39,536 of 47,064 interspecies SVs could be typed with short reads in E. melliodora and E. sideroxylon, respectively. Focusing on inversions and translocations, symmetric SVs that are readily genotyped within both populations, 24 were found to be structural divergences, 2,623 structural polymorphisms, and 928 shared structural polymorphisms. We assessed the functional significance of fixed interspecies SVs by examining differences in estimated recombination rates and genetic differentiation between species, revealing a complex history of natural selection. Shared structural polymorphisms displayed enrichment of potentially adaptive genes. Understanding how different classes of genetic mutations contribute to genetic diversity and reproductive barriers is essential for understanding how organisms enhance fitness, adapt to changing environments, and diversify. Our findings reveal the prevalence of interspecies SVs and elucidate their role in genetic differentiation, adaptive evolution, and species divergence within and between populations.
Collapse
Affiliation(s)
- Scott Ferguson
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
| | - Ashley Jones
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
| | - Kevin Murray
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, 72076 Germany
| | - Rose L Andrew
- Botany & N.C.W. Beadle Herbarium, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351, Australia
| | - Benjamin Schwessinger
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
| | - Helen Bothwell
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
- Warnell School of Forestry & Natural Resources, University of Georgia, Athens 30602 GA, United States
| | - Justin Borevitz
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
| |
|
12
|
Chun Giok K, Menon RK. The Microbiome of Peri-Implantitis: A Systematic Review of Next-Generation Sequencing Studies. Antibiotics (Basel) 2023; 12:1610. [PMID: 37998812 PMCID: PMC10668804 DOI: 10.3390/antibiotics12111610] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/11/2023] [Revised: 10/26/2023] [Accepted: 10/26/2023] [Indexed: 11/25/2023] Open
Abstract
(1) Introduction: Current evidence shows that mechanical debridement augmented with systemic and topical antibiotics may be beneficial for the treatment of peri-implantitis. The microbial profile of peri-implantitis plays a key role in identifying the most suitable antibiotics to be used for the treatment and prevention of peri-implantitis. This systematic review aimed to summarize and critically analyze the methodology and findings of studies which have utilized sequencing techniques to elucidate the microbial profiles of peri-implantitis. (2) Results: Fusobacterium, Treponema, and Porphyromonas sp. are associated with peri-implantitis. Veillonella sp. are associated with healthy implant sites and exhibit a reduced prevalence in deeper pockets and with greater severity of disease progression. Streptococcus sp. have been identified both in diseased and healthy sites. Neisseria sp. have been associated with healthy implants and negatively correlate with the probing depth. Methanogens and AAGPRs were also detected in peri-implantitis sites. (3) Methods: The study was registered with the International Prospective Register of Systematic Reviews (PROSPERO) (CRD42023459266). The PRISMA criteria were used to select articles retrieved from a systematic search of the Scopus, Cochrane, and Medline databases until 1 August 2023. Title and abstract screening was followed by a full-text review of the included articles. Thirty-two articles were included in the final qualitative analysis. (4) Conclusions: A distinct microbial profile could not be identified from studies employing sequencing techniques to identify the microbiome. Further studies are needed with more standardization to allow a comparison of findings. A universal clinical parameter for the diagnosis of peri-implantitis should be implemented in all future studies to minimize confounding factors. The subject pool should also be more diverse and larger to compensate for individual differences, and perhaps a distinct microbial profile can be seen with a larger sample size.
Affiliation(s)
- Koay Chun Giok
- School of Dentistry, International Medical University, Kuala Lumpur 57000, Malaysia
| | | |
|
13
|
Dutta A, McDonald BA, Croll D. Combined reference-free and multi-reference based GWAS uncover cryptic variation underlying rapid adaptation in a fungal plant pathogen. PLoS Pathog 2023; 19:e1011801. [PMID: 37972199 PMCID: PMC10688896 DOI: 10.1371/journal.ppat.1011801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 11/30/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023] Open
Abstract
Microbial pathogens often harbor substantial functional diversity driven by structural genetic variation. Rapid adaptation from such standing variation threatens global food security and human health. Genome-wide association studies (GWAS) provide a powerful approach to identify genetic variants underlying recent pathogen adaptation. However, the reliance on single reference genomes and single nucleotide polymorphisms (SNPs) obscures the true extent of adaptive genetic variation. Here, we show quantitatively how a combination of multiple reference genomes and reference-free approaches captures substantially more relevant genetic variation compared to single reference mapping. We performed reference-genome based association mapping across 19 reference-quality genomes covering the diversity of the species. We contrasted the results with a reference-free (i.e., k-mer) approach using raw whole-genome sequencing data in a panel of 145 strains collected across the global distribution range of the fungal wheat pathogen Zymoseptoria tritici. We mapped the genetic architecture of 49 life history traits including virulence, reproduction and growth in multiple stressful environments. The inclusion of additional reference genome SNP datasets provides a nearly linear increase in additional loci mapped through GWAS. Variants detected through the k-mer approach explained a higher proportion of phenotypic variation than a reference genome-based approach and revealed functionally confirmed loci that classic GWAS approaches failed to map. The power of GWAS in microbial pathogens can be significantly enhanced by comprehensively capturing structural genetic variation. Our approach is generalizable to a large number of species and will uncover novel mechanisms driving rapid adaptation of pathogens.
Affiliation(s)
- Anik Dutta
- Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
| | - Bruce A. McDonald
- Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
| | - Daniel Croll
- Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
| |
|
14
|
Thorburn DMJ, Sagonas K, Binzer-Panchal M, Chain FJJ, Feulner PGD, Bornberg-Bauer E, Reusch TBH, Samonte-Padilla IE, Milinski M, Lenz TL, Eizaguirre C. Origin matters: Using a local reference genome improves measures in population genomics. Mol Ecol Resour 2023; 23:1706-1723. [PMID: 37489282 DOI: 10.1111/1755-0998.13838] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/01/2023] [Revised: 05/10/2023] [Accepted: 06/02/2023] [Indexed: 07/26/2023]
Abstract
Genome sequencing enables answering fundamental questions about the genetic basis of adaptation, population structure and epigenetic mechanisms. Yet, we usually need a suitable reference genome for mapping population-level resequencing data. In some model systems, multiple reference genomes are available, giving the challenging task of determining which reference genome best suits the data. Here, we compared the use of two different reference genomes for the three-spined stickleback (Gasterosteus aculeatus), one novel genome derived from a European gynogenetic individual and the published reference genome of a North American individual. Specifically, we investigated the impact of using a local reference versus one generated from a distinct lineage on several common population genomics analyses. Through mapping genome resequencing data of 60 sticklebacks from across Europe and North America, we demonstrate that genetic distance among samples and the reference genomes impacts downstream analyses. Using a local reference genome increased mapping efficiency and genotyping accuracy, effectively retaining more and better data. Despite comparable distributions of the metrics generated across the genome using SNP data (i.e. π, Tajima's D and FST), window-based statistics using different references resulted in different outlier genes and enriched gene functions. A marker-based analysis of DNA methylation distributions had a comparably high overlap in outlier genes and functions, yet with distinct differences depending on the reference genome. Overall, our results highlight how using a local reference genome decreases reference bias to increase confidence in downstream analyses of the data. Such results have significant implications in all reference-genome-based population genomic analyses.
Affiliation(s)
- Doko-Miles J Thorburn
- School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
- Department of Life Sciences, Imperial College London, London, UK
| | - Kostas Sagonas
- School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
- Department of Zoology, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Mahesh Binzer-Panchal
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, National Bioinformatics Infrastructure Sweden (NBIS), Uppsala University, Uppsala, Sweden
| | - Frederic J J Chain
- Department of Biological Sciences, University of Massachusetts Lowell, Lowell, Massachusetts, USA
| | - Philine G D Feulner
- Department of Fish Ecology and Evolution, Centre of Ecology, Evolution and Biogeochemistry, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- Division of Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
| | - Erich Bornberg-Bauer
- Evolutionary Bioinformatics, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Thorsten B H Reusch
- Marine Evolutionary Ecology, GEOMAR Helmholtz Centre for Ocean Research, Kiel, Germany
| | - Irene E Samonte-Padilla
- Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Manfred Milinski
- Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Tobias L Lenz
- Research Group for Evolutionary Immunogenomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
| | - Christophe Eizaguirre
- School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
| |
|
15
|
Donkpegan ASL, Bernard A, Barreneche T, Quero-García J, Bonnet H, Fouché M, Le Dantec L, Wenden B, Dirlewanger E. Genome-wide association mapping in a sweet cherry germplasm collection (Prunus avium L.) reveals candidate genes for fruit quality traits. Hortic Res 2023; 10:uhad191. [PMID: 38239559 PMCID: PMC10794993 DOI: 10.1093/hr/uhad191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 09/12/2023] [Indexed: 01/22/2024]
Abstract
In sweet cherry (Prunus avium L.), large variability exists for various traits related to fruit quality. There is a need to discover the genetic architecture of these traits in order to enhance the efficiency of breeding strategies for consumer and producer demands. With this objective, a germplasm collection consisting of 116 sweet cherry accessions was evaluated for 23 agronomic fruit quality traits over 2-6 years, and characterized using a genotyping-by-sequencing approach. The SNP coverage collected was used to conduct a genome-wide association study using two multilocus models and three reference genomes. We identified numerous SNP-trait associations for global fruit size (weight, width, and thickness), fruit cracking, fruit firmness, and stone size, and we pinpointed several candidate genes involved in phytohormone, calcium, and cell wall metabolism. Finally, we conducted a precise literature review focusing on the genetic architecture of fruit quality traits in sweet cherry to compare our results with potential colocalizations of marker-trait associations. This study brings new knowledge of the genetic control of important agronomic traits related to fruit quality and contributes to the development of marker-assisted selection strategies targeted towards the facilitation of breeding efforts.
Affiliation(s)
- Armel S L Donkpegan
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
- UMR BOA, SYSAAF, Centre INRAE Val de Loire, 37380 Nouzilly, France
| | - Anthony Bernard
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - Teresa Barreneche
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - José Quero-García
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - Hélène Bonnet
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - Mathieu Fouché
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - Loïck Le Dantec
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - Bénédicte Wenden
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| | - Elisabeth Dirlewanger
- UMR BFP, INRAE, University of Bordeaux, 71 Avenue Edouard Bourlaux, F-33882 Villenave d’Ornon, France
| |
|
16
|
Scott NE, Edwin Erayil S, Kline SE, Selmecki A. Rapid Evolution of Multidrug Resistance in a Candida lusitaniae Infection during Micafungin Monotherapy. Antimicrob Agents Chemother 2023; 67:e0054323. [PMID: 37428075 PMCID: PMC10433866 DOI: 10.1128/aac.00543-23] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/26/2023] [Accepted: 06/13/2023] [Indexed: 07/11/2023] Open
Abstract
Candida (Clavispora) lusitaniae is a rare, emerging non-albicans Candida species that can cause life-threatening invasive infections, spread within hospital settings, and rapidly acquire antifungal drug resistance, including multidrug resistance. The frequency and spectrum of mutations causing antifungal drug resistance in C. lusitaniae are poorly understood. Analyses of serial clinical isolates of any Candida species are uncommon and often analyze a limited number of samples collected over months of antifungal therapy with multiple drug classes, limiting the ability to understand relationships between drug classes and specific mutations. Here, we performed comparative genomic and phenotypic analysis of 20 serial C. lusitaniae bloodstream isolates collected daily from an individual patient treated with micafungin monotherapy during a single 11-day hospital admission. We identified isolates with decreased micafungin susceptibility 4 days after initiation of antifungal therapy and a single isolate with increased cross-resistance to micafungin and fluconazole, despite no history of azole therapy in this patient. Only 14 unique single nucleotide polymorphisms (SNPs) were identified between all 20 samples, including three different FKS1 alleles among isolates with decreased micafungin susceptibility and an ERG3 missense mutation found only in the isolate with increased cross-resistance to both micafungin and fluconazole. This is the first clinical evidence of an ERG3 mutation in C. lusitaniae that occurred during echinocandin monotherapy and is associated with cross-resistance to multiple drug classes. Overall, the evolution of multidrug resistance in C. lusitaniae is rapid and can emerge during treatment with only first-line antifungal therapy.
Affiliation(s)
- Nancy E. Scott
- University of Minnesota, Bioinformatics and Computational Biology Program, Minneapolis, Minnesota, USA
- University of Minnesota, Department of Microbiology and Immunology, Minneapolis, Minnesota, USA
| | - Serin Edwin Erayil
- University of Minnesota Medical School, Department of Medicine, Division of Infectious Diseases and International Medicine, Minneapolis, Minnesota, USA
| | - Susan E. Kline
- University of Minnesota Medical School, Department of Medicine, Division of Infectious Diseases and International Medicine, Minneapolis, Minnesota, USA
| | - Anna Selmecki
- University of Minnesota, Bioinformatics and Computational Biology Program, Minneapolis, Minnesota, USA
- University of Minnesota, Department of Microbiology and Immunology, Minneapolis, Minnesota, USA
| |
|
17
|
Robinson ML, Johnson J, Naik S, Patil S, Kulkarni R, Kinikar A, Dohe V, Mudshingkar S, Kagal A, Smith RM, Westercamp M, Randive B, Kadam A, Babiker A, Kulkarni V, Karyakarte R, Mave V, Gupta A, Milstone AM, Manabe YC. Maternal Colonization Versus Nosocomial Transmission as the Source of Drug-Resistant Bloodstream Infection in an Indian Neonatal Intensive Care Unit: A Prospective Cohort Study. Clin Infect Dis 2023; 77:S38-S45. [PMID: 37406039 PMCID: PMC10321698 DOI: 10.1093/cid/ciad282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2023] Open
Abstract
BACKGROUND: Drug-resistant gram-negative (GN) pathogens are a common cause of neonatal sepsis in low- and middle-income countries. Identifying GN transmission patterns is vital to inform preventive efforts. METHODS: We conducted a prospective cohort study from 12 October 2018 to 31 October 2019 to describe the association of maternal and environmental GN colonization with bloodstream infection (BSI) among neonates admitted to a neonatal intensive care unit (NICU) in Western India. We assessed rectal and vaginal colonization in pregnant women presenting for delivery and colonization in neonates and the environment using culture-based methods. We also collected data on BSI for all NICU patients, including neonates born to unenrolled mothers. Organism identification, antibiotic susceptibility testing, and next-generation sequencing (NGS) were performed to compare BSI and related colonization isolates. RESULTS: Among 952 enrolled women who delivered, 257 neonates required NICU admission, and 24 (9.3%) developed BSI. Among mothers of neonates with GN BSI (n = 21), 10 (47.7%) had rectal, 5 (23.8%) had vaginal, and 10 (47.7%) had no colonization with resistant GN organisms. No maternal isolates matched the species and resistance pattern of associated neonatal BSI isolates. Thirty GN BSI were observed among neonates born to unenrolled mothers. Among 37 of 51 BSI with available NGS data, 21 (57%) showed a single nucleotide polymorphism distance of ≤5 to another BSI isolate. CONCLUSIONS: Prospective assessment of maternal GN colonization did not demonstrate linkage to neonatal BSI. Organism-relatedness among neonates with BSI suggests nosocomial spread, highlighting the importance of NICU infection prevention and control practices to reduce GN BSI.
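The relatedness criterion in the results (a SNP distance of ≤5 to another BSI isolate) can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis: the function names `snp_distance` and `linked_isolates` are hypothetical, isolates are assumed to be represented by equal-length aligned sequences, and every mismatching position is counted as one SNP without handling gaps or ambiguous bases.

```python
from typing import Dict, Set


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(base_a != base_b for base_a, base_b in zip(seq_a, seq_b))


def linked_isolates(alignment: Dict[str, str], max_snps: int = 5) -> Set[str]:
    """Return isolates within `max_snps` of at least one other isolate."""
    names = list(alignment)
    linked: Set[str] = set()
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            if snp_distance(alignment[name_a], alignment[name_b]) <= max_snps:
                linked.update((name_a, name_b))
    return linked
```

Dividing `len(linked_isolates(alignment))` by the number of sequenced isolates gives a proportion comparable in spirit to the one reported in the abstract, assuming the same threshold.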
Affiliation(s)
- Matthew L Robinson
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Julia Johnson
- Division of Neonatology, Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
| | - Shilpa Naik
- Department of Obstetrics, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Sunil Patil
- Department of Obstetrics, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Rajesh Kulkarni
- Department of Pediatrics, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Aarti Kinikar
- Department of Pediatrics, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Vaishali Dohe
- Department of Microbiology, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Swati Mudshingkar
- Department of Microbiology, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Anju Kagal
- Department of Microbiology, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Rachel M Smith
- Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | | | - Bharat Randive
- Byramjee Jeejeebhoy Government Medical College, Johns Hopkins University Clinical Research Site, Pune, India
| | - Abhay Kadam
- Byramjee Jeejeebhoy Government Medical College, Johns Hopkins University Clinical Research Site, Pune, India
| | - Ahmed Babiker
- Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, Georgia, USA
| | - Vandana Kulkarni
- Byramjee Jeejeebhoy Government Medical College, Johns Hopkins University Clinical Research Site, Pune, India
| | - Rajesh Karyakarte
- Department of Microbiology, Byramjee Jeejeebhoy Government Medical College, Pune, India
| | - Vidya Mave
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Byramjee Jeejeebhoy Government Medical College, Johns Hopkins University Clinical Research Site, Pune, India
| | - Amita Gupta
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Aaron M Milstone
- Division of Pediatric Infectious Diseases, Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Yukari C Manabe
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| |
|
18
|
Izydorczyk C, Waddell BJ, Thornton CS, Conly JM, Rabin HR, Somayaji R, Surette MG, Church DL, Parkins MD. Stenotrophomonas maltophilia natural history and evolution in the airways of adults with cystic fibrosis. Front Microbiol 2023; 14:1205389. [PMID: 37396351 PMCID: PMC10308010 DOI: 10.3389/fmicb.2023.1205389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Accepted: 05/22/2023] [Indexed: 07/04/2023] Open
Abstract
Introduction: Stenotrophomonas maltophilia is an opportunistic pathogen infecting persons with cystic fibrosis (pwCF) and portends a worse prognosis. Studies of S. maltophilia infection dynamics have been limited by cohort size and follow-up. We investigated the natural history, transmission potential, and evolution of S. maltophilia in a large Canadian cohort of 321 pwCF over a 37-year period. Methods: One hundred sixty-two isolates from 74 pwCF (23%) were typed by pulsed-field gel electrophoresis, and shared pulsotypes underwent whole-genome sequencing. Results: S. maltophilia was recovered at least once in 82 pwCF (25.5%). Sixty-four pwCF were infected by unique pulsotypes, but shared pulsotypes were observed between 10 pwCF. In chronic carriage, longer time periods between positive sputum cultures increased the likelihood that subsequent isolates were unrelated. Isolates from individual pwCF were largely clonal, with differences in gene content being the primary source of genetic diversity. Disproportionate progression of CF lung disease was not observed amongst those infected with multiple strains over time (versus a single strain) or amongst those with shared clones (versus strains only infecting one patient). We did not observe evidence of patient-to-patient transmission despite relatedness between isolates. Twenty-four genes with ≥ 2 mutations accumulated over time were identified across 42 sequenced isolates from all 11 pwCF with ≥ 2 sequenced isolates, suggesting a potential role for these genes in adaptation of S. maltophilia to the CF lung. Discussion: Genomic analyses suggested common, indirect sources as the origins of S. maltophilia infections in the clinic population. The information derived from a genomics-based understanding of the natural history of S. maltophilia infection within CF provides unique insight into its potential for in-host evolution.
Affiliation(s)
- Conrad Izydorczyk
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Barbara J. Waddell
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Christina S. Thornton
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
| | - John M. Conly
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
- Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary and Alberta Health Services, Calgary, AB, Canada
- Department of Pathology and Laboratory Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
| | - Harvey R. Rabin
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
| | - Ranjani Somayaji
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
- Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary and Alberta Health Services, Calgary, AB, Canada
| | - Michael G. Surette
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
| | - Deirdre L. Church
- Department of Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
- Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary and Alberta Health Services, Calgary, AB, Canada
- Department of Pathology and Laboratory Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
| | - Michael D. Parkins
- Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada
- Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary and Alberta Health Services, Calgary, AB, Canada
| |
|
19
|
Mthethwa S, Bester‐van der Merwe AE, Roodt‐Wilding R. Addressing the complex phylogenetic relationship of the Gempylidae fishes using mitogenome data. Ecol Evol 2023; 13:e10217. [PMID: 37351481 PMCID: PMC10283032 DOI: 10.1002/ece3.10217] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Revised: 06/06/2023] [Accepted: 06/09/2023] [Indexed: 06/24/2023] Open
Abstract
The Gempylidae (snake mackerels) family, belonging to the order Perciformes, consists of about 24 species described in 16 genera primarily distributed in tropical, subtropical, and temperate seas worldwide. Despite substantial research on this family utilizing morphological and molecular approaches, taxonomic categorization in this group has remained puzzling for decades, prompting the need for further investigation into the underlying evolutionary history among the gempylids using molecular tools. In this study, we assembled eight complete novel mitochondrial genomes for five Gempylidae species (Neoepinnula minetomai, Neoepinnula orientalis, Rexea antefurcata, Rexea prometheoides, and Thyrsites atun) using Ion Torrent sequencing to supplement publicly available mitogenome data for gempylids. Using Bayesian inference and maximum-likelihood tree search methods, we investigated the evolutionary relationships of 17 Gempylidae species using mitogenome data. In addition, we estimated divergence times for extant gempylids. We identified two major clades that formed approximately 48.05 (35.89-52.04) million years ago: Gempylidae 1 (Thyrsites atun, Promethichthys prometheus, Nealotus tripes, Diplospinus multistriatus, Paradiplospinus antarcticus, Rexea antefurcata, Rexea nakamurai, Rexea prometheoides, Rexea solandri, Thyrsitoides marleyi, Gempylus serpens, and Nesiarchus nasutus) and Gempylidae 2 (Lepidocybium flavobrunneum, Ruvettus pretiosus, Neoepinnula minetomai, Neoepinnula orientalis, and Epinnula magistralis). The present study demonstrated the superior performance of complete mitogenome data compared with individual genes in phylogenetic reconstruction. By including T. atun individuals from different regions, we demonstrated the potential for the application of mitogenomes in species phylogeography.
Affiliation(s)
- Siphesihle Mthethwa
- Molecular Breeding and Biodiversity Group, Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
| | | | - Rouvay Roodt‐Wilding
- Molecular Breeding and Biodiversity Group, Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
| |
|
20
|
Foo A, Cerdeira L, Hughes GL, Heinz E. Recovery of metagenomic data from the Aedes aegypti microbiome using a reproducible snakemake pipeline: MINUUR. Wellcome Open Res 2023; 8:131. [PMID: 37577055 PMCID: PMC10412942 DOI: 10.12688/wellcomeopenres.19155.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/18/2023] [Indexed: 08/15/2023] Open
Abstract
Background: Ongoing research of the mosquito microbiome aims to uncover novel strategies to reduce pathogen transmission. Sequencing costs, especially for metagenomics, are however still significant. A resource that is increasingly used to gain insights into host-associated microbiomes is the large amount of publicly available genomic data based on whole organisms like mosquitoes, which includes sequencing reads of the host-associated microbes and provides the opportunity to gain additional value from these initially host-focused sequencing projects. Methods: To analyse non-host reads from existing genomic data, we developed a snakemake workflow called MINUUR (Microbial INsights Using Unmapped Reads). Within MINUUR, reads derived from the host-associated microbiome were extracted and characterised using taxonomic classifications and metagenome assembly followed by binning and quality assessment. We applied this pipeline to five publicly available Aedes aegypti genomic datasets, consisting of 62 samples with a broad range of sequencing depths. Results: We demonstrate that MINUUR recovers previously identified phyla and genera and is able to extract bacterial metagenome-assembled genomes (MAGs) associated with the microbiome. Of these MAGs, 42 are high-quality representatives with >90% completeness and <5% contamination. These MAGs improve the genomic representation of the mosquito microbiome and can be used to facilitate genomic investigation of key genes of interest. Furthermore, we show that samples with a high number of KRAKEN2-assigned reads produce more MAGs. Conclusions: Our metagenomics workflow, MINUUR, was applied to a range of Aedes aegypti genomic samples to characterise microbiome-associated reads. We confirm the presence of key mosquito-associated symbionts that have previously been identified in other studies and recovered high-quality bacterial MAGs. In addition, MINUUR and its associated documentation are freely available on GitHub and provide researchers with a convenient workflow to investigate microbiome data included in the sequencing data for any applicable host genome of interest.
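The quality cut-off quoted in this abstract (>90% completeness and <5% contamination) can be illustrated with a minimal Python sketch. It is not code from MINUUR; the `Mag` record, the function names, and the example bins are hypothetical, and the completeness and contamination values are assumed to come from whatever quality-assessment step a workflow uses.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Hypothetical record for one metagenome-assembled genome (bin)."""
    name: str
    completeness: float   # percent, as reported by a quality-assessment tool
    contamination: float  # percent


def is_high_quality(mag, min_completeness=90.0, max_contamination=5.0):
    """Apply the thresholds quoted in the abstract: >90% complete, <5% contaminated."""
    return mag.completeness > min_completeness and mag.contamination < max_contamination


def high_quality(mags):
    """Return only the bins that pass both thresholds."""
    return [m for m in mags if is_high_quality(m)]


# Made-up example bins, for illustration only.
bins = [
    Mag("bin_1", completeness=98.2, contamination=1.1),
    Mag("bin_2", completeness=91.0, contamination=6.3),  # too contaminated
    Mag("bin_3", completeness=74.5, contamination=0.4),  # too incomplete
]
print([m.name for m in high_quality(bins)])
```

Both comparisons are strict, matching the ">" and "<" in the abstract; a workflow that treats the boundaries as inclusive would need `>=` and `<=` instead.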
Affiliation(s)
- Aidan Foo: Vector Biology and Tropical Disease Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
- Louise Cerdeira: Vector Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
- Grant L. Hughes: Vector Biology and Tropical Disease Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
- Eva Heinz: Vector Biology and Clinical Sciences, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK
|
21
|
Gali KV, St. Jacques RM, Daniels CID, O'Rourke A, Turner L. Surveillance of carbapenem-resistant organisms using next-generation sequencing. Front Public Health 2023; 11:1184045. [PMID: 37255756 PMCID: PMC10225708 DOI: 10.3389/fpubh.2023.1184045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 04/06/2023] [Indexed: 06/01/2023] Open
Abstract
The genomic data generated from next-generation sequencing (NGS) provide nucleotide-level resolution of bacterial genomes, which is critical for disease surveillance and the implementation of prevention strategies to interrupt the spread of antimicrobial-resistant (AMR) bacteria. Infection with AMR bacteria, including Gram-negative Carbapenem-Resistant Organisms (CRO), may be acute and recurrent; once they have colonized a patient, they are notoriously difficult to eradicate. Through phylogenetic tools that assess the single nucleotide polymorphisms (SNPs) within a pathogen genome dataset, public health scientists can estimate the genetic identity between isolates. This information is used as an epidemiologic proxy of a putative outbreak. Pathogens with minimal to no differences in SNPs are likely to be the same strain attributable to a common source or transmission between cases. These genomic comparisons enhance public health response by prompting targeted intervention and infection control measures. This methodology overview demonstrates the utility of phenotypic and molecular assays, antimicrobial susceptibility testing (AST), NGS, publicly available genomics databases, and open-source bioinformatics pipelines for a tiered workflow to detect resistance genes and potential clusters of illness. These methods, when used in combination, facilitate a genomic surveillance workflow for detecting potential AMR bacterial outbreaks to inform epidemiologic investigations. Use of this workflow helps to target and focus epidemiologic resources to the cases with the highest likelihood of being related.
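The idea of comparing isolates by their SNP differences can be sketched in a few lines of Python. This is an illustrative toy, not the pipeline described in the abstract: it assumes the isolates have already been aligned to equal length, counts only positions where both sequences carry an unambiguous base, and uses made-up isolate names and sequences.

```python
from itertools import combinations

UNAMBIGUOUS = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ.

    Positions with gaps or ambiguous bases (e.g. '-' or 'N') in either
    sequence are ignored rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in UNAMBIGUOUS and b in UNAMBIGUOUS
    )


def pairwise_snp_distances(isolates):
    """Return {(name_1, name_2): distance} for every pair of isolates."""
    return {
        (n1, n2): snp_distance(s1, s2)
        for (n1, s1), (n2, s2) in combinations(isolates.items(), 2)
    }


# Made-up aligned sequences, for illustration only.
isolates = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAC",
    "iso_C": "ACGAACGTTC",
}
for pair, dist in pairwise_snp_distances(isolates).items():
    print(pair, dist)
```

Here `iso_A` and `iso_B` are identical, so under the reasoning in the abstract they would be candidates for a common source, while `iso_C` differs from both at two positions. What counts as "minimal" depends on the organism and the investigation and is not specified here.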
Affiliation(s)
- Katelin V. Gali: Division of Consolidated Laboratory Services, Department of General Services, Richmond, VA, United States
- Rachael M. St. Jacques: Division of Consolidated Laboratory Services, Department of General Services, Richmond, VA, United States
- Cheyanne I. D. Daniels: Division of Consolidated Laboratory Services, Department of General Services, Richmond, VA, United States
- Allison O'Rourke: Division of Clinical Epidemiology, Office of Epidemiology, Virginia Department of Health, Richmond, VA, United States
- Lauren Turner: Division of Consolidated Laboratory Services, Department of General Services, Richmond, VA, United States
|
22
|
Waizumi R, Tsubota T, Jouraku A, Kuwazaki S, Yokoi K, Iizuka T, Yamamoto K, Sezutsu H. Highly accurate genome assembly of an improved high-yielding silkworm strain, Nichi01. G3 (Bethesda) 2023; 13:jkad044. [PMID: 36814357 PMCID: PMC10085791 DOI: 10.1093/g3journal/jkad044] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/18/2022] [Revised: 01/23/2023] [Accepted: 02/14/2023] [Indexed: 02/24/2023]
Abstract
The silkworm (Bombyx mori) is an important lepidopteran model insect and an industrial domestic animal traditionally used for silk production. Here, we report the genome assembly of an improved Japanese strain, Nichi01, in which the cocoon yield is comparable to that of commercial silkworm strains. The integration of PacBio Sequel II long reads and a ddRAD-seq-based high-density genetic linkage map achieved the highest-quality genome assembly of silkworms to date; 22 of the 28 pseudomolecules contained telomeric repeats at both ends, and only four gaps were present in the assembly. A total of 452 Mbp of the assembly with an N50 of 16.614 Mbp covered 99.3% of the complete orthologs of the lepidopteran core genes. Although the genome sequences of Nichi01 and the previously reported low-yielding tropical strain p50T corroborated each other's accuracy in most regions, our assembly corrects several regions that were misassembled in p50T. A total of 18,397 proteins were predicted using over 95 Gb of mRNA-seq derived from 10 different organs, covering 96.9% of the complete orthologs of the lepidopteran core genes. The final assembly and annotation files are available in KAIKObase (https://kaikobase.dna.affrc.go.jp/index.html) along with a genome browser and BLAST searching service, which would facilitate further studies and the breeding of silkworms and other insects.
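The N50 statistic quoted in this abstract can be made concrete with a short Python sketch. It uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length); the function name and the example lengths are illustrative and are not taken from the Nichi01 assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence added is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # integer-safe test for "at least half"
            return length


# Made-up lengths, for illustration only (total 290, half 145).
example = [80, 70, 50, 40, 30, 20]
print(n50(example))  # 80 + 70 = 150 >= 145, so the N50 is 70
```

A higher N50 for the same total length means the assembly is concentrated in fewer, longer sequences, which is why it is reported alongside the gap count and ortholog completeness.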
Affiliation(s)
- Ryusei Waizumi: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Takuya Tsubota: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Akiya Jouraku: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Seigo Kuwazaki: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Kakeru Yokoi: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Tetsuya Iizuka: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Kimiko Yamamoto: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Hideki Sezutsu: Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
|
23
|
Mohamed M, Sabot F, Varoqui M, Mugat B, Audouin K, Pélisson A, Fiston-Lavier AS, Chambeyron S. TrEMOLO: accurate transposable element allele frequency estimation using long-read sequencing data combining assembly and mapping-based approaches. Genome Biol 2023; 24:63. [PMID: 37013657 PMCID: PMC10069131 DOI: 10.1186/s13059-023-02911-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/21/2022] [Accepted: 03/23/2023] [Indexed: 04/05/2023] Open
Abstract
Transposable Element MOnitoring with LOng-reads (TrEMOLO) is a new software tool that combines assembly- and mapping-based approaches to robustly detect genetic elements called transposable elements (TEs). Using high- or low-quality genome assemblies, TrEMOLO can detect most TE insertions and deletions and estimate their allele frequency in populations. Benchmarking with simulated data revealed that TrEMOLO outperforms other state-of-the-art computational tools. TE detection and frequency estimation by TrEMOLO were validated using simulated and experimental datasets. Therefore, TrEMOLO is a comprehensive and suitable tool to accurately study TE dynamics. TrEMOLO is available under GNU GPL3.0 at https://github.com/DrosophilaGenomeEvolution/TrEMOLO.
Affiliation(s)
- Mourdas Mohamed: Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- François Sabot: DIADE, University of Montpellier, CIRAD, IRD, Montpellier, France; IFB - Southgreen Bioversity, CIRAD, INRAE, IRD, Montpellier, France
- Marion Varoqui: Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- Bruno Mugat: Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- Alain Pélisson: Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- Anna-Sophie Fiston-Lavier: ISEM, Université Montpellier, CNRS, IRD, CIRAD, EPHE, Montpellier, France; Institut Universitaire de France (IUF), Paris, France
- Séverine Chambeyron: Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
|
24
|
Theissinger K, Fernandes C, Formenti G, Bista I, Berg PR, Bleidorn C, Bombarely A, Crottini A, Gallo GR, Godoy JA, Jentoft S, Malukiewicz J, Mouton A, Oomen RA, Paez S, Palsbøll PJ, Pampoulie C, Ruiz-López MJ, Secomandi S, Svardal H, Theofanopoulou C, de Vries J, Waldvogel AM, Zhang G, Jarvis ED, Bálint M, Ciofi C, Waterhouse RM, Mazzoni CJ, Höglund J. How genomics can help biodiversity conservation. Trends Genet 2023:S0168-9525(23)00020-3. [PMID: 36801111 DOI: 10.1016/j.tig.2023.01.005] [Citation(s) in RCA: 40] [Impact Index Per Article: 40.0] [Received: 06/17/2022] [Revised: 11/08/2022] [Accepted: 01/19/2023] [Indexed: 02/18/2023]
Abstract
The availability of public genomic resources can greatly assist biodiversity assessment, conservation, and restoration efforts by providing evidence for scientifically informed management decisions. Here we survey the main approaches and applications in biodiversity and conservation genomics, considering practical factors, such as cost, time, prerequisite skills, and current shortcomings of applications. Most approaches perform best in combination with reference genomes from the target species or closely related species. We review case studies to illustrate how reference genomes can facilitate biodiversity research and conservation across the tree of life. We conclude that the time is ripe to view reference genomes as fundamental resources and to integrate their use as a best practice in conservation genomics.
Affiliation(s)
- Kathrin Theissinger: LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
- Carlos Fernandes: CE3C - Centre for Ecology, Evolution and Environmental Changes & CHANGE - Global Change and Sustainability Institute, Departamento de Biologia Animal, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal; Faculdade de Psicologia, Universidade de Lisboa, Alameda da Universidade, 1649-013 Lisboa, Portugal
- Giulio Formenti: The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
- Iliana Bista: Naturalis Biodiversity Center, Darwinweg 2, 2333 CR, Leiden, The Netherlands; Wellcome Sanger Institute, Tree of Life, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Paul R Berg: NIVA - Norwegian Institute for Water Research, Økernveien 94, 0579 Oslo, Norway; Centre for Coastal Research, University of Agder, Gimlemoen 25j, 4630 Kristiansand, Norway; Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, PO BOX 1066 Blindern, 0316 Oslo, Norway
- Christoph Bleidorn: University of Göttingen, Department of Animal Evolution and Biodiversity, Untere Karspüle 2, 37073, Göttingen, Germany
- Angelica Crottini: CIBIO/InBio, Centro de Investigação em Biodiversidade e Recursos Genéticos, Rua Padre Armando Quintas, 7, 4485-661, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4099-002 Porto, Portugal; BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661 Vairão, Portugal
- Guido R Gallo: Department of Biosciences, University of Milan, Milan, Italy
- José A Godoy: Estación Biológica de Doñana, CSIC, Calle Americo Vespucio 26, 41092, Seville, Spain
- Sissel Jentoft: Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, PO BOX 1066 Blindern, 0316 Oslo, Norway
- Joanna Malukiewicz: Primate Genetics Laboratory, German Primate Center, Kellnerweg 4, 37077, Göttingen, Germany
- Alice Mouton: InBios - Conservation Genetics Lab, University of Liege, Chemin de la Vallée 4, 4000, Liege, Belgium
- Rebekah A Oomen: Centre for Coastal Research, University of Agder, Gimlemoen 25j, 4630 Kristiansand, Norway; Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, PO BOX 1066 Blindern, 0316 Oslo, Norway
- Sadye Paez: The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
- Per J Palsbøll: Groningen Institute of Evolutionary Life Sciences, University of Groningen, Nijenborgh, 9747 AG, Groningen, The Netherlands; Center for Coastal Studies, 5 Holway Avenue, Provincetown, MA 02657, USA
- Christophe Pampoulie: Marine and Freshwater Research Institute, Fornubúðir 5, 220 Hafnarfjörður, Iceland
- María J Ruiz-López: Estación Biológica de Doñana, CSIC, Calle Americo Vespucio 26, 41092, Seville, Spain; CIBER de Epidemiología y Salud Pública (CIBERESP), Spain
- Hannes Svardal: Department of Biology, University of Antwerp, Universiteitsplein 1, 2610 Wilrijk, Antwerp, Belgium
- Constantina Theofanopoulou: The Rockefeller University, 1230 York Ave, New York, NY 10065, USA; Hunter College, City University of New York, NY, USA
- Jan de Vries: University of Goettingen, Institute for Microbiology and Genetics, Department of Applied Bioinformatics, Goettingen Center for Molecular Biosciences (GZMB), Campus Institute Data Science (CIDAS), Goldschmidtstr. 1, 37077, Goettingen, Germany
- Ann-Marie Waldvogel: Institute of Zoology, University of Cologne, Zülpicherstrasse 47b, D-50674, Cologne, Germany
- Guojie Zhang: Evolutionary & Organismal Biology Research Center, Zhejiang University School of Medicine, Hangzhou, 310058, China; Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Denmark; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Erich D Jarvis: The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
- Miklós Bálint: LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
- Claudio Ciofi: University of Florence, Department of Biology, Via Madonna del Piano 6, Sesto Fiorentino, (FI) 50019, Italy
- Robert M Waterhouse: University of Lausanne, Department of Ecology and Evolution, Le Biophore, UNIL-Sorge, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Camila J Mazzoni: Leibniz Institute for Zoo and Wildlife Research (IZW), Alfred-Kowalke-Str 17, 10315 Berlin, Germany; Berlin Center for Genomics in Biodiversity Research (BeGenDiv), Koenigin-Luise-Str 6-8, 14195 Berlin, Germany
- Jacob Höglund: Department of Ecology and Genetics, Uppsala University, Norbyvägen 18D, 75246, Uppsala, Sweden
|
25
|
Duchen D, Clipman S, Vergara C, Thio CL, Thomas DL, Duggal P, Wojcik GL. A hepatitis B virus (HBV) sequence variation graph improves sequence alignment and sample-specific consensus sequence construction for genetic analysis of HBV. bioRxiv 2023:2023.01.11.523611. [PMID: 36711598 PMCID: PMC9882026 DOI: 10.1101/2023.01.11.523611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2023]
Abstract
Hepatitis B virus (HBV) remains a global public health concern, with over 250 million individuals living with chronic HBV infection (CHB) and no curative therapy currently available. Viral diversity is associated with CHB pathogenesis and immunological control of infection. Improved methods to characterize the viral genome at both the population and intra-host level could aid drug development efforts. Conventionally, HBV sequencing data are aligned to a linear reference genome and only sequences capable of aligning to the reference are captured for analysis. Reference selection has additional consequences, including sample-specific 'consensus' sequence construction. It remains unclear how to select a reference from available sequences and whether a single reference is sufficient for genetic analyses. Using simulated short-read sequencing data generated from full-length publicly available HBV genome sequences and HBV sequencing data from a longitudinally sampled individual with CHB, we investigate alternative graph-based alignment approaches. We demonstrate that using a phylogenetically representative 'genome graph' for alignment, rather than linear reference sequences, avoids issues of reference ambiguity, improves alignment, and facilitates the construction of sample-specific consensus sequences genetically similar to an individual's infection. Graph-based methods can therefore improve efforts to characterize the genetics of viral pathogens, including HBV, and may have broad implications in host pathogen research.
Affiliation(s)
- Dylan Duchen: Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Steven Clipman: Division of Infectious Diseases, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Candelaria Vergara: Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Chloe L Thio: Division of Infectious Diseases, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- David L Thomas: Division of Infectious Diseases, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Priya Duggal: Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Genevieve L Wojcik: Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
|
26
|
CRISPR/Cas9-Mediated Enrichment Coupled to Nanopore Sequencing Provides a Valuable Tool for the Precise Reconstruction of Large Genomic Target Regions. Int J Mol Sci 2023; 24:ijms24021076. [PMID: 36674592 PMCID: PMC9863143 DOI: 10.3390/ijms24021076] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/30/2022] [Revised: 12/23/2022] [Accepted: 12/24/2022] [Indexed: 01/09/2023] Open
Abstract
Complete and accurate identification of genetic variants associated with specific phenotypes can be challenging when there is a high level of genomic divergence between individuals in a study and the corresponding reference genome. We have applied the Cas9-mediated enrichment coupled to nanopore sequencing to perform a targeted de novo assembly and accurately reconstruct a genomic region of interest. This approach was used to reconstruct a 250-kbp target region on chromosome 5 of the common bean genome (Phaseolus vulgaris) associated with the shattering phenotype. Comparing a non-shattering cultivar (Midas) with the reference genome revealed many single-nucleotide variants and structural variants in this region. We cut five 50-kbp tiled sub-regions of Midas genomic DNA using Cas9, followed by sequencing on a MinION device and de novo assembly, generating a single contig spanning the whole 250-kbp region. This assembly increased the number of Illumina reads mapping to genes in the region, improving their genotypability for downstream analysis. The Cas9 tiling approach for target enrichment and sequencing is a valuable alternative to whole-genome sequencing for the assembly of ultra-long regions of interest, improving the accuracy of downstream genotype-phenotype association analysis.
|
27
|
Eagle SHC, Robertson J, Bastedo DP, Liu K, Nash JHE. Evaluation of five commercial DNA extraction kits using Salmonella as a model for implementation of rapid Nanopore sequencing in routine diagnostic laboratories. Access Microbiol 2023; 5:000468.v3. [PMID: 36910509 PMCID: PMC9996181 DOI: 10.1099/acmi.0.000468.v3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 11/07/2022] [Indexed: 02/23/2023] Open
Abstract
Oxford Nanopore long-read sequencing offers advantages over Illumina short reads for the identification and characterization of bacterial pathogens for outbreak detection and surveillance activities within a diagnostic public health laboratory context. Compared to Illumina, Nanopore is more cost-effective for small batches, has a lower capital cost and has a faster turnaround time, in addition to the ability to assemble complete bacterial genomes. The quantity and quality of DNA required for Nanopore sequencing are greater than for Illumina, and the DNA extraction methods recommended for obtaining high-molecular-weight DNA are different from those typically used in diagnostic laboratories. Using a Salmonella isolate with a previously closed PacBio genome as a model Enterobacteriaceae organism, we evaluated the quantity, quality and fragmentation of five commercial DNA extraction kits. Nanopore sequencing performance was evaluated for the top three methods: Qiagen EZ1 DNA Tissue, Qiagen DNeasy Blood and Tissue, and a modified, in-house version of the MasterPure Complete DNA and RNA purification. To evaluate the effect of post-extraction DNA purification methods, we subjected extracted DNA from the three selected extraction methods to purification by AMPure beads or ethanol precipitation and compared these outputs with untreated DNA as a control. All methods are suitable for routine whole-genome sequencing (WGS), since all 60 replicates had very high genome recovery rates, with ≥98 % of the reference genome covered by mapped Nanopore reads. For 85 % of the replicates, assembly was able to produce a complete, circular chromosome using either Flye or Canu. In most cases, it is recommended to move directly from extraction to sequencing, as untreated DNA had the highest rates of genome closure regardless of extraction method. 
Using our evaluation criteria, the Qiagen DNeasy Blood and Tissue kit was found to be the best overall method due to its low cost, ability to scale from single tubes to 96-well plates, and high consistency in yield and sequencing performance.
Affiliation(s)
- Shannon H C Eagle: National Microbiology Laboratory, Public Health Agency of Canada, Guelph, Ontario, Canada
- James Robertson: National Microbiology Laboratory, Public Health Agency of Canada, Guelph, Ontario, Canada
- D Patrick Bastedo: National Microbiology Laboratory, Public Health Agency of Canada, Toronto, Ontario, Canada
- Kira Liu: Patented Medicine Prices Review Board, Ottawa, Ontario, Canada
- John H E Nash: National Microbiology Laboratory, Public Health Agency of Canada, Toronto, Ontario, Canada
|
28
|
Deng X, Frandsen PB, Dikow RB, Favre A, Shah DN, Shah RDT, Schneider JV, Heckenhauer J, Pauls SU. The impact of sequencing depth and relatedness of the reference genome in population genomic studies: A case study with two caddisfly species (Trichoptera, Rhyacophilidae, Himalopsyche). Ecol Evol 2022; 12:e9583. [PMID: 36523526 PMCID: PMC9745013 DOI: 10.1002/ece3.9583] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/22/2022] [Revised: 11/10/2022] [Accepted: 11/16/2022] [Indexed: 12/15/2022] Open
Abstract
Whole genome sequencing for generating SNP data is increasingly used in population genetic studies. However, obtaining genomes for massive numbers of samples is still not within the budgets of many researchers. It is thus imperative to select an appropriate reference genome and sequencing depth to ensure the accuracy of the results for a specific research question, while balancing cost and feasibility. To evaluate the effect of the choice of the reference genome and sequencing depth on downstream analyses, we used five confamilial reference genomes of variable relatedness and three levels of sequencing depth (3.5×, 7.5× and 12×) in a population genomic study on two caddisfly species: Himalopsyche digitata and H. tibetana. Using these 30 datasets (five reference genomes × three depths × two target species), we estimated population genetic indices (inbreeding coefficient, nucleotide diversity, pairwise F_ST, and genome-wide distribution of F_ST) based on variants and population structure (PCA and admixture) based on genotype likelihood estimates. The results showed that both distantly related reference genomes and lower sequencing depth lead to degradation of resolution. In addition, choosing a more closely related reference genome may significantly remedy the defects caused by low depth. Therefore, we conclude that population genetic studies would benefit from closely related reference genomes, especially as the costs of obtaining a high-quality reference genome continue to decrease. However, to determine a cost-efficient strategy for a specific population genomic study, a trade-off between reference genome relatedness and sequencing depth can be considered.
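The 30-dataset design described in this abstract (five reference genomes × three depths × two target species) is a full factorial grid, which can be enumerated with a short Python sketch. The reference labels `ref_1` to `ref_5` are placeholders, since the abstract does not name the reference genomes; the depths and species are taken from the abstract.

```python
from itertools import product

# Placeholder labels: the abstract does not name the five reference genomes.
REFERENCES = [f"ref_{i}" for i in range(1, 6)]
DEPTHS = [3.5, 7.5, 12.0]  # sequencing depths (x coverage) from the abstract
SPECIES = ["Himalopsyche digitata", "Himalopsyche tibetana"]


def dataset_grid():
    """Enumerate every reference x depth x species combination."""
    return [
        {"reference": ref, "depth": depth, "species": species}
        for ref, depth, species in product(REFERENCES, DEPTHS, SPECIES)
    ]


grid = dataset_grid()
print(len(grid))  # 5 * 3 * 2 = 30
print(grid[0])
```

Each dictionary describes one dataset on which the population genetic indices would be estimated, so effects of reference relatedness and depth can be compared while holding the other factor fixed.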
Affiliation(s)
- Xi-Ling Deng: Senckenberg Research Institute and Natural History Museum, Frankfurt/Main, Germany; Institute of Insect Biotechnology, Justus-Liebig-University Gießen, Gießen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt/Main, Germany
- Paul B. Frandsen: LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt/Main, Germany; Department of Plant & Wildlife Sciences, Brigham Young University, Provo, Utah, USA; Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
- Rebecca B. Dikow: Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
- Adrien Favre: Senckenberg Research Institute and Natural History Museum, Frankfurt/Main, Germany; Regional Nature Park of the Trient Valley, Salvan, Switzerland
- Deep Narayan Shah: Central Department of Environmental Science, Tribhuvan University, Kirtipur, Nepal
- Ram Devi Tachamo Shah: Aquatic Ecology Centre, School of Science, Kathmandu University, Dhulikhel, Nepal; Department of Life Sciences, School of Science, Kathmandu University, Dhulikhel, Nepal
- Julio V. Schneider: Senckenberg Research Institute and Natural History Museum, Frankfurt/Main, Germany
- Jacqueline Heckenhauer: Senckenberg Research Institute and Natural History Museum, Frankfurt/Main, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt/Main, Germany
- Steffen U. Pauls: Senckenberg Research Institute and Natural History Museum, Frankfurt/Main, Germany; Institute of Insect Biotechnology, Justus-Liebig-University Gießen, Gießen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt/Main, Germany
|
29
|
Muñoz-Barrera A, Rubio-Rodríguez LA, Díaz-de Usera A, Jáspez D, Lorenzo-Salazar JM, González-Montelongo R, García-Olivares V, Flores C. From Samples to Germline and Somatic Sequence Variation: A Focus on Next-Generation Sequencing in Melanoma Research. Life (Basel) 2022; 12:1939. [PMID: 36431075 PMCID: PMC9695713 DOI: 10.3390/life12111939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 11/12/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022] Open
Abstract
Next-generation sequencing (NGS) applications have flourished in the last decade, permitting the identification of cancer driver genes and profoundly expanding the possibilities of genomic studies of cancer, including melanoma. Here we aimed to present a technical review across many of the methodological approaches brought by the use of NGS applications with a focus on assessing germline and somatic sequence variation. We provide cautionary notes and discuss key technical details involved in library preparation, the most common problems with the samples, and guidance to circumvent them. We also provide an overview of the sequence-based methods for cancer genomics, exposing the pros and cons of targeted sequencing vs. exome or whole-genome sequencing (WGS), the fundamentals of the most common commercial platforms, and a comparison of throughputs and key applications. Details of the steps and the main software involved in the bioinformatics processing of the sequencing results, from preprocessing to variant prioritization and filtering, are also provided in the context of the full spectrum of genetic variation (SNVs, indels, CNVs, structural variation, and gene fusions). Finally, we put the emphasis on selected bioinformatic pipelines behind (a) short-read WGS identification of small germline and somatic variants, (b) detection of gene fusions from transcriptomes, and (c) de novo assembly of genomes from long-read WGS data. Overall, we provide comprehensive guidance across the main methodological procedures involved in obtaining sequencing results for the most common short- and long-read NGS platforms, highlighting key applications in melanoma research.
Affiliation(s)
- Adrián Muñoz-Barrera
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Luis A. Rubio-Rodríguez
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Ana Díaz-de Usera
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- David Jáspez
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- José M. Lorenzo-Salazar
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Rafaela González-Montelongo
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Víctor García-Olivares
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Carlos Flores
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
  - CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain
30
Marcet-Houben M, Alvarado M, Ksiezopolska E, Saus E, de Groot PWJ, Gabaldón T. Chromosome-level assemblies from diverse clades reveal limited structural and gene content variation in the genome of Candida glabrata. BMC Biol 2022; 20:226. [PMID: 36209154 PMCID: PMC9548116 DOI: 10.1186/s12915-022-01412-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/01/2022] [Accepted: 09/20/2022] [Indexed: 11/30/2022]
Abstract
Background: Candida glabrata is an opportunistic yeast pathogen thought to have a large genetic and phenotypic diversity and a highly plastic genome. However, the lack of chromosome-level genome assemblies representing this diversity limits our ability to accurately establish how chromosomal structure and gene content vary across strains. Results: Here, we expanded publicly available assemblies by using long-read sequencing technologies in twelve diverse strains, obtaining a final set of twenty-one chromosome-level genomes spanning the known C. glabrata diversity. Using comparative approaches, we inferred variation in chromosome structure and determined the pan-genome, including an analysis of the adhesin gene repertoire. Our analysis uncovered four new adhesin orthogroups and inferred a rich ancestral adhesion repertoire, which was subsequently shaped through a still ongoing process of gene loss, gene duplication, and gene conversion. Conclusions: C. glabrata has a largely stable pan-genome except for a highly variable subset of genes encoding cell wall-associated functions. Adhesin repertoire was established for each strain and showed variability among clades. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01412-1.
Affiliation(s)
- Marina Marcet-Houben
  - Barcelona Supercomputing Centre (BSC-CNS), Plaça Eusebi Güell, 1-3, 08034, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028, Barcelona, Spain
- María Alvarado
  - Regional Center for Biomedical Research, University of Castilla-La Mancha, E-02008, Albacete, Spain
- Ewa Ksiezopolska
  - Barcelona Supercomputing Centre (BSC-CNS), Plaça Eusebi Güell, 1-3, 08034, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028, Barcelona, Spain
- Ester Saus
  - Barcelona Supercomputing Centre (BSC-CNS), Plaça Eusebi Güell, 1-3, 08034, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028, Barcelona, Spain
- Piet W J de Groot
  - Regional Center for Biomedical Research, University of Castilla-La Mancha, E-02008, Albacete, Spain
  - Castilla-La Mancha Science & Technology Park, E-02006, Albacete, Spain
- Toni Gabaldón
  - Barcelona Supercomputing Centre (BSC-CNS), Plaça Eusebi Güell, 1-3, 08034, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028, Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
  - Centro Investigación Biomédica En Red de Enfermedades Infecciosas, Barcelona, Spain
31
Lagos AC, Sundqvist M, Dyrkell F, Stegger M, Söderquist B, Mölling P. Evaluation of within-host evolution of methicillin-resistant Staphylococcus aureus (MRSA) by comparing cgMLST and SNP analysis approaches. Sci Rep 2022; 12:10541. [PMID: 35732699 PMCID: PMC9214674 DOI: 10.1038/s41598-022-14640-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/01/2022] [Accepted: 06/09/2022] [Indexed: 11/17/2022]
Abstract
Whole genome sequencing (WGS) of methicillin-resistant Staphylococcus aureus (MRSA) provides high-resolution typing, facilitating surveillance and outbreak investigations. The aim of this study was to evaluate the genomic variation rate in MRSA by comparing commonly used core genome multilocus sequence typing (cgMLST) against single nucleotide polymorphism (SNP) analyses. WGS was performed on 95 MRSA isolates collected from 20 carriers during 2003–2019. To assess variation and method-related differences, two different cgMLST schemes were obtained using Ridom SeqSphere+ and the cloud-based 1928 platform. In addition, two SNP methods, the 1928 platform and the Northern Arizona SNP Pipeline (NASP), were used. cgMLST using Ridom SeqSphere+ and 1928 showed a median of 5.0 and 2.0 allele variants/year, respectively. In the SNP analysis, performed with the two reference genomes COL and Newman, 1928 showed a median of 13 and 24 SNPs (including presumed recombination) and 3.8 and 4.0 SNPs (without recombination) per individual/year, respectively. Correspondingly, NASP showed a median of 5.5 and 5.8 SNPs per individual/year. In conclusion, an estimated genomic variation rate of 2.0–5.8 genetic events per year (without recombination) is suggested as a general guideline for clinical laboratories conducting surveillance and outbreak investigations, independent of the analysis approach used.
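The per-individual annual rates quoted in this abstract can be pictured with a small calculation. The following is only a sketch: the abstract does not describe the exact computation, so the function name `events_per_year`, the input layout, and all numbers in `carriers` are hypothetical and are not data from the study.

```python
from statistics import median


def events_per_year(differences: float, years_between: float) -> float:
    """Genetic events (SNPs or allele variants) per year for one carrier.

    Assumption: the rate is the number of differences between the first and
    last isolate divided by the sampling interval in years.
    """
    if years_between <= 0:
        raise ValueError("sampling interval must be positive")
    return differences / years_between


# Hypothetical carriers: (differences between isolates, years between isolates)
carriers = [(12, 3.0), (4, 1.0), (22, 4.0), (5, 2.5)]

# One rate per carrier, then the median across carriers
rates = [events_per_year(d, y) for d, y in carriers]
median_rate = median(rates)
print(rates, median_rate)
```

Summarising with the median rather than the mean limits the influence of a single carrier with an unusually long or short sampling interval.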
Affiliation(s)
- Amaya Campillay Lagos
  - Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
- Martin Sundqvist
  - Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
- Marc Stegger
  - Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
  - Department of Bacteria, Parasites and Fungi, Statens Serum Institut, Copenhagen, Denmark
- Bo Söderquist
  - Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
- Paula Mölling
  - Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
32
Zheng D, Zhang W. Characterization of Expression and Epigenetic Features of Core Genes in Common Wheat. Genes (Basel) 2022; 13:genes13071112. [PMID: 35885895 PMCID: PMC9317296 DOI: 10.3390/genes13071112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Revised: 06/16/2022] [Accepted: 06/20/2022] [Indexed: 12/10/2022]
Abstract
The availability of multiple wheat genome sequences enables us to identify core genes and characterize their genetic and epigenetic features, thereby advancing our understanding of their biological implications within individual plant species. Such core genes, however, remain largely understudied in wheat. To this end, we reanalyzed genome sequences from 16 different wheat varieties and identified 62,299 core genes. We found that core and non-core genes have different roles in subgenome differentiation. Meanwhile, according to their expression profiles, these core genes can be classified into genes related to tissue development and stress responses, including 3376 genes highly expressed both in spikelets and at high temperatures. After associating the core genes with six histone marks and open chromatin, we found that they can be divided into eight sub-clusters with distinct epigenomic features. Furthermore, we found that ca. 51% of the expressed transcription factors (TFs) were marked with both H3K27me3 and H3K4me3, indicative of bivalency, which can be involved in tissue development through the TF-centered regulatory network. Thus, our study provides a valuable resource for the functional characterization of core genes in stress responses and tissue development in wheat.
33
Formenti G, Theissinger K, Fernandes C, Bista I, Bombarely A, Bleidorn C, Ciofi C, Crottini A, Godoy JA, Höglund J, Malukiewicz J, Mouton A, Oomen RA, Paez S, Palsbøll PJ, Pampoulie C, Ruiz-López MJ, Svardal H, Theofanopoulou C, de Vries J, Waldvogel AM, Zhang G, Mazzoni CJ, Jarvis ED, Bálint M. The era of reference genomes in conservation genomics. Trends Ecol Evol 2022; 37:197-202. [PMID: 35086739 DOI: 10.1016/j.tree.2021.11.008] [Citation(s) in RCA: 90] [Impact Index Per Article: 45.0] [Received: 04/25/2021] [Revised: 11/10/2021] [Accepted: 11/16/2021] [Indexed: 02/08/2023]
Abstract
Progress in genome sequencing now enables the large-scale generation of reference genomes. Various international initiatives aim to generate reference genomes representing global biodiversity. These genomes provide unique insights into genomic diversity and architecture, thereby enabling comprehensive analyses of population and functional genomics, and are expected to revolutionize conservation genomics.
Affiliation(s)
- Giulio Formenti
  - The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
- Kathrin Theissinger
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
  - University of Koblenz-Landau, Institute for Environmental Sciences, Fortstrasse 7, 76829 Landau, Germany
  - Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
- Carlos Fernandes
  - CE3C - Centre for Ecology, Evolution and Environmental Changes, Departamento de Biologia Animal, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - Faculdade de Psicologia, Universidade de Lisboa, Alameda da Universidade, 1649-013 Lisboa, Portugal
- Iliana Bista
  - University of Cambridge, Department of Genetics, Cambridge CB2 3EH, UK
  - Wellcome Sanger Institute, CB10 1SA, Hinxton, UK
- Christoph Bleidorn
  - University of Göttingen, Department of Animal Evolution and Biodiversity, Untere Karspüle, 2, 37073, Germany
- Claudio Ciofi
  - University of Florence, Department of Biology, Via Madonna del Piano 6, Sesto Fiorentino (FI) 50019, Italy
- Angelica Crottini
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
- José A Godoy
  - Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas, Av. Américo Vespucio, 26, 41092, Spain
- Jacob Höglund
  - Dept. of Ecology and Genetics, Uppsala University, Norbyvägen 18D, 75246, Sweden
- Alice Mouton
  - InBios - Conservation Genetics Lab, University of Liege, Chemin de la Vallée 4, 4000, Belgium
- Rebekah A Oomen
  - Centre for Ecological and Evolutionary Synthesis, University of Oslo, Blindernveien 31, 0371 Oslo, Norway
  - Centre for Coastal Research, University of Agder, Gimlemoen 25j, 4630 Kristiansand, Norway
- Sadye Paez
  - The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
- Per J Palsbøll
  - Groningen Institute of Evolutionary Life Sciences University of Groningen Nijenborgh, 9747, AG, Groningen, the Netherlands
  - Center for Coastal Studies, 5 Holway Avenue, Provincetown, MA 02657, USA
- Christophe Pampoulie
  - Marine and Freshwater Research Institute, Fornubúðir, 5, 220 Hanafjörður, Iceland
- María J Ruiz-López
  - Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas, Av. Américo Vespucio, 26, 41092, Spain
- Hannes Svardal
  - Department of Biology, University of Antwerp, Groenenborgerlaan 171, 2020, Belgium
- Jan de Vries
  - University of Göttingen, Institute for Microbiology and Genetics, Dept. of Applied Bioinformatics, Goettingen Center for Molecular Biosciences (GZMB), Campus Institute Data Science (CIDAS), Goldschmidtstr. 1, 37077, Germany
- Ann-Marie Waldvogel
  - Institute of Zoology, University of Cologne, Zülpicherstrasse 47b, D-50674, Germany
- Guojie Zhang
  - Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Denmark, Build 3, Universitetsparken 15, Copenhagen 2100, Denmark
  - China National Genebank, BGI-Shenzhen, Jinsha Road, Dapeng District, Shenzhen 518083, China
- Camila J Mazzoni
  - Leibniz Institute for Zoo and Wildlife Research (IZW), Alfred-Kowalke-Str 17, 10315 Berlin, Germany
- Erich D Jarvis
  - The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
- Miklós Bálint
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
  - Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
  - Institute for Insect Biotechnology, Justus-Liebig University Gießen, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany
34
Pust MM, Tümmler B. Bacterial low-abundant taxa are key determinants of a healthy airway metagenome in the early years of human life. Comput Struct Biotechnol J 2021; 20:175-186. [PMID: 35024091 PMCID: PMC8713036 DOI: 10.1016/j.csbj.2021.12.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/11/2021] [Revised: 12/06/2021] [Accepted: 12/06/2021] [Indexed: 11/17/2022]
Abstract
The default removal of low-abundance (rare) taxa from microbial community analyses may lead to an incomplete picture of the taxonomic and functional microbial potential within the human habitat. Publicly available shotgun metagenomics data of healthy children and children with cystic fibrosis (CF) were reanalysed to study the development of the rare species biosphere, which was here defined by either the 15th, 25th or 35th species abundance percentile. We found that healthy children contained an age-independent network of abundant (core) and rare species, with both entities being essential in maintaining the network structure. The protein sequence usage for more than 100 bacterial metabolic pathways differed between the core and rare species biosphere. In CF children, the background structure was underdeveloped, and random forest bootstrapping based on all constituents of the early airway metagenome and host-associated factors indicated that rare taxa were the most important variables in deciding whether a child was healthy or suffered from the life-limiting CF disease. Attempts to make the age-independent CF network as robust as the healthy structure failed when an increasing number of bacterial taxa from the healthy network was incorporated into the CF structure by computer-based model simulations. However, the transfer of a key combination of taxa with high species diversity and low species dominance from the healthy to the CF network structure correlated with a more robust CF network and a topological approximation of CF and healthy graph structures. Rothia mucilaginosa, Streptococci and rare species were essential in improving the underdeveloped CF network.
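The percentile-based definition of the rare species biosphere mentioned in this abstract can be illustrated as follows. This is a sketch under stated assumptions, not the authors' pipeline: the nearest-rank percentile rule, the "at or below the cutoff counts as rare" convention, the helper names, and the species labels and abundances are all hypothetical.

```python
import math


def percentile_cutoff(values, pct):
    """Nearest-rank percentile of a collection of abundances (assumed rule)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def split_rare_core(abundances, pct):
    """Split species into rare (at or below the cutoff) and core (above it)."""
    cutoff = percentile_cutoff(abundances.values(), pct)
    rare = {name for name, value in abundances.items() if value <= cutoff}
    core = set(abundances) - rare
    return rare, core


# Hypothetical relative abundances for eight placeholder species
abundances = {
    "sp_a": 0.40, "sp_b": 0.25, "sp_c": 0.15, "sp_d": 0.10,
    "sp_e": 0.05, "sp_f": 0.03, "sp_g": 0.015, "sp_h": 0.005,
}

# Compare the three thresholds named in the abstract
for pct in (15, 25, 35):
    rare, core = split_rare_core(abundances, pct)
    print(pct, sorted(rare), sorted(core))
```

Running the split at several thresholds, as in the loop, shows how sensitive the rare/core boundary is to the chosen percentile.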
Affiliation(s)
- Marie-Madlen Pust
  - Department of Paediatric Pneumology, Allergology, and Neonatology, Hannover Medical School (MHH), Germany
  - Biomedical Research in Endstage and Obstructive Lung Disease Hannover (BREATH), German Center for Lung Research, Hannover Medical School, Germany
- Burkhard Tümmler
  - Department of Paediatric Pneumology, Allergology, and Neonatology, Hannover Medical School (MHH), Germany
  - Biomedical Research in Endstage and Obstructive Lung Disease Hannover (BREATH), German Center for Lung Research, Hannover Medical School, Germany
35
Pla-Díaz M, Sánchez-Busó L, Giacani L, Šmajs D, Bosshard PP, Bagheri HC, Schuenemann VJ, Nieselt K, Arora N, González-Candelas F. Evolutionary processes in the emergence and recent spread of the syphilis agent, Treponema pallidum. Mol Biol Evol 2021; 39:6427636. [PMID: 34791386 PMCID: PMC8789261 DOI: 10.1093/molbev/msab318] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/26/2022]
Abstract
The incidence of syphilis has risen worldwide in the last decade even though it is an easily treated infection. The causative agent of this sexually transmitted disease is the bacterium Treponema pallidum subspecies pallidum (TPA), very closely related to subsp. pertenue (TPE) and endemicum (TEN), responsible for the human treponematoses yaws and bejel, respectively. Although much focus has been placed on the question of the spatial and temporal origins of TPA, the processes driving the evolution and epidemiological spread of TPA since its divergence from TPE and TEN are not well understood. Here, we investigate the effects of recombination and selection as forces of genetic diversity and differentiation acting during the evolution of T. pallidum subspecies. Using a custom-tailored procedure, named the phylogenetic incongruence method, with 75 complete genome sequences, we found strong evidence for recombination among the T. pallidum subspecies, involving 12 genes and 21 events. In most cases, only one recombination event per gene was detected, and all but one event corresponded to intersubspecies transfers from TPE/TEN to TPA. We found a clear signal of natural selection acting on the recombinant genes, which is more intense in their recombinant regions. The phylogenetic location of the recombination events detected and the functional role of the genes with signals of positive selection suggest that these evolutionary processes had a key role in the evolution and recent expansion of the syphilis bacteria, with significant implications for the selection of vaccine candidates and the design of a broadly protective syphilis vaccine.
Affiliation(s)
- Marta Pla-Díaz
  - Unidad Mixta Infección y Salud Pública FISABIO/Universidad de Valencia-I2SysBio, Spain
  - CIBER in Epidemiology and Public Health, Spain
- Leonor Sánchez-Busó
  - Genomics and Health Area, Foundation for the Promotion of Health and Biomedical Research in the Valencian Community (FISABIO-Public Health), Valencia, Spain
- Lorenzo Giacani
  - Department of Medicine, Division of Allergy and Infectious Diseases, and Department of Global Health, University of Washington, Seattle, WA, USA
- David Šmajs
  - Department of Biology, Faculty of Medicine, Masaryk University, Czech Republic
- Philipp P Bosshard
  - Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- Kay Nieselt
  - Center for Bioinformatics, University of Tübingen, Germany
- Natasha Arora
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Switzerland
  - Zurich Institute of Forensic Medicine, University of Zurich, Switzerland
- Fernando González-Candelas
  - Unidad Mixta Infección y Salud Pública FISABIO/Universidad de Valencia-I2SysBio, Spain
  - CIBER in Epidemiology and Public Health, Spain
  - Genomics and Health Area, Foundation for the Promotion of Health and Biomedical Research in the Valencian Community (FISABIO-Public Health), Valencia, Spain
36
Abstract
PURPOSE OF REVIEW: The advancement of molecular techniques such as whole-genome sequencing (WGS) has revolutionized the field of bacterial strain typing, with important implications for epidemiological surveillance and outbreak investigations. This review summarizes state-of-the-art techniques in strain typing and examines barriers faced by clinical and public health laboratories in implementing these new methodologies. RECENT FINDINGS: WGS-based methodologies are on track to become the new 'gold standards' in bacterial strain typing, replacing traditional methods like pulsed-field gel electrophoresis and multilocus sequence typing. These new techniques have an improved ability to identify genetic relationships among organisms of interest. Further, advances in long-read sequencing approaches will likely provide a highly discriminatory tool to perform pangenome analyses and characterize relevant accessory genome elements, including mobile genetic elements carrying antibiotic resistance determinants in real time. Barriers to widespread integration of these approaches include a lack of standardized workflows and technical training. SUMMARY: Genomic bacterial strain typing has facilitated a paradigm shift in clinical and molecular epidemiology. The increased resolution that these new techniques provide, along with epidemiological data, will facilitate the rapid identification of transmission routes with high confidence, leading to timely and effective deployment of infection control and public health interventions in outbreak settings.
37
Singh R, Kusalik A, Dillon JAR. Bioinformatics tools used for whole-genome sequencing analysis of Neisseria gonorrhoeae: a literature review. Brief Funct Genomics 2021; 21:78-89. [PMID: 34170311 DOI: 10.1093/bfgp/elab028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/15/2020] [Revised: 05/21/2021] [Accepted: 05/24/2021] [Indexed: 01/02/2023]
Abstract
Whole-genome sequencing (WGS) data are well established for the investigation of gonococcal transmission, antimicrobial resistance prediction, population structure determination and population dynamics. A variety of bioinformatics tools, repositories, services and platforms have been applied to manage and analyze Neisseria gonorrhoeae WGS datasets. This review provides an overview of the various bioinformatics approaches and resources used in 105 published studies (as of 30 April 2021). The challenges in the analysis of N. gonorrhoeae WGS datasets, as well as future bioinformatics requirements, are also discussed.
Affiliation(s)
- Reema Singh
  - Department of Biochemistry, Microbiology and Immunology
- Anthony Kusalik
  - Department of Computer Science at the University of Saskatchewan
- Jo-Anne R Dillon
  - Department of Biochemistry Microbiology and Immunology, College of Medicine, c/o Vaccine and Infectious Disease Organization, University of Saskatchewan, 120 Veterinary Road, Saskatoon, Saskatchewan S7N5E3, Canada
38
Carroll LM, Cheng RA, Wiedmann M, Kovac J. Keeping up with the Bacillus cereus group: taxonomy through the genomics era and beyond. Crit Rev Food Sci Nutr 2021; 62:7677-7702. [PMID: 33939559 DOI: 10.1080/10408398.2021.1916735] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Indexed: 12/15/2022]
Abstract
The Bacillus cereus group, also known as B. cereus sensu lato (s.l.), is a species complex that contains numerous closely related lineages, which vary in their ability to cause illness in humans and animals. The classification of B. cereus s.l. isolates into species-level taxonomic units is thus essential for informing public health and food safety efforts. However, taxonomic classification of these organisms is challenging. Numerous, often conflicting, taxonomic changes to the group have been proposed over the past two decades, making it difficult to remain up to date. In this review, we discuss the major nomenclatural changes that have accumulated in the B. cereus s.l. taxonomic space prior to 2020, particularly in the genomic sequencing era, and outline the resulting problems. We discuss several contemporary taxonomic frameworks as applied to B. cereus s.l., including (i) phenotypic, (ii) genomic, and (iii) hybrid nomenclatural frameworks, and we discuss the advantages and disadvantages of each. We offer suggestions as to how readers can avoid B. cereus s.l. taxonomic ambiguities, regardless of the nomenclatural framework(s) they choose to employ. Finally, we discuss future directions and open problems in the B. cereus s.l. taxonomic realm, including those that cannot be solved by genomic approaches alone.
Affiliation(s)
- Laura M Carroll
  - Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
- Rachel A Cheng
  - Department of Food Science, Cornell University, Ithaca, New York, USA
- Martin Wiedmann
  - Department of Food Science, Cornell University, Ithaca, New York, USA
- Jasna Kovac
  - Department of Food Science, The Pennsylvania State University, University Park, Pennsylvania, USA