1
An improved assembly and annotation of the melon (Cucumis melo L.) reference genome. Sci Rep 2018; 8:8088. [PMID: 29795526 PMCID: PMC5967340 DOI: 10.1038/s41598-018-26416-2] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 03/05/2018] [Accepted: 05/09/2018] [Indexed: 12/20/2022]
Abstract
We report an improved assembly (v3.6.1) of the melon (Cucumis melo L.) genome and a new genome annotation (v4.0). Optical mapping allowed us to correct the order and orientation of 21 previous scaffolds and to define the gap sizes along the 12 pseudomolecules correctly. A new comprehensive annotation was also built to update the previous annotation, v3.5.1, released more than six years ago. Using an integrative annotation pipeline based on exhaustive RNA-Seq collections and ad hoc transposable element annotation, we identified 29,980 protein-coding loci. Compared to the previous version, the v4.0 annotation improved gene models in terms of completeness of gene structure, UTR definition, and intron-exon junctions, and reduced the number of fragmented genes. More than 8,000 new genes were identified, one third of them well supported by RNA-Seq data. To make all the new resources easily exploitable and fully available to the scientific community, a redesigned Melonomics genomic platform was released at http://melonomics.net. The resources produced in this work considerably increase the reliability of the melon genome assembly and the resolution of the gene models, paving the way for further studies in melon and related species.
2
Mullins KE, Hang J, Clifford RJ, Onmus-Leone F, Yang Y, Jiang J, Leguia M, Kasper MR, Maguina C, Lesho EP, Jarman RG, Richards A, Blazes D. Whole-Genome Analysis of Bartonella ancashensis, a Novel Pathogen Causing Verruga Peruana, Rural Ancash Region, Peru. Emerg Infect Dis 2018; 23:430-438. [PMID: 28221130 PMCID: PMC5382735 DOI: 10.3201/eid2303.161476] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/19/2022]
Abstract
The genus Bartonella contains >40 species, and an increasing number of these species are being implicated in human disease. One such pathogen is Bartonella ancashensis, which was isolated from blood samples from 2 patients living in Caraz, Peru, during a clinical trial of treatment for bartonellosis. Three B. ancashensis strains were analyzed by using whole-genome restriction mapping and high-throughput pyrosequencing. Genome-wide comparative analysis of Bartonella species showed that B. ancashensis has features seen in modern and ancient lineages of Bartonella species and is most closely related to B. bacilliformis. The divergence between B. ancashensis and B. bacilliformis is much greater than that seen between known Bartonella genetic lineages. In addition, B. ancashensis contains type IV secretion system proteins, which are not present in B. bacilliformis. Whole-genome analysis indicates that B. ancashensis might represent a distinct Bartonella lineage phylogenetically related to B. bacilliformis.
3
Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome. Nat Genet 2017; 49:643-650. [PMID: 28263316 DOI: 10.1038/ng.3802] [Citation(s) in RCA: 386] [Impact Index Per Article: 55.1] [Received: 08/02/2016] [Accepted: 02/03/2017] [Indexed: 12/30/2022]
Abstract
The decrease in sequencing cost and the increased sophistication of assembly algorithms for short-read platforms have resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate the current state of the art for de novo assembly using the domestic goat (Capra hircus), based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced what is, to our knowledge, the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ∼400-fold improvement in continuity due to properly assembled gaps, compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex yet produced for an individual of a ruminant species.
4
Chapleau RR, Baldwin JC. Optical Whole-Genome Restriction Mapping as a Tool for Rapidly Distinguishing and Identifying Bacterial Contaminants in Clinical Samples. J Clin Diagn Res 2015; 9:DC24-7. [PMID: 26435946 DOI: 10.7860/jcdr/2015/13983.6408] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/13/2015] [Accepted: 06/09/2015] [Indexed: 12/11/2022]
Abstract
INTRODUCTION: Optical restriction genome mapping is a technology in which a genome is linearized on a surface and digested with specific restriction enzymes, giving an arrangement of the genome with gaps whose order and size are unique for a given organism. Current applications of this technology include assisting with the correct scaffolding and ordering of genomes in conjunction with whole-genome sequencing, observation of genetic drift and evolution using comparative genomics, and epidemiological monitoring of the spread of infections. Here, we investigated the suitability of genome mapping for use in clinical labs as a potential diagnostic tool. MATERIALS AND METHODS: Using whole genome mapping, we investigated the basic performance of the technology for identifying two bacteria of interest for food safety (Lactobacillus spp. and enterohemorrhagic Escherichia coli). We further evaluated the performance for identifying multiple organisms from both simple and complex mixtures. RESULTS: We successfully generated optical restriction maps of four Lactobacillus species as well as a strain of enterohemorrhagic Escherichia coli from within a mixed solution, each distinguished using a common compatible restriction enzyme. Finally, we demonstrated that optical restriction maps were successfully obtained and the correct organism identified within a clinical matrix. CONCLUSION: With additional development, whole genome mapping may be a useful clinical tool for rapid in vitro diagnostics.
Affiliation(s)
- Richard R Chapleau
  - Applied Technology and Genomics Center, United States Air Force School of Aerospace Medicine, 711th Human Performance Wing, Air Force Research Laboratory, Wright-Patterson AFB, OH
- James C Baldwin
  - Applied Technology and Genomics Center, United States Air Force School of Aerospace Medicine, 711th Human Performance Wing, Air Force Research Laboratory, Wright-Patterson AFB, OH
5
Gaviria-Agudelo C, Aroh C, Tareen N, Wakeland EK, Kim M, Copley LA. Genomic Heterogeneity of Methicillin Resistant Staphylococcus aureus Associated with Variation in Severity of Illness among Children with Acute Hematogenous Osteomyelitis. PLoS One 2015; 10:e0130415. [PMID: 26086671 PMCID: PMC4473274 DOI: 10.1371/journal.pone.0130415] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/03/2015] [Accepted: 05/20/2015] [Indexed: 11/23/2022]
Abstract
Introduction: The association between severity of illness of children with osteomyelitis caused by methicillin-resistant Staphylococcus aureus (MRSA) and genomic variation of the causative organism has not been previously investigated. The purpose of this study is to assess genomic heterogeneity among MRSA isolates from children with osteomyelitis who have diverse severity of illness. Materials and Methods: Children with osteomyelitis were prospectively studied between 2010 and 2011. Severity of illness of the affected children was determined from clinical and laboratory parameters. MRSA isolates were analyzed with next generation sequencing (NGS) and optical mapping. Sequence data were used for multi-locus sequence typing (MLST), phylogenetic analysis by maximum likelihood (PAML), and identification of virulence genes and single nucleotide polymorphisms (SNP) relative to reference strains. Results: The twelve children studied demonstrated severity of illness scores ranging from 0 (mild) to 9 (severe). All isolates were USA300, ST 8, SCC mec IVa MRSA by MLST. The isolates differed from reference strains by 2 insertions (40 Kb each) and 2 deletions (10 and 25 Kb) but had no rearrangements or copy number variations. There was a higher occurrence of virulence genes among study isolates when compared to the reference strains (p = 0.0124). There was an average of 11 nonsynonymous SNPs per strain. PAML demonstrated heterogeneity of study isolates from each other and from the reference strains. Discussion: Genomic heterogeneity exists among MRSA isolates causing osteomyelitis among children in a single community. These variations may play a role in the pathogenesis of variation in clinical severity among these children.
Affiliation(s)
- Claudia Gaviria-Agudelo
  - Department of Pediatric Infectious Disease, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Chukwuemika Aroh
  - Department of Immunology, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Naureen Tareen
  - Children’s Medical Center, Dallas, Texas, United States of America
- Edward K. Wakeland
  - Department of Immunology, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- MinSoo Kim
  - Department of Biomedical Informatics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Lawson A. Copley (corresponding author)
  - Children’s Medical Center, Dallas, Texas, United States of America
  - Texas Scottish Rite Hospital for Children, Dallas, Texas, United States of America
  - Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
6
Jex AR, Koehler AV, Ansell BR, Baker L, Karunajeewa H, Gasser RB. Getting to the guts of the matter: The status and potential of ‘omics’ research of parasitic protists of the human gastrointestinal system. Int J Parasitol 2013; 43:971-82. [DOI: 10.1016/j.ijpara.2013.06.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 05/03/2013] [Revised: 06/07/2013] [Accepted: 06/07/2013] [Indexed: 11/17/2022]
7
Whole genome mapping and re-organization of the nuclear and mitochondrial genomes of Babesia microti isolates. PLoS One 2013; 8:e72657. [PMID: 24023759 PMCID: PMC3762879 DOI: 10.1371/journal.pone.0072657] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 05/01/2013] [Accepted: 07/12/2013] [Indexed: 11/19/2022]
Abstract
Babesia microti is the primary causative agent of human babesiosis, an emerging pathogen that causes a malaria-like illness with possible fatal outcome in immunocompromised patients. The genome sequence of the B. microti R1 strain was reported in 2012 and revealed a distinct evolutionary path for this pathogen relative to that of other apicomplexa. Lacking from the first genome assembly and initial molecular analyses was information about the terminal ends of each chromosome, and both the exact number of chromosomes in the nuclear genome and the organization of the mitochondrial genome remained ambiguous. We have now performed various molecular analyses to characterize the nuclear and mitochondrial genomes of the B. microti R1 and Gray strains and generated high-resolution Whole Genome maps. These analyses show that the genome of B. microti consists of four nuclear chromosomes and a linear mitochondrial genome present in four different structural types. Furthermore, Whole Genome mapping allowed resolution of the chromosomal ends, identification of areas of misassembly in the R1 genome, and genomic differences between the R1 and Gray strains, which occur primarily in the telomeric regions. These studies set the stage for a better understanding of the evolution and diversity of this important human pathogen.
8
Bosch T, Verkade E, van Luit M, Pot B, Vauterin P, Burggrave R, Savelkoul P, Kluytmans J, Schouls L. High Resolution Typing by Whole Genome Mapping Enables Discrimination of LA-MRSA (CC398) Strains and Identification of Transmission Events. PLoS One 2013; 8:e66493. [PMID: 23805225 PMCID: PMC3689830 DOI: 10.1371/journal.pone.0066493] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 03/28/2013] [Accepted: 05/06/2013] [Indexed: 11/19/2022]
Abstract
After its emergence in 2003, a livestock-associated (LA-)MRSA clade (CC398) has caused an impressive increase in the number of isolates submitted for the Dutch national MRSA surveillance and now comprises 40% of all isolates. The currently used molecular typing techniques have limited discriminatory power for this MRSA clade, which hampers studies on the origin and transmission routes. Recently, a new molecular analysis technique named whole genome mapping was introduced. This method creates high-resolution, ordered whole genome restriction maps that may have potential for strain typing. In this study, we assessed and validated the capability of whole genome mapping to differentiate LA-MRSA isolates. Multiple validation experiments showed that whole genome mapping produced highly reproducible results. Assessment of the technique on two well-documented MRSA outbreaks showed that whole genome mapping was able to confirm one outbreak, but revealed major differences between the maps of the second, indicating that not all isolates belonged to this outbreak. Whole genome mapping of LA-MRSA isolates that were epidemiologically unlinked provided a much higher discriminatory power than spa-typing or MLVA. In contrast, maps created from LA-MRSA isolates obtained during a proven LA-MRSA outbreak were nearly indistinguishable, showing that transmission of LA-MRSA can be detected by whole genome mapping. Finally, whole genome maps of LA-MRSA isolates originating from two unrelated veterinarians and their household members showed that veterinarians may carry and transmit different LA-MRSA strains at the same time. No such conclusions could be drawn based on spa-typing and MLVA. Although PFGE seems to be suitable for molecular typing of LA-MRSA, WGM provides a much higher discriminatory power. Furthermore, whole genome mapping can provide a comparison with other maps within 2 days after the bacterial culture is received, making it suitable to investigate transmission events and outbreaks caused by LA-MRSA.
Affiliation(s)
- Thijs Bosch (corresponding author)
  - Laboratory for Infectious Diseases and Screening, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Erwin Verkade
  - Laboratory for Microbiology and Infection Control, Amphia Hospital, Breda, The Netherlands
  - Laboratory for Medical Microbiology and Immunology, St. Elisabeth Hospital, Tilburg, The Netherlands
- Martijn van Luit
  - Laboratory for Infectious Diseases and Screening, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Bruno Pot
  - Applied Maths, Sint-Martens-Latem, Belgium
- Paul Savelkoul
  - Department of Medical Microbiology, Academic Hospital Maastricht, Maastricht, The Netherlands
  - Department of Medical Microbiology, VU University Medical Center, Amsterdam, The Netherlands
- Jan Kluytmans
  - Laboratory for Microbiology and Infection Control, Amphia Hospital, Breda, The Netherlands
  - Laboratory for Medical Microbiology and Immunology, St. Elisabeth Hospital, Tilburg, The Netherlands
  - Department of Medical Microbiology, VU University Medical Center, Amsterdam, The Netherlands
- Leo Schouls
  - Laboratory for Infectious Diseases and Screening, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
9
Hunt M, Kikuchi T, Sanders M, Newbold C, Berriman M, Otto TD. REAPR: a universal tool for genome assembly evaluation. Genome Biol 2013; 14:R47. [PMID: 23710727 PMCID: PMC3798757 DOI: 10.1186/gb-2013-14-5-r47] [Citation(s) in RCA: 279] [Impact Index Per Article: 25.4] [Received: 03/07/2013] [Accepted: 05/27/2013] [Indexed: 11/17/2022]
Abstract
Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.
10
Enhanced de novo assembly of high throughput pyrosequencing data using whole genome mapping. PLoS One 2013; 8:e61762. [PMID: 23613926 PMCID: PMC3629165 DOI: 10.1371/journal.pone.0061762] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 09/28/2012] [Accepted: 03/11/2013] [Indexed: 01/20/2023]
Abstract
Despite major advances in next-generation sequencing, assembly of sequencing data, especially data from novel microorganisms or re-emerging pathogens, remains constrained by the lack of suitable reference sequences. De novo assembly is the best approach to achieve an accurate finished sequence, but multiple sequencing platforms or paired-end libraries are often required to achieve full genome coverage. In this study, we demonstrated a method to assemble complete bacterial genome sequences by integrating shotgun Roche 454 pyrosequencing with optical whole genome mapping (WGM). The whole genome restriction map (WGRM) was used as the reference to scaffold de novo assembled sequence contigs through a stepwise process. Large de novo contigs were placed in the correct order and orientation through alignment to the WGRM. De novo contigs that were not aligned to the WGRM were merged into scaffolds using contig branching structure information. These extended scaffolds were then aligned to the WGRM to identify the overlaps to be eliminated and the gaps and mismatches to be resolved with unused contigs. The process was repeated until a sequence with full coverage and alignment with the whole genome map was achieved. Using this method, we were able to achieve 100% WGRM coverage without a paired-end library. We assembled complete sequences for three distinct genetic components of a clinical isolate of Providencia stuartii: a bacterial chromosome, a novel bla NDM-1 plasmid, and a novel bacteriophage, without separately purifying them to homogeneity.
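The placement of contigs against the WGRM described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `digest` and `place`, the `GAATTC` recognition site, the 10% size tolerance, and all fragment sizes are assumptions made for illustration. Real map aligners also tolerate missed or spurious cut sites, which this sketch ignores.

```python
import re

def digest(seq, site):
    """In silico digest: return fragment lengths, in order along seq.
    Simplification (assumption): the enzyme cuts at the start of each site."""
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

def place(contig_frags, map_frags, tol=0.1):
    """Find where a contig's fragment pattern fits on the restriction map.
    The first and last contig fragments are truncated by the contig ends,
    so only interior fragments are compared. Returns (map index, strand)
    for the first fit, or None if the contig cannot be placed."""
    inner = contig_frags[1:-1]
    if not inner:
        return None
    for strand, frags in (("+", inner), ("-", inner[::-1])):
        n = len(frags)
        for i in range(len(map_frags) - n + 1):
            window = map_frags[i:i + n]
            if all(abs(f - m) <= tol * m for f, m in zip(frags, window)):
                return i, strand
    return None

# Hypothetical whole genome restriction map (fragment sizes in bp).
wgrm = [5000, 12000, 3000, 8000, 15000, 4000]

# Hypothetical contig digests: one forward, one reverse-complemented.
forward_contig = [700, 3100, 7900, 900]
reverse_contig = forward_contig[::-1]

forward_hit = place(forward_contig, wgrm)
reverse_hit = place(reverse_contig, wgrm)
toy_digest = digest("AAGAATTCTTTTGAATTCA", "GAATTC")
```

Contigs for which `place` returns `None` correspond to the unaligned contigs that the abstract says are merged into scaffolds and realigned in later rounds.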
11
Hang J, Clifford RJ, Yang Y, Riley MC, Mody RM, Kuschner RA, Lesho EP. Genome sequencing of pathogenic Rhodococcus spp. Emerg Infect Dis 2013; 18:1915-6. [PMID: 23092600 PMCID: PMC3559142 DOI: 10.3201/eid1811.120818] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
12
Abstract
Strain-typing technology in support of outbreak identification and resolution has evolved from phenotypic analysis, such as serology and biotypes, to much-more-robust molecular genetic approaches, such as pulsed-field gel electrophoresis (PFGE) and whole-genome sequencing. Whole-genome mapping (WGM) has been recently applied to subtyping analysis, and it bridges the gap between PFGE (∼20 bands sorted by size) and whole-genome sequencing. WGM utilizes restriction site analysis but arranges 200 to 500 bands in the order they appear on the chromosome. WGM is able to quickly and cost-effectively generate high-resolution, ordered whole-genome maps of bacteria.
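The difference between a size-sorted band pattern and an ordered map can be made concrete with a small Python sketch. The fragment sizes (in kb) and the two strains are hypothetical, only five bands are used for brevity, and the two helper functions are illustrative stand-ins, not part of any WGM software.

```python
def pfge_pattern(fragments):
    """PFGE-like readout: bands sorted by size, chromosomal order lost."""
    return sorted(fragments, reverse=True)

def wgm_map(fragments):
    """WGM-like readout: fragments kept in their chromosomal order."""
    return list(fragments)

# Hypothetical restriction fragments (kb) in chromosomal order.
strain_a = [40, 15, 90, 25, 60]
# Same fragments, but the middle segment (15, 90, 25) is inverted.
strain_b = [40, 25, 90, 15, 60]

same_by_pfge = pfge_pattern(strain_a) == pfge_pattern(strain_b)
same_by_wgm = wgm_map(strain_a) == wgm_map(strain_b)

print("Indistinguishable by sorted bands:", same_by_pfge)
print("Indistinguishable by ordered map:", same_by_wgm)
```

In this toy case the inversion leaves the sorted band pattern unchanged but alters the ordered map, which is one way ordering can add discriminatory power. The sketch treats maps as linear and does not handle circular chromosomes, where two maps may also differ by rotation.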