1
Akther S, Mongodin EF, Morgan RD, Di L, Yang X, Golovchenko M, Rudenko N, Margos G, Hepner S, Fingerle V, Kawabata H, Norte AC, de Carvalho IL, Núncio MS, Marques A, Schutzer SE, Fraser CM, Luft BJ, Casjens SR, Qiu W. Natural selection and recombination at host-interacting lipoprotein loci drive genome diversification of Lyme disease and related bacteria. mBio 2024; 15:e0174924. [PMID: 39145656 PMCID: PMC11389397 DOI: 10.1128/mbio.01749-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 06/28/2024] [Indexed: 08/16/2024] Open
Abstract
Lyme disease, caused by spirochetes in the Borrelia burgdorferi sensu lato clade within the Borrelia genus, is transmitted by Ixodes ticks and is currently the most prevalent and rapidly expanding tick-borne disease in Europe and North America. We report complete genome sequences of 47 isolates that encompass all established species in this clade while highlighting the diversity of the widespread human pathogenic species B. burgdorferi. A similar set of plasmids has been maintained throughout Borrelia divergence, indicating that they are a key adaptive feature of this genus. Phylogenetic reconstruction of all sequenced Borrelia genomes revealed the original divergence of Eurasian and North American lineages and subsequent dispersals that introduced B. garinii, B. bavariensis, B. lusitaniae, B. valaisiana, and B. afzelii from East Asia to Europe and B. burgdorferi and B. finlandensis from North America to Europe. Molecular phylogenies of the universally present core replicons (chromosome and cp26 and lp54 plasmids) are highly consistent, revealing a strong clonal structure. Nonetheless, numerous inconsistencies between the genome and gene phylogenies indicate species dispersal, genetic exchanges, and rapid sequence evolution at plasmid-borne loci, including key host-interacting lipoprotein genes. While localized recombination occurs uniformly on the main chromosome at a rate comparable to mutation, lipoprotein-encoding loci are recombination hotspots on the plasmids, suggesting adaptive maintenance of recombinant alleles at loci directly interacting with the host. We conclude that within- and between-species recombination facilitates adaptive sequence evolution of host-interacting lipoprotein loci and contributes to human virulence despite a genome-wide clonal structure of its natural populations. 
IMPORTANCE Lyme disease (also called Lyme borreliosis in Europe), a condition caused by spirochete bacteria of the genus Borrelia, transmitted by hard-bodied Ixodes ticks, is currently the most prevalent and rapidly expanding tick-borne disease in the United States and Europe. Borrelia interspecies and intraspecies genome comparisons of Lyme disease-related bacteria are essential to reconstruct their evolutionary origins, track epidemiological spread, identify molecular mechanisms of human pathogenicity, and design molecular and ecological approaches to disease prevention, diagnosis, and treatment. These Lyme disease-associated bacteria harbor complex genomes that encode many genes that do not have homologs in other organisms and are distributed across multiple linear and circular plasmids. The functional significance of most of the plasmid-borne genes and the multipartite genome organization itself remains unknown. Here we sequenced, assembled, and analyzed whole genomes of 47 Borrelia isolates from around the world, including multiple isolates of the human pathogenic species. Our analysis elucidates the evolutionary origins, historical migration, and sources of genomic variability of these clinically important pathogens. We have developed web-based software tools (BorreliaBase.org) to facilitate dissemination and continued comparative analysis of Borrelia genomes to identify determinants of human pathogenicity.
Affiliation(s)
- Saymon Akther
  - Graduate Center and Hunter College, City University of New York, New York, New York, USA
- Lia Di
  - Graduate Center and Hunter College, City University of New York, New York, New York, USA
- Xiaohua Yang
  - Department of Medicine, Renaissance School of Medicine, Stony Brook University (SUNY), Stony Brook, New York, USA
- Maryna Golovchenko
  - Biology Centre Czech Academy of Sciences, Institute of Parasitology, České Budějovice, Czech Republic
- Natalie Rudenko
  - Biology Centre Czech Academy of Sciences, Institute of Parasitology, České Budějovice, Czech Republic
- Gabriele Margos
  - Bavarian Health and Food Safety Authority and German National Reference Centre for Borrelia, Oberschleissheim, Bavaria, Germany
- Sabrina Hepner
  - Bavarian Health and Food Safety Authority and German National Reference Centre for Borrelia, Oberschleissheim, Bavaria, Germany
- Volker Fingerle
  - Bavarian Health and Food Safety Authority and German National Reference Centre for Borrelia, Oberschleissheim, Bavaria, Germany
- Ana Cláudia Norte
  - Department of Life Sciences, University of Coimbra, MARE-Marine and Environmental Sciences Centre, Coimbra, Portugal
- Maria Sofia Núncio
  - Centre for Vector and Infectious Diseases Research, Águas de Moura, Portugal
- Adriana Marques
  - National Institute of Allergy and Infectious Diseases, Bethesda, Maryland, USA
- Claire M Fraser
  - University of Maryland School of Medicine, Baltimore, Maryland, USA
- Benjamin J Luft
  - Department of Medicine, Renaissance School of Medicine, Stony Brook University (SUNY), Stony Brook, New York, USA
- Sherwood R Casjens
  - University of Utah School of Medicine and School of Biological Sciences, Salt Lake City, Utah, USA
- Weigang Qiu
  - Graduate Center and Hunter College, City University of New York, New York, New York, USA
  - Weill Cornell Medical College, New York, New York, USA
2
Cicconardi F, Milanetti E, Pinheiro de Castro EC, Mazo-Vargas A, Van Belleghem SM, Ruggieri AA, Rastas P, Hanly J, Evans E, Jiggins CD, Owen McMillan W, Papa R, Di Marino D, Martin A, Montgomery SH. Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies. Nat Commun 2023; 14:5620. [PMID: 37699868 PMCID: PMC10497600 DOI: 10.1038/s41467-023-41412-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/25/2022] [Accepted: 08/30/2023] [Indexed: 09/14/2023] Open
Abstract
Heliconius butterflies, a speciose genus of Müllerian mimics, represent a classic example of an adaptive radiation that includes a range of derived dietary, life history, physiological and neural traits. However, key lineages within the genus, and across the broader Heliconiini tribe, lack genomic resources, limiting our understanding of how adaptive and neutral processes shaped genome evolution during their radiation. Here, we generate highly contiguous genome assemblies for nine Heliconiini, 29 additional reference-assembled genomes, and improve 10 existing assemblies. Altogether, we provide a dataset of annotated genomes for a total of 63 species, including 58 species within the Heliconiini tribe. We use this extensive dataset to generate a robust and dated heliconiine phylogeny, describe major patterns of introgression, explore the evolution of genome architecture, and the genomic basis of key innovations in this enigmatic group, including an assessment of the evolution of putative regulatory regions at the Heliconius stem. Our work illustrates how the increased resolution provided by such dense genomic sampling improves our power to generate and test gene-phenotype hypotheses, and precisely characterize how genomes evolve.
Affiliation(s)
- Francesco Cicconardi
  - School of Biological Sciences, Bristol University, Bristol, United Kingdom
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- Edoardo Milanetti
  - Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy
  - Center for Life Nano- & Neuro-Science, Italian Institute of Technology, Viale Regina Elena 291, 00161, Rome, Italy
- Anyi Mazo-Vargas
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853, USA
- Steven M Van Belleghem
  - Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
  - Ecology, Evolution and Conservation Biology, Biology Department, KU Leuven, Leuven, Belgium
- Pasi Rastas
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Joseph Hanly
  - Department of Biological Sciences, The George Washington University, Washington DC, WA, 20052, USA
  - Smithsonian Tropical Research Institute, Panama City, Panama
- Elizabeth Evans
  - Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
- Chris D Jiggins
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- W Owen McMillan
  - Smithsonian Tropical Research Institute, Panama City, Panama
- Riccardo Papa
  - Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
  - Molecular Sciences and Research Center, University of Puerto Rico, San Juan, PR, Puerto Rico
  - Comprehensive Cancer Center, University of Puerto Rico, San Juan, PR, Puerto Rico
- Daniele Di Marino
  - Department of Life and Environmental Sciences, New York-Marche Structural Biology Center (NY-MaSBiC), Polytechnic University of Marche, Via Brecce Bianche, 60131, Ancona, Italy
  - Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Via Mario Negri 2, 20156, Milano, Italy
  - National Biodiversity Future Center (NBFC), Palermo, Italy
- Arnaud Martin
  - Department of Biological Sciences, The George Washington University, Washington DC, WA, 20052, USA
- Stephen H Montgomery
  - School of Biological Sciences, Bristol University, Bristol, United Kingdom
  - Smithsonian Tropical Research Institute, Panama City, Panama
3
End-point RT-PCR based on a conservation landscape for SARS-CoV-2 detection. Sci Rep 2022; 12:4759. [PMID: 35306521 PMCID: PMC8933765 DOI: 10.1038/s41598-022-07756-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/06/2021] [Accepted: 02/24/2022] [Indexed: 11/09/2022] Open
Abstract
End-point RT-PCR is a suitable alternative diagnostic technique since it is cheaper than RT-qPCR tests and can be implemented on a massive scale in low- and middle-income countries. In this work, a bioinformatic approach to guide the design of PCR primers was developed, and an alternative diagnostic test based on end-point PCR was designed. End-point PCR primers were designed through conservation analysis based on k-mer frequency in SARS-CoV-2 and human respiratory pathogen genomes. Highly conserved regions were identified for primer design, and the resulting PCR primers were used to amplify 871 nasopharyngeal human samples with a previous RT-qPCR-based SARS-CoV-2 diagnosis. The diagnostic test showed high accuracy in identifying SARS-CoV-2-positive samples, including B.1.1.7, P.1, B.1.427/B.1.429 and B.1.617.2/AY samples, with a detection limit of 7.2 viral copies/µL. In addition, this test could discern SARS-CoV-2 infection from other viral infections with COVID-19-like symptomatology. The designed end-point PCR diagnostic test to detect SARS-CoV-2 is a suitable alternative to RT-qPCR. Since the proposed bioinformatic approach can be easily applied to thousands of viral genomes and to highly divergent strains, it can be used as a PCR design tool as new SARS-CoV-2 variants emerge. Therefore, this end-point PCR test could be employed in epidemiological surveillance to detect new SARS-CoV-2 variants as they emerge and propagate.
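The idea of a k-mer based conservation analysis can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy sequences, and the rule "a k-mer is conserved if it occurs in every genome" are assumptions chosen for illustration. A real primer design workflow would also screen candidate regions against other respiratory pathogen genomes, which is omitted here.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def conservation_profile(reference, genomes, k):
    """For each k-mer start in reference, return the fraction of genomes containing that k-mer."""
    presence = Counter()
    for genome in genomes:
        # Count each k-mer at most once per genome (presence, not abundance).
        for km in set(kmers(genome, k)):
            presence[km] += 1
    n = len(genomes)
    return [presence[km] / n for km in kmers(reference, k)]


def conserved_windows(profile, threshold=1.0):
    """Return (start, end) index ranges, end exclusive, where profile >= threshold."""
    windows, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            windows.append((start, i))
            start = None
    if start is not None:
        windows.append((start, len(profile)))
    return windows


# Hypothetical toy data: the second genome differs at one position.
reference = "ACGTACGGTT"
genomes = ["ACGTACGGTT", "ACGTACGCTT", "ACGTACGGTT"]
profile = conservation_profile(reference, genomes, k=4)
print(conserved_windows(profile))
```

In this toy case only the k-mers that avoid the variable position are fully conserved, so the reported window marks the stretch where a primer could be placed without overlapping known variation.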
4
Gómez-Muñoz C, García-Ortega LF, Montalvo-Arredondo J, Pérez-Ortega E, Damas-Buenrostro LC, Riego-Ruiz L. Long insert clone experimental evidence for assembly improvement and chimeric chromosomes detection in an allopentaploid beer yeast. G3-GENES GENOMES GENETICS 2021; 11:6188626. [PMID: 33768233 PMCID: PMC8495930 DOI: 10.1093/g3journal/jkab088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2020] [Accepted: 03/12/2021] [Indexed: 11/18/2022]
Abstract
Lager beer is made with the hybrid Saccharomyces pastorianus. Many publicly available S. pastorianus genome assemblies are highly fragmented due to the difficulties of assembling hybrid genomes, such as the presence of homeologous chromosomes from both parental types, and translocations between them. To improve the assembly of a previously sequenced lager yeast hybrid Saccharomyces sp. 790 and elucidate its genome structure, we proposed the use of alternative experimental evidence. We determined the phylogenetic position of Saccharomyces sp. 790 and established it as S. pastorianus 790. Then, we obtained from this yeast a bacterial artificial chromosome (BAC) genomic library with its BAC-end sequences (BESs). To analyze these data, we developed a pipeline (applicable to other assemblies) that classifies BES pair alignments according to their orientation. For the case of S. pastorianus 790, paired-end BES alignments validated parts of the assembly and unpaired-end ones suggested contig joins or misassemblies. Importantly, the BAC library was preserved and used for verification experiments. Unpaired-end alignments were used to upgrade the previous assembly and provided an improved detection of translocations. With this, we proposed a genome structure of S. pastorianus 790, which was similar to that of other lager yeasts; however, when we estimated chromosome copy number and experimentally measured its genome size, we discovered that one key difference is the outstanding S. pastorianus 790 ploidy level (allopentaploid). Altogether, our results show the value of combining bioinformatic analyses with experimental data such as long-insert clone information to improve a short-read assembly of a hybrid genome.
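A classification of end-sequence pairs by orientation could look roughly like the sketch below. It is an illustration only and does not reproduce the published pipeline: the `Hit` record, the category labels, and the insert-size bounds are all hypothetical. The assumption is that the two ends of one clone should align to the same contig, face each other, and lie within a plausible insert size; anything else is flagged for inspection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One aligned clone end (hypothetical record, not a real file format)."""
    contig: str
    start: int   # leftmost aligned position on the contig
    end: int     # rightmost aligned position on the contig
    strand: str  # "+" or "-"


def classify_pair(a, b, min_insert=50_000, max_insert=200_000):
    """Classify the two end alignments of one clone.

    The insert-size bounds are placeholder values, not taken from the paper.
    """
    if a.contig != b.contig:
        # Ends on different contigs: evidence for a possible contig join.
        return "different_contigs"
    left, right = (a, b) if a.start <= b.start else (b, a)
    if not (left.strand == "+" and right.strand == "-"):
        # Ends do not face each other: possible inversion or misassembly.
        return "wrong_orientation"
    span = right.end - left.start
    if min_insert <= span <= max_insert:
        return "concordant"
    return "wrong_distance"


# Hypothetical example pairs.
print(classify_pair(Hit("c1", 1_000, 1_600, "+"), Hit("c1", 120_000, 120_600, "-")))
print(classify_pair(Hit("c1", 1_000, 1_600, "+"), Hit("c7", 5_000, 5_600, "-")))
```

Concordant pairs support the local assembly, while the other three categories correspond loosely to the abstract's unpaired-end alignments that suggest contig joins or misassemblies.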
Affiliation(s)
- Cintia Gómez-Muñoz
  - División de Biología Molecular, Instituto Potosino de Investigación Científica y Tecnológica, A.C., San Luis Potosí, Mexico, 78216
- Luis Fernando García-Ortega
  - División de Biología Molecular, Instituto Potosino de Investigación Científica y Tecnológica, A.C., San Luis Potosí, Mexico, 78216
  - Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Mexico, 36824
- Javier Montalvo-Arredondo
  - División de Biología Molecular, Instituto Potosino de Investigación Científica y Tecnológica, A.C., San Luis Potosí, Mexico, 78216
  - Dirección General Académica, Universidad Autónoma Agraria Antonio Narro, Saltillo, Mexico, 25315
- Lina Riego-Ruiz
  - División de Biología Molecular, Instituto Potosino de Investigación Científica y Tecnológica, A.C., San Luis Potosí, Mexico, 78216
5
Seaman J, Buggs RJA. FluentDNA: Nucleotide Visualization of Whole Genomes, Annotations, and Alignments. Front Genet 2020; 11:292. [PMID: 32425967 PMCID: PMC7203487 DOI: 10.3389/fgene.2020.00292] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/31/2019] [Accepted: 03/11/2020] [Indexed: 12/03/2022] Open
Abstract
Researchers seldom look at naked genome assemblies: instead the attributes of DNA sequences are mediated through statistics, annotations and high level summaries. Here we present software that visualizes the bare sequences of whole genome assemblies in a zoomable interface. This can assist in detection of chromosome architecture and contamination by the naked eye through changes in color patterns, in the absence of any other annotation. When available, annotations can be visualized alongside or on top of the naked sequence. Genome alignments can also be visualized, laying two genomes side by side in an alignment and highlighting their differences at nucleotide resolution. FluentDNA gives researchers direct visualization of whole genome assemblies, annotations and alignments, for quality control, hypothesis generation, and communicating results.
Affiliation(s)
- Josiah Seaman
  - Royal Botanic Gardens Kew, Jodrell Laboratory, Richmond, United Kingdom
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
- Richard J A Buggs
  - Royal Botanic Gardens Kew, Jodrell Laboratory, Richmond, United Kingdom
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
6
Zhang LN, Ma PF, Zhang YX, Zeng CX, Zhao L, Li DZ. Using nuclear loci and allelic variation to disentangle the phylogeny of Phyllostachys (Poaceae, Bambusoideae). Mol Phylogenet Evol 2019; 137:222-235. [PMID: 31112779 DOI: 10.1016/j.ympev.2019.05.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/07/2018] [Revised: 05/16/2019] [Accepted: 05/17/2019] [Indexed: 11/18/2022]
Abstract
With the development of sequencing technologies, the use of multiple nuclear genes has become conventional for resolving difficult phylogenies. However, this technique also presents challenges due to gene-tree discordance, as a result of incomplete lineage sorting (ILS) and reticulate evolution. Although alleles can show sequence variation within individuals, which contain information regarding the evolution of organisms, they continue to be ignored in almost all phylogenetic analyses using randomly phased genome sequences. Here, we tried to incorporate alleles from multiple nuclear loci to study the phylogeny of the economically important bamboo genus Phyllostachys (Poaceae, Bambusoideae). Obtaining a total of 3926 sequences, we documented extensive allelic variation for 61 genes from 39 sampled species. Using datasets consisting of selected alleles, we demonstrated substantial discordance among phylogenetic relationships inferred from different alleles, as well as between concatenation and coalescent methods. Furthermore, ILS and hybridization were suggested to be underlying causes of the discordant phylogenetic signals. Taking these possible causes for conflicting phylogenetic results into consideration, we recovered the monophyly of Phyllostachys and its two morphology-defined sections. Our study also suggests that alleles deserve more attention in phylogenetic studies, since ignoring them can yield highly supported but spurious phylogenies. Meanwhile, alleles are helpful for unraveling complex evolutionary processes, particularly hybridization.
Affiliation(s)
- Li-Na Zhang
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
  - College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, Fujian 350002, China
- Peng-Fei Ma
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Yu-Xiao Zhang
  - Yunnan Academy of Biodiversity, Southwest Forestry University, Kunming, Yunnan 650224, China
- Chun-Xia Zeng
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Lei Zhao
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
7
Holm KO, Bækkedal C, Söderberg JJ, Haugen P. Complete Genome Sequences of Seven Vibrio anguillarum Strains as Derived from PacBio Sequencing. Genome Biol Evol 2018; 10:1127-1131. [PMID: 29635365 PMCID: PMC5905569 DOI: 10.1093/gbe/evy074] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Accepted: 04/06/2018] [Indexed: 11/13/2022] Open
Abstract
We report here the complete genome sequences of seven Vibrio anguillarum strains isolated from multiple geographic locations, thus increasing the total number of genomes of finished quality to 11. The genomes were de novo assembled from long-sequence PacBio reads. Including draft genomes, a total of 44 V. anguillarum genomes are currently available in the genome databases. They represent an important resource in the study of, for example, genetic variations and for identifying virulence determinants. In this article, we present the genomes and basic genome comparisons of the 11 complete genomes, including a BRIG analysis and pan-genome calculation. We also describe some structural features of superintegrons on chromosome 2s, and associated insertion sequence (IS) elements, including 18 new ISs (ISVa3 to ISVa20), both of importance in the complement of V. anguillarum genomes.
Affiliation(s)
- Kåre Olav Holm
  - Department of Chemistry and Center for Bioinformatics (SfB), Faculty of Science and Technology, UiT - The Arctic University of Norway, Tromsø, Norway
- Cecilie Bækkedal
  - Department of Chemistry and Center for Bioinformatics (SfB), Faculty of Science and Technology, UiT - The Arctic University of Norway, Tromsø, Norway
- Jenny Johansson Söderberg
  - Department of Chemistry and Center for Bioinformatics (SfB), Faculty of Science and Technology, UiT - The Arctic University of Norway, Tromsø, Norway
- Peik Haugen
  - Department of Chemistry and Center for Bioinformatics (SfB), Faculty of Science and Technology, UiT - The Arctic University of Norway, Tromsø, Norway
8
Abstract
The precise location of variants in the human genome is of utmost importance. We present a unique approach, coverage-based single nucleotide variant (SNV) identification (COBASI), which uses only perfect matches between the reads of a sequence project and a reference genome to detect and accurately identify de novo SNVs. From the perfect matches, a representation of the read coverage per nucleotide along the genome, the variation landscape, is generated. SNVs are then pinpointed as significant changes in coverage and de novo SNVs can be identified with high precision. The performance of COBASI was analyzed using simulations and experimentally validated by sequencing de novo SNVs identified from a parent–offspring trio. We propose this pipeline as a useful tool for different genomic applications. The precise determination of de novo genetic variants has enormous implications across different fields of biology and medicine, particularly personalized medicine. Currently, de novo variations are identified by mapping sample reads from a parent–offspring trio to a reference genome, allowing for a certain degree of differences. While widely used, this approach often introduces false-positive (FP) results due to misaligned reads and mischaracterized sequencing errors. In a previous study, we developed an alternative approach to accurately identify single nucleotide variants (SNVs) using only perfect matches. However, this approach could be applied only to haploid regions of the genome and was computationally intensive. In this study, we present a unique approach, coverage-based single nucleotide variant identification (COBASI), which allows the exploration of the entire genome using second-generation short sequence reads without extensive computing requirements. COBASI identifies SNVs using changes in coverage of exactly matching unique substrings, and is particularly suited for pinpointing de novo SNVs. 
Unlike other approaches that require population frequencies across hundreds of samples to filter out any methodological biases, COBASI can be applied to detect de novo SNVs within isolated families. We demonstrate this capability through extensive simulation studies and by studying a parent–offspring trio we sequenced using short reads. Experimental validation of all 58 candidate de novo SNVs and a selection of non-de novo SNVs found in the trio confirmed zero FP calls. COBASI is available as open source at https://github.com/Laura-Gomez/COBASI for any researcher to use.
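The notion of a variation landscape built from exact matches can be illustrated with a toy sketch. This is not the COBASI implementation: it assumes a tiny haploid reference whose k-mers are all unique, error-free reads, and a homozygous substitution, and every function name is hypothetical. Under those assumptions, a single-nucleotide change removes exact matches for the k consecutive reference k-mers that overlap the changed position, which appears as a run of exactly k zero-coverage entries.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def variation_landscape(reference, reads, k):
    """Exact-match read coverage of the reference k-mer starting at each position."""
    counts = kmer_counts(reads, k)
    return [counts[reference[i:i + k]] for i in range(len(reference) - k + 1)]


def candidate_sites(landscape, k):
    """Return positions p where k-mer starts p-k+1..p all have zero coverage.

    Only runs of exactly k zeros flanked by covered positions are reported.
    """
    sites, i, n = [], 0, len(landscape)
    while i < n:
        if landscape[i] == 0:
            j = i
            while j < n and landscape[j] == 0:
                j += 1
            if j - i == k and i > 0 and j < n:
                sites.append(j - 1)  # last k-mer start in the run is the changed base
            i = j
        else:
            i += 1
    return sites


# Hypothetical toy data: the sample differs from the reference at index 10 (G -> T).
reference = "ACGTTGCATCGGATCCAGTA"
sample = "ACGTTGCATCTGATCCAGTA"
landscape = variation_landscape(reference, [sample], k=4)
print(landscape)
print(candidate_sites(landscape, k=4))
```

Real data add complications that this sketch ignores, including diploid genotypes, sequencing errors, non-unique substrings, and the statistical test for what counts as a significant change in coverage.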
9
Fonseca A, Ishoey T, Espinoza C, Pérez-Pantoja D, Manghisi A, Morabito M, Salas-Burgos A, Gallardo VA. Genomic features of "Candidatus Venteria ishoeyi", a new sulfur-oxidizing macrobacterium from the Humboldt Sulfuretum off Chile. PLoS One 2017; 12:e0188371. [PMID: 29236755 PMCID: PMC5728499 DOI: 10.1371/journal.pone.0188371] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/13/2016] [Accepted: 11/06/2017] [Indexed: 12/13/2022] Open
Abstract
The Humboldt Sulfuretum (HS), in the productive Humboldt Eastern Boundary Current Upwelling Ecosystem, extends under the hypoxic waters of the Peru-Chile Undercurrent (ca. 6°S and ca. 36°S). Studies show that primeval sulfuretums held diverse prokaryotic life, and, while rare today, still sustain species-rich giant sulfur-oxidizing bacterial communities. We here present the genomic features of a new bacterium of the HS, "Candidatus Venteria ishoeyi" ("Ca. V. ishoeyi") in the family Thiotrichaceae. Three identical filaments were micro-manipulated from reduced sediments collected off central Chile; their DNA was extracted, amplified, and sequenced on a Roche 454 GS FLX platform. Using three sequenced libraries and de novo genome assembly, a draft genome of 5.7 Mbp, with 495 scaffolds and an N50 of 70 kbp, was obtained. The 16S rRNA gene phylogenetic analysis showed that "Ca. V. ishoeyi" is related to non-vacuolate forms presently known as Beggiatoa or Beggiatoa-like forms. The complete set of genes involved in respiratory nitrate reduction to dinitrogen was identified in "Ca. V. ishoeyi", including genes likely leading to ammonification. As expected, the sulfur-oxidation pathway reported for other sulfur-oxidizing bacteria was deduced, and key genes related to inorganic and organic carbon acquisition were also identified. Unexpectedly, the genome of "Ca. V. ishoeyi" contained numerous CRISPR repeats and an I-F CRISPR-Cas type system gene coding array. Findings further show that, as a member of an eons-old marine ecosystem, "Ca. V. ishoeyi" contains the needed metabolic plasticity for life in an increasingly oxygenated and variable ocean.
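For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of the usual definition follows: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The scaffold lengths below are made up for illustration and are unrelated to the reported assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in kbp.
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

Here the two longest scaffolds sum to 150 of 290 kbp, which is the first point at which half the total is reached, so the N50 is the length of the second scaffold.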
Affiliation(s)
- Alexis Fonseca
  - Department of Pharmacology, University of Concepcion, Concepcion, Chile
  - Department of Oceanography, University of Concepcion, Concepcion, Chile
- Thomas Ishoey
  - Independent consultant, Encinitas, California, United States of America
- Carola Espinoza
  - Department of Oceanography, University of Concepcion, Concepcion, Chile
  - College of Ocean Science and Resources, Institute Marine Affairs and Resource Management, National Taiwan Ocean University, Keelung, Taiwan
- Danilo Pérez-Pantoja
  - Programa Institucional de Fomento a la Investigación, Desarrollo e Innovación, Universidad Tecnológica Metropolitana, San Joaquin, Santiago, Chile
- Antonio Manghisi
  - Department of Chemical, Biological, Pharmaceutical and Environmental Sciences, University of Messina, Messina, Italy
- Marina Morabito
  - Department of Chemical, Biological, Pharmaceutical and Environmental Sciences, University of Messina, Messina, Italy
- Víctor A. Gallardo
  - Department of Oceanography, University of Concepcion, Concepcion, Chile
  - College of Ocean Science and Resources, Institute Marine Affairs and Resource Management, National Taiwan Ocean University, Keelung, Taiwan
10
Draft Genome Sequences of Candida glabrata Isolates 1A, 1B, 2A, 2B, 3A, and 3B. GENOME ANNOUNCEMENTS 2017; 5:5/10/e00328-16. [PMID: 28280017 PMCID: PMC5347237 DOI: 10.1128/genomea.00328-16] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]
Abstract
Here, we report the draft genome sequences of six Candida glabrata isolates. The isolates were taken from blood samples from patients after recurrent C. glabrata infection. Two isolates were taken from each of three patients, a minimum of 3 months apart.
11
Istace B, Friedrich A, d'Agata L, Faye S, Payen E, Beluche O, Caradec C, Davidas S, Cruaud C, Liti G, Lemainque A, Engelen S, Wincker P, Schacherer J, Aury JM. de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer. Gigascience 2017; 6:1-13. [PMID: 28369459 PMCID: PMC5466710 DOI: 10.1093/gigascience/giw018] [Citation(s) in RCA: 111] [Impact Index Per Article: 13.9] [Received: 10/04/2016] [Revised: 11/15/2016] [Accepted: 12/27/2016] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long-read sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near-complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. RESULTS Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than that of the Illumina-only assemblies, and we obtained one or two long contigs for 65% of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. CONCLUSION Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable element insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology.
Affiliation(s)
- Benjamin Istace
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Anne Friedrich
  - Université de Strasbourg, CNRS, GMGM UMR 7156, F-67000 Strasbourg, France
- Léo d'Agata
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Sébastien Faye
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Emilie Payen
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Odette Beluche
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Claudia Caradec
  - Université de Strasbourg, CNRS, GMGM UMR 7156, F-67000 Strasbourg, France
- Sabrina Davidas
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Corinne Cruaud
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Gianni Liti
  - Institute of Research on Cancer and Ageing of Nice (IRCAN), CNRS UMR 7284-INSERM U1081, Faculté de Médecine, Université de Nice Sophia Antipolis, Nice, France
- Arnaud Lemainque
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Stefan Engelen
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
- Patrick Wincker
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
  - Université d'Evry Val d'Essonne, UMR 8030, CP5706, 91057 Evry, France
  - Centre National de Recherche Scientifique (CNRS), UMR 8030, CP5706, 91057 Evry, France
- Joseph Schacherer
  - Université de Strasbourg, CNRS, GMGM UMR 7156, F-67000 Strasbourg, France
- Jean-Marc Aury
  - Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
12
Chan CH, Octavia S, Sintchenko V, Lan R. SnpFilt: A pipeline for reference-free assembly-based identification of SNPs in bacterial genomes. Comput Biol Chem 2016; 65:178-184. [DOI: 10.1016/j.compbiolchem.2016.09.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/31/2016] [Accepted: 09/07/2016] [Indexed: 10/21/2022]
13
Ghurye JS, Cepeda-Espinoza V, Pop M. Metagenomic Assembly: Overview, Challenges and Applications. Yale J Biol Med 2016; 89:353-362. [PMID: 27698619 PMCID: PMC5045144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Advances in sequencing technologies have led to the increased use of high throughput sequencing in characterizing the microbial communities associated with our bodies and our environment. Critical to the analysis of the resulting data are sequence assembly algorithms able to reconstruct genes and organisms from complex mixtures. Metagenomic assembly involves new computational challenges due to the specific characteristics of the metagenomic data. In this survey, we focus on major algorithmic approaches for genome and metagenome assembly, and discuss the new challenges and opportunities afforded by this new field. We also review several applications of metagenome assembly in addressing interesting biological problems.
Affiliation(s)
- Mihai Pop
  - To whom all correspondence should be addressed: Mihai Pop, Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, Biomolecular Sciences Building, Rm. 3120F, College Park, MD 20742. Phone: 301-405-7245
14
Abstract
The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts needed to develop a workflow that will meet the objectives and goals of HTS projects.
15
The A, C, G, and T of Genome Assembly. Biomed Res Int 2016; 2016:6329217. [PMID: 27247941 PMCID: PMC4877455 DOI: 10.1155/2016/6329217] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/28/2015] [Accepted: 12/22/2015] [Indexed: 11/18/2022]
Abstract
Genome assembly in its two decades of history has produced significant research, in terms of both biotechnology and computational biology. This contribution delineates sequencing platforms and their characteristics, examines key steps involved in filtering and processing raw data, explains assembly frameworks, and discusses quality statistics for the assessment of the assembled sequence. Furthermore, the paper explores recent Ubuntu-based software environments oriented towards genome assembly as well as some avenues for future research.
16
Romano S, Fernàndez-Guerra A, Reen FJ, Glöckner FO, Crowley SP, O'Sullivan O, Cotter PD, Adams C, Dobson ADW, O'Gara F. Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus. Front Microbiol 2016; 7:387. [PMID: 27065959 PMCID: PMC4811931 DOI: 10.3389/fmicb.2016.00387] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 12/18/2015] [Accepted: 03/11/2016] [Indexed: 01/15/2023] Open
Abstract
Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. 
The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its microbiota. Our data suggest the presence of a sponge-specific lineage of Pseudovibrio. The reduction in genome size and the loss of some systems potentially used to successfully enter the host, leads to the hypothesis that P. axinellae strain AD2 may be a lineage that presents an ancient association with the host and that may be vertically transmitted to the progeny.
Affiliation(s)
- Stefano Romano
  - BIOMERIT Research Centre, University College Cork, Cork, Ireland
- Antonio Fernàndez-Guerra
  - Oxford e-Research Centre, University of Oxford, Oxford, UK; Microbial Genomics and Bioinformatics Research Group, Max Planck Institute for Marine Microbiology, Bremen, Germany
- F Jerry Reen
  - BIOMERIT Research Centre, University College Cork, Cork, Ireland
- Frank O Glöckner
  - Microbial Genomics and Bioinformatics Research Group, Max Planck Institute for Marine Microbiology, Bremen, Germany; Jacobs University Bremen gGmbH, Bremen, Germany
- Orla O'Sullivan
  - Teagasc Food Research Centre, Fermoy, Ireland; APC Microbiome Institute, Cork, Ireland
- Paul D Cotter
  - Teagasc Food Research Centre, Fermoy, Ireland; APC Microbiome Institute, Cork, Ireland
- Claire Adams
  - BIOMERIT Research Centre, University College Cork, Cork, Ireland
- Alan D W Dobson
  - School of Microbiology, University College Cork, Cork, Ireland; Environmental Research Institute, University College Cork, Cork, Ireland
- Fergal O'Gara
  - BIOMERIT Research Centre, University College Cork, Cork, Ireland; School of Biomedical Sciences, Curtin Health Innovation Research Institute, Curtin University, Perth, WA, Australia
17
Wences AH, Schatz MC. Metassembler: merging and optimizing de novo genome assemblies. Genome Biol 2015; 16:207. [PMID: 26403281 PMCID: PMC4581417 DOI: 10.1186/s13059-015-0764-4] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 07/28/2015] [Accepted: 09/01/2015] [Indexed: 11/17/2022] Open
Abstract
Genome assembly projects typically run multiple algorithms in an attempt to find the single best assembly, although those assemblies often have complementary, if untapped, strengths and weaknesses. We present our metassembler algorithm that merges multiple assemblies of a genome into a single superior sequence. We apply it to the four genomes from the Assemblathon competitions and show it consistently and substantially improves the contiguity and quality of each assembly. We also develop guidelines for meta-assembly by systematically evaluating 120 permutations of merging the top 5 assemblies of the first Assemblathon competition. The software is open-source at http://metassembler.sourceforge.net.
Affiliation(s)
- Alejandro Hernandez Wences
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Michael C Schatz
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
18
Pavlopoulos GA, Malliarakis D, Papanikolaou N, Theodosiou T, Enright AJ, Iliopoulos I. Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future. Gigascience 2015; 4:38. [PMID: 26309733 PMCID: PMC4548842 DOI: 10.1186/s13742-015-0077-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 05/26/2015] [Accepted: 08/03/2015] [Indexed: 01/31/2023] Open
Abstract
"A picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? The life sciences are among the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that could enhance visualization further.
Affiliation(s)
- Georgios A Pavlopoulos
  - Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete, Greece
- Nikolas Papanikolaou
  - Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete, Greece
- Theodosis Theodosiou
  - Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete, Greece
- Anton J Enright
  - EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD, UK
- Ioannis Iliopoulos
  - Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete, Greece
19
Schatz MC, Maron LG, Stein JC, Hernandez Wences A, Gurtowski J, Biggers E, Lee H, Kramer M, Antoniou E, Ghiban E, Wright MH, Chia JM, Ware D, McCouch SR, McCombie WR. Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica. Genome Biol 2015; 15:506. [PMID: 25468217 DOI: 10.1186/preaccept-2784872521277375] [Citation(s) in RCA: 101] [Impact Index Per Article: 10.1] [Received: 04/24/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The use of high throughput genome-sequencing technologies has uncovered a large extent of structural variation in eukaryotic genomes that makes important contributions to genomic diversity and phenotypic variation. When the genomes of different strains of a given organism are compared, whole genome resequencing data are typically aligned to an established reference sequence. However, when the reference differs in significant structural ways from the individuals under study, the analysis is often incomplete or inaccurate. RESULTS: Here, we use rice as a model to demonstrate how improvements in sequencing and assembly technology allow rapid and inexpensive de novo assembly of next generation sequence data into high-quality assemblies that can be directly compared using whole genome alignment to provide an unbiased assessment. Using this approach, we are able to accurately assess the "pan-genome" of three divergent rice varieties and document several megabases of each genome absent in the other two. CONCLUSIONS: Many of the genome-specific loci are annotated to contain genes, reflecting the potential for new biological properties that would be missed by standard reference-mapping approaches. We further provide a detailed analysis of several loci associated with agriculturally important traits, including the S5 hybrid sterility locus, the Sub1 submergence tolerance locus, the LRK gene cluster associated with improved yield, and the Pup1 cluster associated with phosphorus deficiency, illustrating the utility of our approach for biological discovery. All of the data and software are openly available to support further breeding and functional studies of rice and other species.
20
Conlan S, Thomas PJ, Deming C, Park M, Lau AF, Dekker JP, Snitkin ES, Clark TA, Luong K, Song Y, Tsai YC, Boitano M, Dayal J, Brooks SY, Schmidt B, Young AC, Thomas JW, Bouffard GG, Blakesley RW, Mullikin JC, Korlach J, Henderson DK, Frank KM, Palmore TN, Segre JA. Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae. Sci Transl Med 2015; 6:254ra126. [PMID: 25232178 DOI: 10.1126/scitranslmed.3009845] [Citation(s) in RCA: 242] [Impact Index Per Article: 24.2] [Indexed: 01/01/2023]
Abstract
Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment.
Affiliation(s)
- Sean Conlan
  - National Human Genome Research Institute, Bethesda, MD 20892, USA
- Pamela J Thomas
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Clayton Deming
  - National Human Genome Research Institute, Bethesda, MD 20892, USA
- Morgan Park
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Anna F Lau
  - National Institutes of Health Clinical Center, Bethesda, MD 20892, USA
- John P Dekker
  - National Institutes of Health Clinical Center, Bethesda, MD 20892, USA
- Evan S Snitkin
  - National Human Genome Research Institute, Bethesda, MD 20892, USA
- Khai Luong
  - Pacific Biosciences, Menlo Park, CA 94025, USA
- Yi Song
  - Pacific Biosciences, Menlo Park, CA 94025, USA
- Jyoti Dayal
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Shelise Y Brooks
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Brian Schmidt
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Alice C Young
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- James W Thomas
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Gerard G Bouffard
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- Robert W Blakesley
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- James C Mullikin
  - National Institutes of Health Intramural Sequencing Center (NISC), Bethesda, MD 20852, USA
- David K Henderson
  - National Institutes of Health Clinical Center, Bethesda, MD 20892, USA
- Karen M Frank
  - National Institutes of Health Clinical Center, Bethesda, MD 20892, USA
- Tara N Palmore
  - National Institutes of Health Clinical Center, Bethesda, MD 20892, USA
- Julia A Segre
  - National Human Genome Research Institute, Bethesda, MD 20892, USA
21
Dissemination of cephalosporin resistance genes between Escherichia coli strains from farm animals and humans by specific plasmid lineages. PLoS Genet 2014; 10:e1004776. [PMID: 25522320 PMCID: PMC4270446 DOI: 10.1371/journal.pgen.1004776] [Citation(s) in RCA: 247] [Impact Index Per Article: 22.5] [Received: 05/20/2014] [Accepted: 09/24/2014] [Indexed: 11/19/2022] Open
Abstract
Third-generation cephalosporins are a class of β-lactam antibiotics that are often used for the treatment of human infections caused by Gram-negative bacteria, especially Escherichia coli. Worryingly, the incidence of human infections caused by third-generation cephalosporin-resistant E. coli is increasing worldwide. Recent studies have suggested that these E. coli strains, and their antibiotic resistance genes, can spread from food-producing animals, via the food-chain, to humans. However, these studies used traditional typing methods, which may not have provided sufficient resolution to reliably assess the relatedness of these strains. We therefore used whole-genome sequencing (WGS) to study the relatedness of cephalosporin-resistant E. coli from humans, chicken meat, poultry and pigs. One strain collection included pairs of human and poultry-associated strains that had previously been considered to be identical based on Multi-Locus Sequence Typing, plasmid typing and antibiotic resistance gene sequencing. The second collection included isolates from farmers and their pigs. WGS analysis revealed considerable heterogeneity between human and poultry-associated isolates. The most closely related pairs of strains from both sources carried 1263 Single-Nucleotide Polymorphisms (SNPs) per Mbp core genome. In contrast, epidemiologically linked strains from humans and pigs differed by only 1.8 SNPs per Mbp core genome. WGS-based plasmid reconstructions revealed three distinct plasmid lineages (IncI1- and IncK-type) that carried cephalosporin resistance genes of the Extended-Spectrum Beta-Lactamase (ESBL)- and AmpC-types. The plasmid backbones within each lineage were virtually identical and were shared by genetically unrelated human and animal isolates. Plasmid reconstructions from short-read sequencing data were validated by long-read DNA sequencing for two strains. Our findings failed to demonstrate evidence for recent clonal transmission of cephalosporin-resistant E. coli strains from poultry to humans, as has been suggested based on traditional, low-resolution typing methods. Instead, our data suggest that cephalosporin resistance genes are mainly disseminated in animals and humans via distinct plasmids.
The rapid global rise of infections caused by Escherichia coli that are resistant to clinically relevant antimicrobials, including third-generation cephalosporins, is cause for concern. The intestinal tract of livestock, in particular poultry, is an important reservoir for drug resistant E. coli, but it is unknown to what extent these bacteria can spread to humans. Food is thought to be an important source because drug-resistant E. coli have been detected in animals raised for meat consumption and in meat products. Previous studies that used traditional, low-resolution, genetic typing methods found that drug resistant E. coli present in humans and poultry were indistinguishable from each other, suggesting dissemination of these bacteria through the food-chain to humans. However, by applying high-resolution, whole-genome sequencing methods, we did not find evidence for such transmission of bacteria through the food-chain. Instead, by employing a novel approach for the reconstruction of mobile genetic elements from whole-genome sequence data, we discovered that genetically unrelated E. coli isolates from both humans and animal sources carried nearly identical plasmids that encode third-generation cephalosporin resistance determinants. Our data suggest that cephalosporin resistance is mainly disseminated via the transfer of mobile genetic elements between animals and humans.
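The abstract above reports strain relatedness as SNPs per Mbp of core genome. As a minimal sketch of how such a density is normalized, the following helper divides a SNP count by the core-genome length in megabases. The function name and the input values are illustrative assumptions, not data or code from the cited study.

```python
def snps_per_mbp(snp_count: int, core_genome_bp: int) -> float:
    """Normalize a pairwise SNP count to SNPs per megabase of core genome."""
    if core_genome_bp <= 0:
        raise ValueError("core genome length must be positive")
    return snp_count * 1_000_000 / core_genome_bp


# Hypothetical pair of isolates: 9 SNPs across a 5 Mbp core genome.
closely_related = snps_per_mbp(9, 5_000_000)

# Hypothetical pair of isolates: 5052 SNPs across a 4 Mbp core genome.
distantly_related = snps_per_mbp(5052, 4_000_000)

print(f"{closely_related:.1f} SNPs/Mbp vs {distantly_related:.0f} SNPs/Mbp")
```

Normalizing by core-genome length makes pairs comparable even when the shared core differs in size between comparisons.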
22
Schatz MC, Maron LG, Stein JC, Wences AH, Gurtowski J, Biggers E, Lee H, Kramer M, Antoniou E, Ghiban E, Wright MH, Chia JM, Ware D, McCouch SR, McCombie WR. Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica. Genome Biol 2014. [PMID: 25468217 PMCID: PMC4268812 DOI: 10.1186/s13059-014-0506-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 12/18/2022] Open
Abstract
Background: The use of high throughput genome-sequencing technologies has uncovered a large extent of structural variation in eukaryotic genomes that makes important contributions to genomic diversity and phenotypic variation. When the genomes of different strains of a given organism are compared, whole genome resequencing data are typically aligned to an established reference sequence. However, when the reference differs in significant structural ways from the individuals under study, the analysis is often incomplete or inaccurate. Results: Here, we use rice as a model to demonstrate how improvements in sequencing and assembly technology allow rapid and inexpensive de novo assembly of next generation sequence data into high-quality assemblies that can be directly compared using whole genome alignment to provide an unbiased assessment. Using this approach, we are able to accurately assess the ‘pan-genome’ of three divergent rice varieties and document several megabases of each genome absent in the other two. Conclusions: Many of the genome-specific loci are annotated to contain genes, reflecting the potential for new biological properties that would be missed by standard reference-mapping approaches. We further provide a detailed analysis of several loci associated with agriculturally important traits, including the S5 hybrid sterility locus, the Sub1 submergence tolerance locus, the LRK gene cluster associated with improved yield, and the Pup1 cluster associated with phosphorus deficiency, illustrating the utility of our approach for biological discovery. All of the data and software are openly available to support further breeding and functional studies of rice and other species.
23
One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly. Curr Opin Microbiol 2014; 23:110-20. [PMID: 25461581 DOI: 10.1016/j.mib.2014.11.014] [Citation(s) in RCA: 269] [Impact Index Per Article: 24.5] [Received: 07/07/2014] [Revised: 11/17/2014] [Accepted: 11/18/2014] [Indexed: 11/20/2022]
Abstract
Like a jigsaw puzzle with large pieces, a genome sequenced with long reads is easier to assemble. However, recent sequencing technologies have favored lowering per-base cost at the expense of read length. This has dramatically reduced sequencing cost, but resulted in fragmented assemblies, which negatively affect downstream analyses and hinder the creation of finished (gapless, high-quality) genomes. In contrast, emerging long-read sequencing technologies can now produce reads tens of kilobases in length, enabling the automated finishing of microbial genomes for under $1000. This promises to improve the quality of reference databases and facilitate new studies of chromosomal structure and variation. We present an overview of these new technologies and the methods used to assemble long reads into complete genomes.
24
Scott D, Ely B. Comparison of genome sequencing technology and assembly methods for the analysis of a GC-rich bacterial genome. Curr Microbiol 2014; 70:338-44. [PMID: 25377284 DOI: 10.1007/s00284-014-0721-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 07/25/2014] [Accepted: 09/28/2014] [Indexed: 01/13/2023]
Abstract
Improvements in technology and decreases in price have made de novo bacterial genomic sequencing a reality for many researchers, but it has created a need to evaluate the methods for generating a complete and accurate genome assembly. We sequenced the GC-rich Caulobacter henricii genome using the Illumina MiSeq, Roche 454, and Pacific Biosciences RS II sequencing systems. To generate a complete genome sequence, we performed assemblies using eight readily available programs and found that builds using the Illumina MiSeq and the Roche 454 data produced accurate yet numerous contigs. SPAdes performed the best followed by PANDAseq. In contrast, the Celera assembler produced a single genomic contig using the Pacific Biosciences data after error correction with the Illumina MiSeq data. In addition, we duplicated this build using the Pacific Biosciences data with HGAP2.0. The accuracy of these builds was verified by pulsed-field gel electrophoresis of genomic DNA cut with restriction enzymes.
Affiliation(s)
- Derrick Scott
  - Department of Biological Sciences, University of South Carolina, Columbia, SC, 29208, USA
25
Endersen L, Guinane CM, Johnston C, Neve H, Coffey A, Ross RP, McAuliffe O, O'Mahony J. Genome analysis of Cronobacter phage vB_CsaP_Ss1 reveals an endolysin with potential for biocontrol of Gram-negative bacterial pathogens. J Gen Virol 2014; 96:463-477. [PMID: 25371517 DOI: 10.1099/vir.0.068494-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022] Open
Abstract
Bacteriophages and their derivatives are continuously gaining impetus as viable alternative therapeutic agents to control harmful multidrug-resistant bacterial pathogens, particularly in the food industry. The reduced efficacy of conventional antibiotics has resulted in a quest to find novel alternatives in the war against infectious disease. This study describes the full-genome sequence of Cronobacter phage vB_CsaP_Ss1, with subsequent cloning and expression of its endolysin, capable of hydrolysing Gram-negative peptidoglycan. Cronobacter phage vB_CsaP_Ss1 is composed of 42 205 bp of dsDNA with a G+C content of 46.1 mol%. A total of 57 ORFs were identified of which 18 could be assigned a putative function based on similarity to characterized proteins. The genome of Cronobacter phage vB_CsaP_Ss1 showed little similarity to any other bacteriophage genomes available in the database and thus was considered unique. In addition, functional analysis of the predicted endolysin (LysSs1) was also investigated. Zymographic experiments demonstrated the hydrolytic activity of LysSs1 against Gram-negative peptidoglycan, and this endolysin thus represents a novel candidate with potential for use against Gram-negative pathogens.
Affiliation(s)
- Lorraine Endersen
- Department of Biological Sciences, Cork Institute of Technology, Cork, Ireland
- Caitriona M Guinane
- Biotechnology Department, Teagasc, Moorepark Food Research Centre, Fermoy, County Cork, Ireland
- Horst Neve
- Department of Microbiology, Max Rubner-Institut, Federal Research Institute of Nutrition and Food, Hermann-Weigmann-Strasse 1, Kiel, Germany
- Aidan Coffey
- Department of Biological Sciences, Cork Institute of Technology, Cork, Ireland
- R Paul Ross
- Biotechnology Department, Teagasc, Moorepark Food Research Centre, Fermoy, County Cork, Ireland
- Olivia McAuliffe
- Biotechnology Department, Teagasc, Moorepark Food Research Centre, Fermoy, County Cork, Ireland
- Jim O'Mahony
- Department of Biological Sciences, Cork Institute of Technology, Cork, Ireland
|
26
|
Bao E, Jiang T, Girke T. AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references. Bioinformatics 2014; 30:i319-i328. [PMID: 24932000 PMCID: PMC4058956 DOI: 10.1093/bioinformatics/btu291] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 01/08/2023]
Abstract
Motivation: De novo assemblies of genomes remain one of the most challenging applications in next-generation sequencing. Usually, their results are incomplete and fragmented into hundreds of contigs. Repeats in genomes and sequencing errors are the main reasons for these complications. With the rapidly growing number of sequenced genomes, it is now feasible to improve assemblies by guiding them with genomes from related species. Results: Here we introduce AlignGraph, an algorithm for extending and joining de novo-assembled contigs or scaffolds guided by closely related reference genomes. It aligns paired-end (PE) reads and preassembled contigs or scaffolds to a close reference. From the obtained alignments, it builds a novel data structure, called the PE multipositional de Bruijn graph. The incorporated positional information from the alignments and PE reads allows us to extend the initial assemblies, while avoiding incorrect extensions and early terminations. In our performance tests, AlignGraph was able to substantially improve the contigs and scaffolds from several assemblers. For instance, 28.7–62.3% of the contigs of Arabidopsis thaliana and human could be extended, resulting in improvements of common assembly metrics, such as an increase of the N50 of the extendable contigs by 89.9–94.5% and 80.3–165.8%, respectively. In another test, AlignGraph was able to improve the assembly of a published genome (Arabidopsis strain Landsberg) by increasing the N50 of its extendable scaffolds by 86.6%. These results demonstrate AlignGraph’s efficiency in improving genome assemblies by taking advantage of closely related references. Availability and implementation: The AlignGraph software can be downloaded for free from this site: https://github.com/baoe/AlignGraph. Contact: thomas.girke@ucr.edu
Affiliation(s)
- Ergude Bao
- Department of Computer Science and Engineering and Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
- Tao Jiang
- Department of Computer Science and Engineering and Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
- Thomas Girke
- Department of Computer Science and Engineering and Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
|
27
|
Ekblom R, Wolf JBW. A field guide to whole-genome sequencing, assembly and annotation. Evol Appl 2014; 7:1026-42. [PMID: 25553065 PMCID: PMC4231593 DOI: 10.1111/eva.12178] [Citation(s) in RCA: 191] [Impact Index Per Article: 17.4] [Received: 02/04/2014] [Accepted: 05/20/2014] [Indexed: 12/12/2022]
Abstract
Genome sequencing projects were long confined to biomedical model organisms and required the concerted effort of large consortia. Rapid progress in high-throughput sequencing technology and the simultaneous development of bioinformatic tools have democratized the field. It is now within reach for individual research groups in the eco-evolutionary and conservation community to generate de novo draft genome sequences for any organism of choice. Because of the cost and considerable effort involved in such an endeavour, the important first step is to thoroughly consider whether a genome sequence is necessary for addressing the biological question at hand. Once this decision is taken, a genome project requires careful planning with respect to the organism involved and the intended quality of the genome draft. Here, we briefly review the state of the art within this field and provide a step-by-step introduction to the workflow involved in genome sequencing, assembly and annotation with particular reference to large and complex genomes. This tutorial is targeted at scientists with a background in conservation genetics, but more generally, provides useful practical guidance for researchers engaging in whole-genome sequencing projects.
Affiliation(s)
- Robert Ekblom
- Department of Evolutionary Biology, Uppsala University Uppsala, Sweden
- Jochen B W Wolf
- Department of Evolutionary Biology, Uppsala University Uppsala, Sweden
|
28
|
Abstract
Lactobacillus animalis 381-IL-28 is an integral component of a multistrain commercial culture with food biopreservative and pathogen biocontrol functionality. A draft sequence of the L. animalis 381-IL-28 genome is described in this paper.
|
29
|
Cavanagh D, Guinane CM, Neve H, Coffey A, Ross RP, Fitzgerald GF, McAuliffe O. Phages of non-dairy lactococci: isolation and characterization of ΦL47, a phage infecting the grass isolate Lactococcus lactis ssp. cremoris DPC6860. Front Microbiol 2014; 4:417. [PMID: 24454309 PMCID: PMC3888941 DOI: 10.3389/fmicb.2013.00417] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 10/31/2013] [Accepted: 12/18/2013] [Indexed: 11/17/2022]
Abstract
Lactococci isolated from non-dairy sources have been found to possess enhanced metabolic activity when compared to dairy strains. These capabilities may be harnessed through the use of these strains as starter or adjunct cultures to produce more diverse flavor profiles in cheese and other dairy products. To understand the interactions between these organisms and the phages that infect them, a number of phages were isolated against lactococcal strains of non-dairy origin. One such phage, ΦL47, was isolated from a sewage sample using the grass isolate L. lactis ssp. cremoris DPC6860 as a host. Visualization of phage virions by transmission electron microscopy established that this phage belongs to the family Siphoviridae and possesses a long tail fiber, previously unseen in dairy lactococcal phages. Determination of the lytic spectrum revealed a broader than expected host range, with ΦL47 capable of infecting 4 industrial dairy strains, including ML8, HP and 310, and 3 additional non-dairy isolates. Whole genome sequencing of ΦL47 revealed a dsDNA genome of 128,546 bp, making it the largest sequenced lactococcal phage to date. In total, 190 open reading frames (ORFs) were identified, and comparative analysis revealed that the predicted products of 117 of these ORFs shared greater than 50% amino acid identity with those of L. lactis phage Φ949, a phage isolated from cheese whey. Despite their different ecological niches, the genomic content and organization of ΦL47 and Φ949 are quite similar, with both containing 4 gene clusters oriented in different transcriptional directions. Other features that distinguish ΦL47 from Φ949 and other lactococcal phages, in addition to the presence of the tail fiber and the genome length, include a low GC content (32.5%) and a high number of predicted tRNA genes (8). Comparative genome analysis supports the conclusion that ΦL47 is a new member of the 949 lactococcal phage group which currently includes the dairy Φ949.
Affiliation(s)
- Daniel Cavanagh
- Department of Food Biosciences, Teagasc Food Research Centre Fermoy, Ireland; Department of Microbiology, University College Cork Co. Cork, Ireland
- Caitriona M Guinane
- Department of Food Biosciences, Teagasc Food Research Centre Fermoy, Ireland
- Horst Neve
- Department of Microbiology and Biotechnology, Max Rubner-Institut, Federal Research Institute of Nutrition and Food Kiel, Germany
- Aidan Coffey
- Department of Biological Sciences, Cork Institute of Technology Co. Cork, Ireland
- R Paul Ross
- Department of Food Biosciences, Teagasc Food Research Centre Fermoy, Ireland
- Olivia McAuliffe
- Department of Food Biosciences, Teagasc Food Research Centre Fermoy, Ireland
|
30
|
Pavlopoulos GA, Oulas A, Iacucci E, Sifrim A, Moreau Y, Schneider R, Aerts J, Iliopoulos I. Unraveling genomic variation from next generation sequencing data. BioData Min 2013; 6:13. [PMID: 23885890 PMCID: PMC3726446 DOI: 10.1186/1756-0381-6-13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 03/22/2013] [Accepted: 07/18/2013] [Indexed: 12/29/2022]
Abstract
Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of these areas, which are being shaped by the new technological advances, involve evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools to cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment to serve these analyses and we describe the data formats of the files which get produced by them. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses and we discuss how future applications could further develop in this field.
Affiliation(s)
- Georgios A Pavlopoulos
- Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece.
|
31
|
Shin SC, Ahn DH, Kim SJ, Lee H, Oh TJ, Lee JE, Park H. Advantages of Single-Molecule Real-Time Sequencing in High-GC Content Genomes. PLoS One 2013; 8:e68824. [PMID: 23894349 PMCID: PMC3720884 DOI: 10.1371/journal.pone.0068824] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 01/24/2013] [Accepted: 06/03/2013] [Indexed: 02/02/2023]
Abstract
Next-generation sequencing has become the most widely used sequencing technology in genomics research, but it has inherent drawbacks when dealing with high-GC content genomes. Recently, single-molecule real-time sequencing technology (SMRT) was introduced as a third-generation sequencing strategy to compensate for this drawback. Here, we report that the unbiased and longer read length of SMRT sequencing markedly improved genome assembly with high GC content via gap filling and repeat resolution.
Affiliation(s)
- Do Hwan Ahn
- Korea Polar Research Institute, Yeonsu-gu, Incheon, Korea
- University of Science & Technology, Yuseong-gu, Daejeon, Korea
- Su Jin Kim
- College of Life Sciences and Biotechnology, Korea University, Seongbuk-gu, Seoul, Korea
- Hyoungseok Lee
- Korea Polar Research Institute, Yeonsu-gu, Incheon, Korea
- Tae-Jin Oh
- Department of Pharmaceutical Engineering, SunMoon University, Asan, Korea
- Hyun Park
- Korea Polar Research Institute, Yeonsu-gu, Incheon, Korea
- University of Science & Technology, Yuseong-gu, Daejeon, Korea
|
32
|
Bzhalava D, Johansson H, Ekström J, Faust H, Möller B, Eklund C, Nordin P, Stenquist B, Paoli J, Persson B, Forslund O, Dillner J. Unbiased approach for virus detection in skin lesions. PLoS One 2013; 8:e65953. [PMID: 23840382 PMCID: PMC3696016 DOI: 10.1371/journal.pone.0065953] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 11/20/2012] [Accepted: 05/01/2013] [Indexed: 01/28/2023]
Abstract
To assess the presence of virus DNA in skin lesions, swab samples from 82 squamous cell carcinomas of the skin (SCCs), 60 actinic keratoses (AKs), paraffin-embedded biopsies from 28 SCCs and 72 keratoacanthomas (KAs) and fresh-frozen biopsies from 92 KAs, 85 SCCs and 92 AKs were analyzed by high throughput sequencing (HTS) using 454 or Ion Torrent technology. We found a total of 4,284 viral reads, out of which 4,168 were Human Papillomavirus (HPV)-related, belonging to 15 known (HPV8, HPV12, HPV20, HPV36, HPV38, HPV45, HPV57, HPV59, HPV104, HPV105, HPV107, HPV109, HPV124, HPV138, HPV147), four previously described putative (HPV 915 F 06 007 FD1, FA73, FA101, SE42) and two putatively new HPV types (SE46, SE47). SE42 was cloned, sequenced, designated as HPV155 and found to have 76% similarity to the most closely related known HPV type. In conclusion, an unbiased approach for viral DNA detection in skin tumors has found that, although some new putative HPVs were found, known HPV types constituted most of the viral DNA.
Affiliation(s)
- Davit Bzhalava
- Departments of Laboratory Medicine, Medical Epidemiology & Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Departments of Clinical Microbiology and Pathology, Karolinska Hospital, Stockholm, Sweden
- Hanna Johansson
- Department of Medical Microbiology, Skåne University Hospital, Lund University, Malmö, Sweden
- Johanna Ekström
- Departments of Laboratory Medicine, Medical Epidemiology & Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Departments of Clinical Microbiology and Pathology, Karolinska Hospital, Stockholm, Sweden
- Department of Medical Microbiology, Skåne University Hospital, Lund University, Malmö, Sweden
- Helena Faust
- Department of Medical Microbiology, Skåne University Hospital, Lund University, Malmö, Sweden
- Birgitta Möller
- Departments of Laboratory Medicine, Medical Epidemiology & Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Departments of Clinical Microbiology and Pathology, Karolinska Hospital, Stockholm, Sweden
- Carina Eklund
- Departments of Laboratory Medicine, Medical Epidemiology & Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Departments of Clinical Microbiology and Pathology, Karolinska Hospital, Stockholm, Sweden
- Peter Nordin
- Dermatology Clinic, Läkarhuset, Gothenburg, Sweden
- Bo Stenquist
- Department of Dermatology and Venereology, Sahlgrenska University Hospital, Institute of Clinical Sciences at the Sahlgrenska Academy, University of Gothenburg, Sweden
- John Paoli
- Department of Dermatology and Venereology, Sahlgrenska University Hospital, Institute of Clinical Sciences at the Sahlgrenska Academy, University of Gothenburg, Sweden
- Bengt Persson
- IFM Bioinformatics and Swedish e-Science Research Centre, Linköping University, Linköping, Sweden
- Ola Forslund
- Department of Medical Microbiology, Skåne University Hospital, Lund University, Malmö, Sweden
- Joakim Dillner
- Departments of Laboratory Medicine, Medical Epidemiology & Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Departments of Clinical Microbiology and Pathology, Karolinska Hospital, Stockholm, Sweden
- Department of Medical Microbiology, Skåne University Hospital, Lund University, Malmö, Sweden
|
33
|
Sedeek KEM, Qi W, Schauer MA, Gupta AK, Poveda L, Xu S, Liu ZJ, Grossniklaus U, Schiestl FP, Schlüter PM. Transcriptome and proteome data reveal candidate genes for pollinator attraction in sexually deceptive orchids. PLoS One 2013; 8:e64621. [PMID: 23734209 PMCID: PMC3667177 DOI: 10.1371/journal.pone.0064621] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 11/28/2012] [Accepted: 04/17/2013] [Indexed: 01/28/2023]
Abstract
BACKGROUND Sexually deceptive orchids of the genus Ophrys mimic the mating signals of their pollinator females to attract males as pollinators. This mode of pollination is highly specific and leads to strong reproductive isolation between species. This study aims to identify candidate genes responsible for pollinator attraction and reproductive isolation between three closely related species, O. exaltata, O. sphegodes and O. garganica. Floral traits such as odour, colour and morphology are necessary for successful pollinator attraction. In particular, different odour hydrocarbon profiles have been linked to differences in specific pollinator attraction among these species. Therefore, the identification of genes involved in these traits is important for understanding the molecular basis of pollinator attraction by sexually deceptive orchids. RESULTS We have created floral reference transcriptomes and proteomes for these three Ophrys species using a combination of next-generation sequencing (454 and Solexa), Sanger sequencing, and shotgun proteomics (tandem mass spectrometry). In total, 121 917 unique transcripts and 3531 proteins were identified. This represents the first orchid proteome and transcriptome from the orchid subfamily Orchidoideae. Proteome data revealed proteins corresponding to 2644 transcripts and 887 proteins not observed in the transcriptome. Candidate genes for hydrocarbon and anthocyanin biosynthesis were represented by 156 and 61 unique transcripts in 20 and 7 gene classes, respectively. Moreover, transcription factors putatively involved in the regulation of flower odour, colour and morphology were annotated, including Myb, MADS and TCP factors. CONCLUSION Our comprehensive data set generated by combining transcriptome and proteome technologies allowed identification of candidate genes for pollinator attraction and reproductive isolation among sexually deceptive orchids. This includes genes for hydrocarbon and anthocyanin biosynthesis and regulation, and the development of floral morphology. These data will serve as an invaluable resource for research in orchid floral biology, enabling studies into the molecular mechanisms of pollinator attraction and speciation.
Affiliation(s)
- Khalid E M Sedeek
- Institute of Systematic Botany & Zürich-Basel Plant Science Centre, University of Zurich, Zürich, Switzerland
|
34
|
Abstract
Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community.
|
35
|
Schatz MC, Witkowski J, McCombie WR. Current challenges in de novo plant genome sequencing and assembly. Genome Biol 2013; 13:243. [PMID: 22546054 DOI: 10.1186/gb4015] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Indexed: 12/14/2022]
Abstract
Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community.
Affiliation(s)
- Michael C Schatz
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
|
36
|
Abstract
Advances in sequencing technologies and increased access to sequencing services have led to renewed interest in sequence and genome assembly. Concurrently, new applications for sequencing have emerged, including gene expression analysis, discovery of genomic variants and metagenomics, and each of these has different needs and challenges in terms of assembly. We survey the theoretical foundations that underlie modern assembly and highlight the options and practical trade-offs that need to be considered, focusing on how individual features address the needs of specific applications. We also review key software and the interplay between experimental design and efficacy of assembly.
Affiliation(s)
- Niranjan Nagarajan
- Computational and Systems Biology, Genome Institute of Singapore, 138672 Singapore
|
37
|
Treangen TJ, Koren S, Sommer DD, Liu B, Astrovskaya I, Ondov B, Darling AE, Phillippy AM, Pop M. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline. Genome Biol 2013; 14:R2. [PMID: 23320958 PMCID: PMC4053804 DOI: 10.1186/gb-2013-14-1-r2] [Citation(s) in RCA: 154] [Impact Index Per Article: 12.8] [Received: 06/04/2012] [Accepted: 01/15/2013] [Indexed: 12/31/2022]
Abstract
We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing computational cost. MetAMOS can be downloaded from: https://github.com/treangen/MetAMOS.
|
38
|
Abstract
Fibrisoma limi strain BUZ 3(T), a Gram-negative bacterium, was isolated from coastal mud from the North Sea (Fedderwardersiel, Germany) and characterized using a polyphasic approach in 2011. The genome consists of a chromosome of about 7.5 Mb and three plasmids.
|
39
|
Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol 2012; 30:693-700. [PMID: 22750884 PMCID: PMC3707490 DOI: 10.1038/nbt.2280] [Citation(s) in RCA: 704] [Impact Index Per Article: 54.2] [Received: 10/26/2011] [Accepted: 05/18/2012] [Indexed: 01/18/2023]
Abstract
Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.
|
40
|
Genome sequence of Fibrella aestuarina BUZ 2(T), a filamentous marine bacterium. J Bacteriol 2012; 194:3555. [PMID: 22689241 DOI: 10.1128/jb.00550-12] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
Abstract
Fibrella aestuarina BUZ 2(T) is the type strain of the recently characterized genus Fibrella. Here we report the draft genome sequence of this strain, which consists of a single scaffold representing the chromosome (with 11 gaps) and a 161-kb circular plasmid.
|