1
Opulente DA, LaBella AL, Harrison MC, Wolters JF, Liu C, Li Y, Kominek J, Steenwyk JL, Stoneman HR, VanDenAvond J, Miller CR, Langdon QK, Silva M, Gonçalves C, Ubbelohde EJ, Li Y, Buh KV, Jarzyna M, Haase MAB, Rosa CA, Čadež N, Libkind D, DeVirgilio JH, Hulfachor AB, Kurtzman CP, Sampaio JP, Gonçalves P, Zhou X, Shen XX, Groenewald M, Rokas A, Hittinger CT. Genomic factors shape carbon and nitrogen metabolic niche breadth across Saccharomycotina yeasts. Science 2024; 384:eadj4503. [PMID: 38662846 PMCID: PMC11298794 DOI: 10.1126/science.adj4503] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/28/2023] [Accepted: 03/22/2024] [Indexed: 05/03/2024]
Abstract
Organisms exhibit extensive variation in ecological niche breadth, from very narrow (specialists) to very broad (generalists). Two general paradigms have been proposed to explain this variation: (i) trade-offs between performance efficiency and breadth and (ii) the joint influence of extrinsic (environmental) and intrinsic (genomic) factors. We assembled genomic, metabolic, and ecological data from nearly all known species of the ancient fungal subphylum Saccharomycotina (1154 yeast strains from 1051 species), grown in 24 different environmental conditions, to examine niche breadth evolution. We found that large differences in the breadth of carbon utilization traits between yeasts stem from intrinsic differences in genes encoding specific metabolic pathways, but we found limited evidence for trade-offs. These comprehensive data argue that intrinsic factors shape niche breadth variation in microbes.
Affiliation(s)
- Dana A. Opulente
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Biology Department, Villanova University, Villanova, PA 19085, USA
- Abigail Leavitt LaBella
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - North Carolina Research Center (NCRC), Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 150 Research Campus Drive, Kannapolis, NC 28081, USA
- Marie-Claire Harrison
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- John F. Wolters
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Chao Liu
  - College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China
- Yonglin Li
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Jacek Kominek
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - LifeMine Therapeutics, Inc., Cambridge, MA 02140, USA
- Jacob L. Steenwyk
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Hayley R. Stoneman
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - University of Colorado - Anschutz Medical Campus, Aurora, CO 80045, USA
- Jenna VanDenAvond
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Caroline R. Miller
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Quinn K. Langdon
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- Margarida Silva
  - UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
  - Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Carla Gonçalves
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
  - Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Emily J. Ubbelohde
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Yuanning Li
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Institute of Marine Science and Technology, Shandong University, Qingdao 266237, China
  - Laboratory for Marine Biology and Biotechnology, Qingdao Marine Science and Technology Center, Qingdao 266237, China
- Kelly V. Buh
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- Martin Jarzyna
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Graduate Program in Neuroscience and Department of Biology, Washington University School of Medicine, St. Louis, MO 63130, USA
- Max A. B. Haase
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Vilcek Institute of Graduate Biomedical Sciences and Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
  - Department of Mechanistic Cell Biology, Max Planck Institute of Molecular Physiology, 44227 Dortmund, Germany
- Carlos A. Rosa
  - Departamento de Microbiologia, ICB, C.P. 486, Universidade Federal de Minas Gerais, Belo Horizonte, MG, 31270-901, Brazil
- Neža Čadež
  - Food Science and Technology Department, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Diego Libkind
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Universidad Nacional del Comahue, CONICET, CRUB, Quintral 1250, San Carlos de Bariloche, 8400, Río Negro, Argentina
- Jeremy H. DeVirgilio
  - Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
- Amanda Beth Hulfachor
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Cletus P. Kurtzman
  - Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
- José Paulo Sampaio
  - UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
  - Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Paula Gonçalves
  - UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
  - Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Xiaofan Zhou
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Xing-Xing Shen
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Chris Todd Hittinger
  - Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
2
Chiu KP, Stuart L, Ooi HS, Yu J, Smith DG, Pei KJC. Genome sequencing and application of Taiwanese macaque Macaca cyclopis. Sci Rep 2023; 13:11545. [PMID: 37460589 DOI: 10.1038/s41598-023-38402-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 07/07/2023] [Indexed: 07/20/2023] [Open access]
Abstract
The Formosan macaque (Macaca cyclopis) is the only non-human primate on the island of Taiwan. We performed de novo hybrid assembly of M. cyclopis using Illumina paired-end short reads, mate-pair reads, and Nanopore long reads, and obtained 5065 contigs with an N50 of 2.66 megabases. M. cyclopis contigs ≥ 10 kb were assigned to chromosomes using the Indian rhesus macaque (Macaca mulatta mulatta) genome assembly Mmul_10 as reference, resulting in a draft M. cyclopis genome of 2,846,042,475 bases distributed across 21 chromosomes. The draft genome contains 23,462 transcriptional origins (genes), capable of expressing 716,231 exons in 59,484 transcripts. A genome-based phylogenetic study using the assembled M. cyclopis genome together with the genomes of four other macaque species, human, orangutan, and chimpanzee gave results similar to those previously reported. However, M. cyclopis was found to have diverged from the Chinese M. mulatta lasiota about 1.8 million years ago. Fossil gene analysis detected gag and pol endogenous viral elements of simian retrovirus in all macaques tested, including M. fascicularis, M. m. mulatta, and M. cyclopis. However, M. cyclopis carried about half as many of these elements, and they were more uniformly distributed across chromosomes. This limited disturbance by foreign genomes, presumably due to geographical isolation, should simplify genomics-related investigations, making M. cyclopis an ideal primate species for medical research.
Affiliation(s)
- Kuo-Ping Chiu
  - Genomics Research Center, Academia Sinica, Taipei, Taiwan
  - Top Science Biotechnologies, Inc., 4F, 50-2 Dingping Rd., Sec. 1, Shiding District, New Taipei City, 223002, Taiwan
- Lutimba Stuart
  - Top Science Biotechnologies, Inc., 4F, 50-2 Dingping Rd., Sec. 1, Shiding District, New Taipei City, 223002, Taiwan
- Hong Sain Ooi
  - Top Science Biotechnologies, Inc., 4F, 50-2 Dingping Rd., Sec. 1, Shiding District, New Taipei City, 223002, Taiwan
- John Yu
  - Institute of Stem Cell and Translational Cancer Research, Chang Gung Memorial Hospital at Linkou, No.5, Fu-Shin St., Kuei Shang, Taoyuan, 333, Taiwan
- David Glenn Smith
  - Department of Anthropology, University of California Davis, Davis, CA, USA
- Kurtis Jai-Chyi Pei
  - Institute of Wildlife Conservation, College of Veterinary Medicine, National Pingtung University of Science and Technology, Pingtung, Taiwan
3
Cheng C, Fei Z, Xiao P. Methods to improve the accuracy of next-generation sequencing. Front Bioeng Biotechnol 2023; 11:982111. [PMID: 36741756 PMCID: PMC9895957 DOI: 10.3389/fbioe.2023.982111] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Accepted: 01/11/2023] [Indexed: 01/21/2023] [Open access]
Abstract
Next-generation sequencing (NGS) is now used in all fields of life science; it has greatly promoted basic research and is gradually being applied in clinical diagnosis. However, the cost and throughput advantages of NGS are offset by large tradeoffs in read length and accuracy. Specifically, its high error rate makes it extremely difficult to detect SNPs or low-abundance mutations, limiting clinical applications such as pharmacogenomics studies, which rely primarily on SNPs, and early clinical diagnosis, which relies primarily on low-abundance mutations. Sanger sequencing is still considered the gold standard because of its high accuracy, so NGS results require verification by Sanger sequencing in clinical practice. To maintain high-quality NGS data, a variety of improvements have been developed at the levels of template preparation, sequencing strategy, and data processing. This study summarizes the general procedures of NGS platforms, highlighting the improvements that eliminate errors at each step. The challenges and future development of NGS in clinical applications are also discussed.
4
Dufault‐Thompson K, Jiang X. Applications of de Bruijn graphs in microbiome research. iMeta 2022; 1:e4. [PMID: 38867733 PMCID: PMC10989854 DOI: 10.1002/imt2.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Revised: 01/24/2022] [Accepted: 01/24/2022] [Indexed: 06/14/2024]
Abstract
High-throughput sequencing has become an increasingly central component of microbiome research. The development of de Bruijn graph-based methods for assembling high-throughput sequencing data has been an important part of the broader adoption of sequencing as part of biological studies. Recent advances in the construction and representation of de Bruijn graphs have led to new approaches that utilize the de Bruijn graph data structure to aid in different biological analyses. One type of application of these methods has been in alternative approaches to the assembly of sequencing data like gene-targeted assembly, where only gene sequences are assembled out of larger metagenomes, and differential assembly, where sequences that are differentially present between two samples are assembled. de Bruijn graphs have also been applied for comparative genomics where they can be used to represent large sets of multiple genomes or metagenomes where structural features in the graphs can be used to identify variants, indels, and homologous regions in sequences. These de Bruijn graph-based representations of sequencing data have even begun to be applied to whole sequencing databases for large-scale searches and experiment discovery. de Bruijn graphs have played a central role in how high-throughput sequencing data is worked with, and the rapid development of new tools that rely on these data structures suggests that they will continue to play an important role in biology in the future.
Affiliation(s)
- Keith Dufault‐Thompson
  - Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Xiaofang Jiang
  - Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
5
An automated and combinative method for the predictive ranking of candidate effector proteins of fungal plant pathogens. Sci Rep 2021; 11:19731. [PMID: 34611252 PMCID: PMC8492765 DOI: 10.1038/s41598-021-99363-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/31/2021] [Accepted: 09/16/2021] [Indexed: 01/29/2023] [Open access]
Abstract
Fungal plant pathogens promote infection of their hosts through the release of 'effectors', a broad class of cytotoxic or virulence-promoting molecules. Effectors may be recognised by resistance or sensitivity receptors in the host, which can determine disease outcomes. Accurate prediction of effectors remains a major challenge in plant pathology, but if achieved will facilitate rapid improvements to host disease resistance. This study presents a novel tool and pipeline for the ranking of predicted effector candidates, Predector, which interfaces with multiple software tools and methods, aggregates disparate features that are relevant to fungal effector proteins, and applies a pairwise learning-to-rank approach. Predector outperformed a typical combination of secretion and effector prediction methods in terms of ranking performance when applied to a curated set of confirmed effectors derived from multiple species. We present Predector (https://github.com/ccdmb/predector) as a useful tool for the ranking of predicted effector candidates, which also aggregates and reports additional supporting information relevant to effector and secretome prediction in a simple, efficient, and reproducible manner.
6
Tollis M, Ferris E, Campbell MS, Harris VK, Rupp SM, Harrison TM, Kiso WK, Schmitt DL, Garner MM, Aktipis CA, Maley CC, Boddy AM, Yandell M, Gregg C, Schiffman JD, Abegglen LM. Elephant Genomes Reveal Accelerated Evolution in Mechanisms Underlying Disease Defenses. Mol Biol Evol 2021; 38:3606-3620. [PMID: 33944920 PMCID: PMC8383897 DOI: 10.1093/molbev/msab127] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 12/20/2022] [Open access]
Abstract
Disease susceptibility and resistance are important factors for the conservation of endangered species, including elephants. We analyzed pathology data from 26 zoos and report that Asian elephants have increased neoplasia and malignancy prevalence compared with African bush elephants. This is consistent with observed higher susceptibility to tuberculosis and elephant endotheliotropic herpesvirus (EEHV) in Asian elephants. To investigate genetic mechanisms underlying disease resistance, including differential responses between species, among other elephant traits, we sequenced multiple elephant genomes. We report a draft assembly for an Asian elephant, and defined 862 and 1,017 conserved potential regulatory elements in Asian and African bush elephants, respectively. In the genomes of both elephant species, conserved elements were significantly enriched with genes differentially expressed between the species. In Asian elephants, these putative regulatory regions were involved in immunity pathways including tumor-necrosis factor, which plays an important role in EEHV response. Genomic sequences of African bush, forest, and Asian elephant genomes revealed extensive sequence conservation at TP53 retrogene loci across three species, which may be related to TP53 functionality in elephant cancer resistance. Positive selection scans revealed outlier genes related to additional elephant traits. Our study suggests that gene regulation plays an important role in the differential inflammatory response of Asian and African elephants, leading to increased infectious disease and cancer susceptibility in Asian elephants. These genomic discoveries can inform future functional and translational studies aimed at identifying effective treatment approaches for ill elephants, which may improve conservation.
Affiliation(s)
- Marc Tollis
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ, USA
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
- Elliott Ferris
  - Department of Neurobiology and Anatomy, University of Utah, Salt Lake City, UT, USA
- Valerie K Harris
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Center for Biocomputing, Security and Society, Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Shawn M Rupp
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Center for Biocomputing, Security and Society, Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Tara M Harrison
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Department of Clinical Sciences, North Carolina State University, Raleigh, NC, USA
- Wendy K Kiso
  - Ringling Bros Center for Elephant Conservation, Polk City, FL, USA
- Dennis L Schmitt
  - Ringling Bros Center for Elephant Conservation, Polk City, FL, USA
  - William H. Darr College of Agriculture, Missouri State University, Springfield, MO, USA
- Christina Athena Aktipis
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Department of Psychology, Arizona State University, Tempe, AZ, USA
- Carlo C Maley
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Center for Biocomputing, Security and Society, Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Amy M Boddy
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Department of Anthropology, University of California, Santa Barbara, CA, USA
- Mark Yandell
  - Department of Genetics, University of Utah, Salt Lake City, UT, USA
- Christopher Gregg
  - Department of Neurobiology and Anatomy, University of Utah, Salt Lake City, UT, USA
- Joshua D Schiffman
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Department of Pediatrics & Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA
  - PEEL Therapeutics, Inc., Salt Lake City, UT, USA & Haifa, Israel
- Lisa M Abegglen
  - Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ, USA
  - Department of Pediatrics & Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA
  - PEEL Therapeutics, Inc., Salt Lake City, UT, USA & Haifa, Israel
7
Khan J, Patro R. Cuttlefish: fast, parallel and low-memory compaction of de Bruijn graphs from large-scale genome collections. Bioinformatics 2021; 37:i177-i186. [PMID: 34252958 PMCID: PMC8275350 DOI: 10.1093/bioinformatics/btab309] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 01/04/2023] [Open access]
Abstract
Motivation: The construction of the compacted de Bruijn graph from collections of reference genomes is a task of increasing interest in genomic analyses. These graphs are increasingly used as sequence indices for short- and long-read alignment. Also, as we sequence and assemble a greater diversity of genomes, the colored compacted de Bruijn graph is being used more and more as the basis for efficient methods to perform comparative genomic analyses on these genomes. Therefore, time- and memory-efficient construction of the graph from reference sequences is an important problem.
Results: We introduce a new algorithm, implemented in the tool Cuttlefish, to construct the (colored) compacted de Bruijn graph from a collection of one or more genome references. Cuttlefish introduces a novel approach of modeling de Bruijn graph vertices as finite-state automata, and constrains these automata’s state-space to enable tracking their transitioning states with very low memory usage. Cuttlefish is also fast and highly parallelizable. Experimental results demonstrate that it scales much better than existing approaches, especially as the number and the scale of the input references grow. On a typical shared-memory machine, Cuttlefish constructed the graph for 100 human genomes in under 9 h, using ∼29 GB of memory. On 11 diverse conifer plant genomes, the compacted graph was constructed by Cuttlefish in under 9 h, using ∼84 GB of memory. The only other tool completing these tasks on the hardware took over 23 h using ∼126 GB of memory, and over 16 h using ∼289 GB of memory, respectively.
Availability and implementation: Cuttlefish is implemented in C++14, and is available under an open source license at https://github.com/COMBINE-lab/cuttlefish.
Supplementary information: Supplementary data are available at Bioinformatics online.
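For readers unfamiliar with the term, the following is a minimal sketch of what "compacting" a de Bruijn graph means: every maximal non-branching path of k-mers is merged into a single string (a unitig). It is not Cuttlefish's automaton-based algorithm, and it makes simplifying assumptions that the tool does not: reverse complements are ignored, everything is held in memory, and an edge exists only between k-mers that are adjacent in some input sequence. The function name `compact_de_bruijn` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def compact_de_bruijn(seqs, k):
    """Return the sorted unitigs of the de Bruijn graph of `seqs`."""
    nodes = set()
    succ = defaultdict(set)  # k-mer -> k-mers that follow it in some input
    pred = defaultdict(set)  # k-mer -> k-mers that precede it in some input
    for s in seqs:
        kmers = [s[i:i + k] for i in range(len(s) - k + 1)]
        nodes.update(kmers)
        for a, b in zip(kmers, kmers[1:]):
            succ[a].add(b)
            pred[b].add(a)

    def mergeable(a, b):
        # Edge a -> b lies inside a unitig only if it is the sole way
        # out of a and the sole way into b.
        return len(succ[a]) == 1 and len(pred[b]) == 1

    def walk(start, seen):
        # Extend `start` to the right while the next edge is mergeable.
        path, cur = start, start
        seen.add(start)
        while len(succ[cur]) == 1:
            nxt = next(iter(succ[cur]))
            if nxt in seen or not mergeable(cur, nxt):
                break
            path += nxt[-1]
            seen.add(nxt)
            cur = nxt
        return path

    seen, unitigs = set(), []
    for v in sorted(nodes):
        # A unitig starts at v unless v merges with its unique predecessor.
        starts = not (len(pred[v]) == 1 and mergeable(next(iter(pred[v])), v))
        if starts:
            unitigs.append(walk(v, seen))
    for v in sorted(nodes):  # whatever is left lies on isolated cycles
        if v not in seen:
            unitigs.append(walk(v, seen))
    return sorted(unitigs)


# Two toy sequences that share a prefix and then diverge.
print(compact_de_bruijn(["ATGGCGT", "ATGGCAT"], k=3))
# ['ATGGC', 'GCAT', 'GCGT']
```

The shared prefix collapses into one unitig and each diverging suffix becomes its own, which is the structure that comparative analyses on compacted graphs exploit.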
Affiliation(s)
- Jamshed Khan
  - Department of Computer Science, University of Maryland, College Park, MD 20742, USA
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA
- Rob Patro
  - Department of Computer Science, University of Maryland, College Park, MD 20742, USA
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA
  - To whom correspondence should be addressed.
8
Liao X, Li M, Luo J, Zou Y, Wu FX, Luo F, Wang J. EPGA-SC: A Framework for de novo Assembly of Single-Cell Sequencing Reads. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1492-1503. [PMID: 31603794 DOI: 10.1109/tcbb.2019.2945761] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Assembling genomes from single-cell sequencing data is essential for single-cell studies. However, single-cell assemblies are challenging due to (i) highly non-uniform read coverage and (ii) elevated levels of sequencing errors and chimeric reads. Although several assemblers for single-cell data have been proposed in recent years, most of them fail to construct correct long contigs. In this study, we present a new framework called EPGA-SC for de novo assembly of single-cell sequencing reads. The EPGA assembler includes strategies designed to solve the problems caused by sequencing errors, sequencing biases, and repetitive regions. However, the extremely unbalanced coverage and richer error types of single-cell sequencing data prevent EPGA from achieving high performance on such data. We therefore designed EPGA-SC based on EPGA. The main innovations of EPGA-SC are as follows: (i) classifying reads to reduce the proportion of false reads; (ii) using multiple sets of high-precision paired-end reads generated from the high-precision assemblies produced by other assemblers such as SPAdes to overcome the impact of sequencing biases and repetitive regions; and (iii) developing novel algorithms for removing chimeric errors and extending contigs. We tested EPGA-SC on seven datasets. The experimental results show that EPGA-SC generates better assemblies than most current tools in most cases in terms of max contig, N50, NG50, NA50, and NGA50.
9
Guo G, Chen H, Yan D, Cheng J, Chen JY, Chong Z. Scalable De Novo Genome Assembly Using a Pregel-Like Graph-Parallel System. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:731-744. [PMID: 31180898 DOI: 10.1109/tcbb.2019.2920912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
De novo genome assembly is the process of stitching short DNA sequences together to generate longer DNA sequences, without using any reference sequence for alignment. It enables high-throughput genome sequencing and thus accelerates the discovery of new genomes. In this paper, we present a toolkit, called PPA-assembler, for de novo genome assembly in a distributed setting. The operations in our toolkit provide strong performance guarantees, and can be assembled to implement various sequencing strategies. PPA-assembler adopts the popular de Bruijn graph-based approach for sequencing, and each operation is implemented as a program in Google's Pregel framework, which can be easily deployed in a generic cluster. Experiments on large real and simulated datasets demonstrate that PPA-assembler is much more efficient than state-of-the-art tools while providing comparable sequencing quality. PPA-assembler has been open-sourced at https://github.com/yaobaiwei/PPA-Assembler.
10
Tsuchiya MTN, Dikow RB, Koepfli KP, Frandsen PB, Rockwood LL, Maldonado JE. Whole-Genome Sequencing of Procyonids Reveals Distinct Demographic Histories in Kinkajou (Potos flavus) and Northern Raccoon (Procyon lotor). Genome Biol Evol 2020; 13:6040737. [PMID: 33331895 PMCID: PMC7851585 DOI: 10.1093/gbe/evaa255] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 11/30/2020] [Indexed: 01/20/2023] [Open access]
Abstract
Here, we present the initial comparison of the nuclear genomes of the North American raccoon (Procyon lotor) and the kinkajou (Potos flavus) based on draft assemblies. These two species encompass almost 21 Myr of evolutionary history within Procyonidae. Because assemblies greatly impact downstream results, such as gene prediction and annotation, we tested three de novo assembly strategies (implemented in ALLPATHS-LG, MaSuRCA, and Platanus), some of which are optimized for highly heterozygous genomes. We discovered significant variation in contig and scaffold N50 and L50 statistics and genome completeness depending on the de novo assembler used. We compared the performance of these three assembly algorithms in hopes that this study will aid others looking to improve the quality of existing draft genome assemblies even without additional sequence data. We also estimate the demographic histories of raccoons and kinkajous using the Pairwise Sequentially Markovian Coalescent and discuss the variation in population sizes with respect to climatic change during the Pleistocene, as well as aspects of their ecology and taxonomy. Our goal is to achieve a better understanding of the evolutionary history of procyonids and to create robust genomic resources for future studies regarding adaptive divergence and selection.
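The N50 and L50 statistics mentioned above follow a standard definition: sort the contig (or scaffold) lengths from longest to shortest and accumulate them until at least half of the total assembly length is covered; N50 is the length of the contig at which that happens, and L50 is the number of contigs needed to get there. A minimal sketch, with a hypothetical function name and made-up contig lengths rather than values from the study:

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # `length` is the N50; `count` contigs were needed, so L50 = count.
            return length, count


# Hypothetical contig lengths (total 300, so the halfway point is 150).
print(n50_l50([80, 70, 50, 40, 30, 20, 10]))
# (70, 2)
```

A higher N50 together with a lower L50 indicates a more contiguous assembly, which is why the two are usually reported side by side when assemblers are compared.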
Affiliation(s)
- Mirian T N Tsuchiya
- Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
- Center for Conservation Genomics, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC, USA
| | - Rebecca B Dikow
- Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
| | - Klaus-Peter Koepfli
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA, USA
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
| | - Paul B Frandsen
- Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
- Department of Plant & Wildlife Sciences, Brigham Young University, Provo, UT, USA
| | - Larry L Rockwood
- Department of Biology, George Mason University, Fairfax, VA, USA
| | - Jesús E Maldonado
- Center for Conservation Genomics, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC, USA
- Department of Biology, George Mason University, Fairfax, VA, USA
|
11
|
Kang SH, Pandey RP, Lee CM, Sim JS, Jeong JT, Choi BS, Jung M, Ginzburg D, Zhao K, Won SY, Oh TJ, Yu Y, Kim NH, Lee OR, Lee TH, Bashyal P, Kim TS, Lee WH, Hawkins C, Kim CK, Kim JS, Ahn BO, Rhee SY, Sohng JK. Genome-enabled discovery of anthraquinone biosynthesis in Senna tora. Nat Commun 2020; 11:5875. [PMID: 33208749 PMCID: PMC7674472 DOI: 10.1038/s41467-020-19681-1] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 05/08/2020] [Accepted: 10/22/2020] [Indexed: 02/06/2023] Open
Abstract
Senna tora is a widely used medicinal plant. Its health benefits have been attributed to the large quantity of anthraquinones, but how they are made in plants remains a mystery. To identify the genes responsible for plant anthraquinone biosynthesis, we reveal the genome sequence of S. tora at the chromosome level with 526 Mb (96%) assembled into 13 chromosomes. Comparison among related plant species shows that a chalcone synthase-like (CHS-L) gene family has lineage-specifically and rapidly expanded in S. tora. Combining genomics, transcriptomics, metabolomics, and biochemistry, we identify a CHS-L gene contributing to the biosynthesis of anthraquinones. The S. tora reference genome will accelerate the discovery of biologically active anthraquinone biosynthesis pathways in medicinal plants.
Affiliation(s)
- Sang-Ho Kang
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea.
| | - Ramesh Prasad Pandey
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Chang-Muk Lee
- Metabolic Engineering Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Joon-Soo Sim
- Metabolic Engineering Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Jin-Tae Jeong
- Department of Herbal Crop Research, National Institute of Horticultural and Herbal Science, RDA, Eumseong, 55365, Republic of Korea
| | - Beom-Soon Choi
- Phyzen Genomics Institute, Seongnam, 13488, Republic of Korea
| | - Myunghee Jung
- Department of Forest Science, College of Agriculture and Life Science, Seoul National University, Seoul, 08826, Republic of Korea
| | - Daniel Ginzburg
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
| | - Kangmei Zhao
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
| | - So Youn Won
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Tae-Jin Oh
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Yeisoo Yu
- Phyzen Genomics Institute, Seongnam, 13488, Republic of Korea
- DNACARE Co. Ltd, Seoul, 06730, Republic of Korea
| | - Nam-Hoon Kim
- Phyzen Genomics Institute, Seongnam, 13488, Republic of Korea
| | - Ok Ran Lee
- Department of Applied Plant Science, College of Agriculture and Life Science, Chonnam National University, Gwangju, 61186, Republic of Korea
| | - Tae-Ho Lee
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Puspalata Bashyal
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Tae-Su Kim
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Woo-Haeng Lee
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Charles Hawkins
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
| | - Chang-Kug Kim
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Jung Sun Kim
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Byoung Ohg Ahn
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Seung Yon Rhee
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA.
| | - Jae Kyung Sohng
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea.
|
12
|
Kang SH, Pandey RP, Lee CM, Sim JS, Jeong JT, Choi BS, Jung M, Ginzburg D, Zhao K, Won SY, Oh TJ, Yu Y, Kim NH, Lee OR, Lee TH, Bashyal P, Kim TS, Lee WH, Hawkins C, Kim CK, Kim JS, Ahn BO, Rhee SY, Sohng JK. Genome-enabled discovery of anthraquinone biosynthesis in Senna tora. Nat Commun 2020. [PMID: 33208749 DOI: 10.1101/2020.04.27.063495] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 04/30/2023] Open
Abstract
Senna tora is a widely used medicinal plant. Its health benefits have been attributed to the large quantity of anthraquinones, but how they are made in plants remains a mystery. To identify the genes responsible for plant anthraquinone biosynthesis, we reveal the genome sequence of S. tora at the chromosome level with 526 Mb (96%) assembled into 13 chromosomes. Comparison among related plant species shows that a chalcone synthase-like (CHS-L) gene family has lineage-specifically and rapidly expanded in S. tora. Combining genomics, transcriptomics, metabolomics, and biochemistry, we identify a CHS-L gene contributing to the biosynthesis of anthraquinones. The S. tora reference genome will accelerate the discovery of biologically active anthraquinone biosynthesis pathways in medicinal plants.
Affiliation(s)
- Sang-Ho Kang
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea.
| | - Ramesh Prasad Pandey
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Chang-Muk Lee
- Metabolic Engineering Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Joon-Soo Sim
- Metabolic Engineering Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Jin-Tae Jeong
- Department of Herbal Crop Research, National Institute of Horticultural and Herbal Science, RDA, Eumseong, 55365, Republic of Korea
| | - Beom-Soon Choi
- Phyzen Genomics Institute, Seongnam, 13488, Republic of Korea
| | - Myunghee Jung
- Department of Forest Science, College of Agriculture and Life Science, Seoul National University, Seoul, 08826, Republic of Korea
| | - Daniel Ginzburg
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
| | - Kangmei Zhao
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
| | - So Youn Won
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Tae-Jin Oh
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Yeisoo Yu
- Phyzen Genomics Institute, Seongnam, 13488, Republic of Korea
- DNACARE Co. Ltd, Seoul, 06730, Republic of Korea
| | - Nam-Hoon Kim
- Phyzen Genomics Institute, Seongnam, 13488, Republic of Korea
| | - Ok Ran Lee
- Department of Applied Plant Science, College of Agriculture and Life Science, Chonnam National University, Gwangju, 61186, Republic of Korea
| | - Tae-Ho Lee
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Puspalata Bashyal
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Tae-Su Kim
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Woo-Haeng Lee
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea
| | - Charles Hawkins
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
| | - Chang-Kug Kim
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Jung Sun Kim
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Byoung Ohg Ahn
- Genomics Division, National Institute of Agricultural Sciences, RDA, Jeonju, 54874, Republic of Korea
| | - Seung Yon Rhee
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA.
| | - Jae Kyung Sohng
- Department of Pharmaceutical Engineering and Biotechnology, Sun Moon University, Asan, 31460, Republic of Korea.
|
13
|
Segerman B. The Most Frequently Used Sequencing Technologies and Assembly Methods in Different Time Segments of the Bacterial Surveillance and RefSeq Genome Databases. Front Cell Infect Microbiol 2020; 10:527102. [PMID: 33194784 PMCID: PMC7604302 DOI: 10.3389/fcimb.2020.527102] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 01/15/2020] [Accepted: 09/08/2020] [Indexed: 01/05/2023] Open
Abstract
Whole genome sequencing has become a powerful tool in modern microbiology, and bacterial genomes in particular are sequenced in high numbers. Whole genome sequencing is used not only in research projects but also in surveillance projects and outbreak investigations. Many whole genome analysis workflows begin with the production of a genome assembly. To accomplish this, a number of different sequencing technologies and assembly methods are available. Here, a summary is provided of the most frequently used sequencing technologies and genome assembly approaches reported for the bacterial RefSeq genomes and for the bacterial genomes submitted as belonging to a surveillance project. The data are presented both in total and broken down on a per-year basis. Information associated with over 400,000 publicly available genomes dated April 2020 and prior was used. The information summarized includes (i) the most frequently used sequencing technologies, (ii) the most common combinations of sequencing technologies, (iii) the most reported sequencing depth, and (iv) the most frequently used assembly software solutions. In all, this mini review provides an overview of the currently most common workflows for producing bacterial whole genome sequence assemblies.
Affiliation(s)
- Bo Segerman
- Department of Microbiology, National Veterinary Institute (SVA), Uppsala, Sweden
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
|
14
|
Peña A, Busquets A, Gomila M, Mulet M, Gomila RM, Garcia-Valdes E, Reddy TBK, Huntemann M, Varghese N, Ivanova N, Chen IM, Göker M, Woyke T, Klenk HP, Kyrpides N, Lalucat J. High-quality draft genome sequences of Pseudomonas monteilii DSM 14164T, Pseudomonas mosselii DSM 17497T, Pseudomonas plecoglossicida DSM 15088T, Pseudomonas taiwanensis DSM 21245T and Pseudomonas vranovensis DSM 16006T: taxonomic considerations. Access Microbiol 2020; 1:e000067. [PMID: 32974501 PMCID: PMC7491935 DOI: 10.1099/acmi.0.000067] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/11/2019] [Accepted: 09/20/2019] [Indexed: 11/21/2022] Open
Abstract
Pseudomonas is the genus of Gram-negative bacteria with the highest number of recognized species. It is divided phylogenetically into three lineages and at least 11 groups of species. The Pseudomonas putida group of species is one of the most versatile and best studied. It comprises 15 species with validly published names. As a part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project, we present the genome sequences of the type strains of five species included in this group: Pseudomonas monteilii (DSM 14164T), Pseudomonas mosselii (DSM 17497T), Pseudomonas plecoglossicida (DSM 15088T), Pseudomonas taiwanensis (DSM 21245T) and Pseudomonas vranovensis (DSM 16006T). These strains represent species of environmental and also of clinical interest due to their pathogenic properties against humans and animals. Some strains of these species promote plant growth or act as plant pathogens. Their genome sizes are among the largest in the group, ranging from 5.3 to 6.3 Mbp. In addition, the genome sequences of the type strains in the Pseudomonas taxonomy were analysed via genome-wide taxonomic comparisons of ANIb, gANI and GGDC values among 130 Pseudomonas strains classified within the group. The results demonstrate that at least 36 genomic species can be delineated within the P. putida phylogenetic group of species.
Affiliation(s)
- Arantxa Peña
- Department of Biology-Microbiology, Universitat de les Illes Balears, Palma de Mallorca, Spain
| | - Antonio Busquets
- Department of Biology-Microbiology, Universitat de les Illes Balears, Palma de Mallorca, Spain
| | - Margarita Gomila
- Department of Biology-Microbiology, Universitat de les Illes Balears, Palma de Mallorca, Spain
| | - Magdalena Mulet
- Department of Biology-Microbiology, Universitat de les Illes Balears, Palma de Mallorca, Spain
| | - Rosa M Gomila
- Serveis Cientifico-Tècnics, Universitat de les Illes Balears, Palma de Mallorca, Spain
| | - Elena Garcia-Valdes
- Department of Biology-Microbiology, Universitat de les Illes Balears, Palma de Mallorca, Spain
- Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Palma de Mallorca, Spain
| | - T B K Reddy
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - Marcel Huntemann
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - Neha Varghese
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - Natalia Ivanova
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - I-Min Chen
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - Markus Göker
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, 38124 Braunschweig, Germany
| | - Tanja Woyke
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - Hans-Peter Klenk
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
| | - Nikos Kyrpides
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698, USA
| | - Jorge Lalucat
- Department of Biology-Microbiology, Universitat de les Illes Balears, Palma de Mallorca, Spain
- Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Palma de Mallorca, Spain
|
15
|
Holley G, Melsted P. Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs. Genome Biol 2020; 21:249. [PMID: 32943081 PMCID: PMC7499882 DOI: 10.1186/s13059-020-02135-8] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 08/13/2019] [Accepted: 08/06/2020] [Indexed: 02/07/2023] Open
Abstract
Memory consumption of de Bruijn graphs is often prohibitive. Most de Bruijn graph-based assemblers reduce the complexity by compacting paths into single vertices, but this is challenging as it requires the uncompacted de Bruijn graph to be available in memory. We present a parallel and memory-efficient algorithm enabling the direct construction of the compacted de Bruijn graph without producing the intermediate uncompacted graph. Bifrost features a broad range of functions, such as indexing, editing, and querying the graph, and includes a graph coloring method that maps each k-mer of the graph to the genomes it occurs in. Availability: https://github.com/pmelsted/bifrost.
Affiliation(s)
- Guillaume Holley
- Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavík, Iceland.
| | - Páll Melsted
- Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavík, Iceland
|
16
|
Draft Genome Assembly of Rhodobacter sphaeroides 2.4.1 Substrain H2 from Nanopore Data. Microbiol Resour Announc 2020; 9:9/29/e00414-20. [PMID: 32675180 PMCID: PMC7365791 DOI: 10.1128/mra.00414-20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Rhodobacter sphaeroides is a purple bacterium with complex genomic architecture. Here, a draft genome is reported for R. sphaeroides strain 2.4.1 substrain H2, which was generated exclusively from Nanopore sequencing data.
|
17
|
Morris KM, Hindle MM, Boitard S, Burt DW, Danner AF, Eory L, Forrest HL, Gourichon D, Gros J, Hillier LW, Jaffredo T, Khoury H, Lansford R, Leterrier C, Loudon A, Mason AS, Meddle SL, Minvielle F, Minx P, Pitel F, Seiler JP, Shimmura T, Tomlinson C, Vignal A, Webster RG, Yoshimura T, Warren WC, Smith J. The quail genome: insights into social behaviour, seasonal biology and infectious disease response. BMC Biol 2020; 18:14. [PMID: 32050986 PMCID: PMC7017630 DOI: 10.1186/s12915-020-0743-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/07/2019] [Accepted: 01/24/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: The Japanese quail (Coturnix japonica) is a popular domestic poultry species and an increasingly significant model species in avian developmental, behavioural and disease research. RESULTS: We have produced a high-quality quail genome sequence, spanning 0.93 Gb assigned to 33 chromosomes. In terms of contiguity, assembly statistics, gene content and chromosomal organisation, the quail genome shows high similarity to the chicken genome. We demonstrate the utility of this genome through three diverse applications. First, we identify selection signatures and candidate genes associated with social behaviour in the quail genome, an important agricultural and domestication trait. Second, we investigate the effects and interaction of photoperiod and temperature on the transcriptome of the quail medial basal hypothalamus, revealing key mechanisms of photoperiodism. Finally, we investigate the response of quail to H5N1 influenza infection. In quail lung, many critical immune genes and pathways were downregulated after H5N1 infection, and this may be key to the susceptibility of quail to H5N1. CONCLUSIONS: We have produced a high-quality genome of the quail which will facilitate further studies into diverse research questions using the quail as a model avian species.
Affiliation(s)
- Katrina M Morris
- The Roslin Institute and R(D)SVS, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK.
| | - Matthew M Hindle
- The Roslin Institute and R(D)SVS, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Simon Boitard
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France
| | - David W Burt
- The John Hay Building, Queensland Biosciences Precinct, 306 Carmody Road, The University of Queensland, QLD, St Lucia, 4072, Australia
| | - Angela F Danner
- Virology Division, Department of Infectious Diseases, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA
| | - Lel Eory
- The Roslin Institute and R(D)SVS, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Heather L Forrest
- Virology Division, Department of Infectious Diseases, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA
| | - David Gourichon
- PEAT Pôle d'Expérimentation Avicole de Tours, Centre de recherche Val de Loire, INRAE, 1295, Nouzilly, UE, France
| | - Jerome Gros
- Department of Developmental and Stem Cell Biology, Institut Pasteur, 25 rue du Docteur Roux, 75724, Cedex 15, Paris, France
- CNRS URA3738, 25 rue du Dr Roux, 75015, Paris, France
| | - LaDeana W Hillier
- McDonnell Genome Institute, Washington University School of Medicine, 4444 Forest Park Blvd, St Louis, MO, 63108, USA
| | - Thierry Jaffredo
- CNRS UMR7622, Inserm U 1156, Laboratoire de Biologie du Développement, Sorbonne Université, IBPS, 75005, Paris, France
| | - Hanane Khoury
- CNRS UMR7622, Inserm U 1156, Laboratoire de Biologie du Développement, Sorbonne Université, IBPS, 75005, Paris, France
| | - Rusty Lansford
- Department of Radiology and Developmental Neuroscience Program, Saban Research Institute, Children's Hospital Los Angeles and Keck School of Medicine of the University of Southern California, Los Angeles, CA, 90027, USA
| | - Christine Leterrier
- UMR85 Physiologie de la Reproduction et des Comportements, INRAE, CNRS, Université François Rabelais, IFCE, INRAE, Val de Loire, 37380, Nouzilly, Centre, France
| | - Andrew Loudon
- Centre for Biological Timing, Faculty of Biology, Medicine and Health, School of Medical Sciences, University of Manchester, 3.001, A.V. Hill Building, Oxford Road, Manchester, M13 9PT, UK
| | - Andrew S Mason
- The Roslin Institute and R(D)SVS, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Simone L Meddle
- The Roslin Institute and R(D)SVS, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Francis Minvielle
- GABI, INRAE, AgroParisTech, Université Paris-Saclay, 78350, Jouy-en-Josas, France
| | - Patrick Minx
- McDonnell Genome Institute, Washington University School of Medicine, 4444 Forest Park Blvd, St Louis, MO, 63108, USA
| | - Frédérique Pitel
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France
| | - J Patrick Seiler
- Virology Division, Department of Infectious Diseases, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA
| | - Tsuyoshi Shimmura
- Department of Biological Production, Tokyo University of Agriculture and Technology, 3-8-1 Harumi-cho, Fuchu, Tokyo, 183-8538, Japan
| | - Chad Tomlinson
- McDonnell Genome Institute, Washington University School of Medicine, 4444 Forest Park Blvd, St Louis, MO, 63108, USA
| | - Alain Vignal
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France
| | - Robert G Webster
- Virology Division, Department of Infectious Diseases, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA
| | - Takashi Yoshimura
- Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan
| | - Wesley C Warren
- Department of Animal Sciences, Department of Surgery, Institute for Data Science and Informatics, University of Missouri, Bond Life Sciences Center, 1201 Rollins Street, Columbia, MO, 65211, USA
| | - Jacqueline Smith
- The Roslin Institute and R(D)SVS, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
|
18
|
Thomas GWC, Dohmen E, Hughes DST, Murali SC, Poelchau M, Glastad K, Anstead CA, Ayoub NA, Batterham P, Bellair M, Binford GJ, Chao H, Chen YH, Childers C, Dinh H, Doddapaneni HV, Duan JJ, Dugan S, Esposito LA, Friedrich M, Garb J, Gasser RB, Goodisman MAD, Gundersen-Rindal DE, Han Y, Handler AM, Hatakeyama M, Hering L, Hunter WB, Ioannidis P, Jayaseelan JC, Kalra D, Khila A, Korhonen PK, Lee CE, Lee SL, Li Y, Lindsey ARI, Mayer G, McGregor AP, McKenna DD, Misof B, Munidasa M, Munoz-Torres M, Muzny DM, Niehuis O, Osuji-Lacy N, Palli SR, Panfilio KA, Pechmann M, Perry T, Peters RS, Poynton HC, Prpic NM, Qu J, Rotenberg D, Schal C, Schoville SD, Scully ED, Skinner E, Sloan DB, Stouthamer R, Strand MR, Szucsich NU, Wijeratne A, Young ND, Zattara EE, Benoit JB, Zdobnov EM, Pfrender ME, Hackett KJ, Werren JH, Worley KC, Gibbs RA, Chipman AD, Waterhouse RM, Bornberg-Bauer E, Hahn MW, Richards S. Gene content evolution in the arthropods. Genome Biol 2020; 21:15. [PMID: 31969194 PMCID: PMC6977273 DOI: 10.1186/s13059-019-1925-7] [Citation(s) in RCA: 106] [Impact Index Per Article: 26.5] [Received: 11/06/2019] [Accepted: 12/26/2019] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods. RESULTS: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality, and chemoperception. CONCLUSIONS: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.
Affiliation(s)
- Gregg W. C. Thomas
- Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN USA (ISNI 0000 0001 0790 959X; grid.411377.7)
| | - Elias Dohmen
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Institute for Bioinformatics and Chemoinformatics, University of Hamburg, Hamburg, Germany (ISNI 0000 0001 2287 2617; grid.9026.d)
- Westphalian University of Applied Sciences, 45665 Recklinghausen, Germany
| | - Daniel S. T. Hughes
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
- Present Address: Institute for Genomic Medicine, Columbia University, New York, NY 10032 USA (ISNI 0000 0004 1936 8729; grid.21729.3f)
| | - Shwetha C. Murali
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
- Present Address: Howard Hughes Medical Institute, Department of Genome Sciences, University of Washington, Seattle, WA 98195 USA (ISNI 0000 0001 2298 6657; grid.34477.33)
| | - Monica Poelchau
- National Agricultural Library, USDA, Beltsville, MD 20705 USA (ISNI 0000 0001 2113 2895; grid.483014.a)
| | - Karl Glastad
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332 USA (ISNI 0000 0001 2097 4943; grid.213917.f)
- Present Address: Penn Epigenetics Institute, Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104 USA (ISNI 0000 0004 1936 8972; grid.25879.31)
| | - Clare A. Anstead
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010 Australia (ISNI 0000 0001 2179 088X; grid.1008.9)
| | - Nadia A. Ayoub
- Department of Biology, Washington and Lee University, 204 West Washington Street, Lexington, VA 24450 USA (grid.268042.a)
| | - Phillip Batterham
- School of BioSciences Science Faculty, The University of Melbourne, Melbourne, VIC 3010 Australia (ISNI 0000 0001 2179 088X; grid.1008.9)
| | - Michelle Bellair
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
- Present Address: CooperGenomics, Houston, TX USA
| | - Greta J. Binford
- Department of Biology, Lewis & Clark College, Portland, OR 97219 USA (ISNI 0000 0004 1936 9043; grid.259053.8)
| | - Hsu Chao
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
| | - Yolanda H. Chen
- Department of Plant and Soil Sciences, University of Vermont, Burlington, USA (ISNI 0000 0004 1936 7689; grid.59062.38)
| | - Christopher Childers
- National Agricultural Library, USDA, Beltsville, MD 20705 USA (ISNI 0000 0001 2113 2895; grid.483014.a)
| | - Huyen Dinh
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
| | - Harsha Vardhan Doddapaneni
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
| | - Jian J. Duan
- Beneficial Insects Introduction Research Unit, United States Department of Agriculture, Agricultural Research Service, Newark, DE USA (ISNI 0000 0004 0404 0958; grid.463419.d)
| | - Shannon Dugan
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA (ISNI 0000 0001 2160 926X; grid.39382.33)
| | - Lauren A. Esposito
- Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Drive, San Francisco, CA 94118 USA (ISNI 0000 0004 0461 6769; grid.242287.9)
| | - Markus Friedrich
- Department of Biological Sciences, Wayne State University, Detroit, MI 48202 USA (ISNI 0000 0001 1456 7807; grid.254444.7)
| | - Jessica Garb
- Department of Biological Sciences, University of Massachusetts Lowell, 198 Riverside Street, Lowell, MA 01854 USA (ISNI 0000 0000 9620 1122; grid.225262.3)
| | - Robin B. Gasser
- 0000 0001 2179 088Xgrid.1008.9Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010 Australia
| | - Michael A. D. Goodisman
- 0000 0001 2097 4943grid.213917.fSchool of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332 USA
| | - Dawn E. Gundersen-Rindal
- 0000 0004 0404 0958grid.463419.dUSDA-ARS Invasive Insect Biocontrol and Behavior Laboratory, Beltsville, MD USA
| | - Yi Han
- 0000 0001 2160 926Xgrid.39382.33Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Alfred M. Handler
- 0000 0004 0404 0958grid.463419.dUSDA-ARS, Center for Medical, Agricultural, and Veterinary Entomology, 1700 S.W. 23rd Drive, Gainesville, FL 32608 USA
| | - Masatsugu Hatakeyama
- 0000 0001 0699 0373grid.410590.9Division of Insect Sciences, National Institute of Agrobiological Sciences, Owashi, Tsukuba, 305-8634 Japan
| | - Lars Hering
- 0000 0001 1089 1036grid.5155.4Department of Zoology, Institute of Biology, University of Kassel, 34132 Kassel, Germany
| | - Wayne B. Hunter
- 0000 0004 0404 0958grid.463419.dUSDA ARS, U. S. Horticultural Research Laboratory, Ft. Pierce, FL 34945 USA
| | - Panagiotis Ioannidis
- 0000 0001 2322 4988grid.8591.5Department of Genetic Medicine and Development and Swiss Institute of Bioinformatics, University of Geneva, 1211 Geneva, Switzerland ,0000 0004 0635 685Xgrid.4834.bPresent Address: Foundation for Research and Technology Hellas, Institute of Molecular Biology and Biotechnology, Vassilika Vouton, 70013 Heraklion, Greece
| | - Joy C. Jayaseelan
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Divya Kalra
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Abderrahman Khila
- Université de Lyon, Institut de Génomique Fonctionnelle de Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, 46 allée d’Italie, 69364 Lyon, France
| | - Pasi K. Korhonen
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010 Australia
| | - Carol Eunmi Lee
- Department of Integrative Biology, University of Wisconsin, Madison, WI 53706 USA
| | - Sandra L. Lee
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Yiyuan Li
- Department of Biological Sciences, University of Notre Dame, 109B Galvin Life Sciences, Notre Dame, IN 46556 USA
| | - Amelia R. I. Lindsey
- Department of Entomology, University of California Riverside, Riverside, CA USA; Present Address: Department of Biology, Indiana University, Bloomington, IN USA
| | - Georg Mayer
- Department of Zoology, Institute of Biology, University of Kassel, 34132 Kassel, Germany
| | - Alistair P. McGregor
- Department of Biological and Medical Sciences, Oxford Brookes University, Gipsy Lane, Oxford, OX3 0BP UK
| | - Duane D. McKenna
- Department of Biological Sciences, University of Memphis, 3700 Walker Ave, Memphis, TN 38152 USA
| | - Bernhard Misof
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Bonn, Germany
| | - Mala Munidasa
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Monica Munoz-Torres
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, USA; Present Address: Phoenix Bioinformatics, 39221 Paseo Padre Parkway, Ste. J., Fremont, CA 94538 USA
| | - Donna M. Muzny
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Oliver Niehuis
- Evolutionary Biology and Ecology, Institute of Biology I (Zoology), Albert Ludwig University of Freiburg, 79104 Freiburg (Brsg.), Germany
| | - Nkechinyere Osuji-Lacy
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Subba R. Palli
- Department of Entomology, University of Kentucky, Lexington, KY 40546 USA
| | - Kristen A. Panfilio
- School of Life Sciences, University of Warwick, Gibbet Hill Campus, Coventry, CV4 7AL UK
| | - Matthias Pechmann
- Cologne Biocenter, Zoological Institute, Department of Developmental Biology, University of Cologne, 50674 Cologne, Germany
| | - Trent Perry
- School of BioSciences Science Faculty, The University of Melbourne, Melbourne, VIC 3010 Australia
| | - Ralph S. Peters
- Centre of Taxonomy and Evolutionary Research, Arthropoda Department, Zoological Research Museum Alexander Koenig, Bonn, Germany
| | - Helen C. Poynton
- School for the Environment, University of Massachusetts Boston, Boston, MA 02125 USA
| | - Nikola-Michael Prpic
- Johann-Friedrich-Blumenbach-Institut für Zoologie und Anthropologie, Abteilung für Entwicklungsbiologie, Georg-August-Universität Göttingen, Göttingen, Germany; Göttingen Center for Molecular Biosciences (GZMB), Georg-August-Universität Göttingen, Göttingen, Germany
| | - Jiaxin Qu
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Dorith Rotenberg
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27606 USA
| | - Coby Schal
- Department of Entomology and W.M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC 27695 USA
| | - Sean D. Schoville
- Department of Entomology, University of Wisconsin-Madison, Madison, USA
| | - Erin D. Scully
- Stored Product Insect and Engineering Research Unit, USDA-ARS Center for Grain and Animal Health Research, Manhattan, KS 66502 USA
| | - Evette Skinner
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Daniel B. Sloan
- Department of Biology, Colorado State University, Ft. Collins, CO USA
| | - Richard Stouthamer
- Department of Entomology, University of California Riverside, Riverside, CA USA
| | - Michael R. Strand
- Department of Entomology, University of Georgia, Athens, GA USA
| | - Nikolaus U. Szucsich
- Natural History Museum Vienna, Burgring 7, 1010 Vienna, Austria
| | - Asela Wijeratne
- Department of Biological Sciences, University of Memphis, 3700 Walker Ave, Memphis, TN 38152 USA; Present Address: Arkansas Biosciences Institute, Arkansas State University, Jonesboro, AR USA
| | - Neil D. Young
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010 Australia
| | - Eduardo E. Zattara
- INIBIOMA, Univ. Nacional del Comahue – CONICET, Bariloche, Argentina
| | - Joshua B. Benoit
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH 45221 USA
| | - Evgeny M. Zdobnov
- Department of Genetic Medicine and Development and Swiss Institute of Bioinformatics, University of Geneva, 1211 Geneva, Switzerland
| | - Michael E. Pfrender
- Department of Biological Sciences, University of Notre Dame, 109B Galvin Life Sciences, Notre Dame, IN 46556 USA
| | - Kevin J. Hackett
- Crop Production and Protection, U.S. Department of Agriculture-Agricultural Research Service, Beltsville, MD 20705 USA
| | - John H. Werren
- Department of Biology, University of Rochester, Rochester, NY 14627 USA
| | - Kim C. Worley
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA
| | - Ariel D. Chipman
- Department of Ecology, Evolution and Behavior, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat Ram, 91904 Jerusalem, Israel
| | - Robert M. Waterhouse
- Department of Ecology & Evolution and Swiss Institute of Bioinformatics, University of Lausanne, 1015 Lausanne, Switzerland
| | - Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany; Institute for Bioinformatics and Chemoinformatics, University of Hamburg, Hamburg, Germany; Department Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Matthew W. Hahn
- Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN USA
| | - Stephen Richards
- Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 USA; Present Address: UC Davis Genome Center, University of California, Davis, CA 95616 USA
|
19
|
Dai Z, Li T, Li J, Han Z, Pan Y, Tang S, Diao X, Luo M. High-throughput long paired-end sequencing of a Fosmid library by PacBio. Plant Methods 2019; 15:142. [PMID: 31788019 PMCID: PMC6878638 DOI: 10.1186/s13007-019-0525-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/22/2019] [Accepted: 11/12/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND: Large insert paired-end sequencing technologies are important tools for assembling genomes, delineating associated breakpoints and detecting structural rearrangements. To facilitate the comprehensive detection of inter- and intra-chromosomal structural rearrangements or variants (SVs) and the assembly of complex genomes with long repeats and segmental duplications, we developed a new method based on single-molecule real-time synthesis sequencing technology for generating long paired-end sequences of large insert DNA libraries.
RESULTS: A Fosmid vector, pHZAUFOS3, was developed with the following new features: (1) two 18-bp non-palindromic I-SceI sites flank the cloning site, and another two sites are present in the skeleton of the vector, allowing long DNA inserts (and the long paired-ends in this paper) to be recovered as single fragments while I-SceI digestion cuts the vector (~8 kb) into 2-3 kb fragments, so that it is effectively removed from the long paired-ends (5-10 kb); (2) the chloramphenicol (Cm) resistance gene and replicon (oriV), necessary for colony growth, are located near the two sides of the cloning site, helping to increase the proportion of paired-end fragments relative to single-end fragments in the paired-end libraries. Paired-end libraries were constructed by ligating the size-selected, mechanically sheared pooled Fosmid DNA fragments to the Ampicillin (Amp) resistance gene fragment and screening the colonies with Cm and Amp. We tested this method on yeast and Setaria italica Yugu1. Fosmid-size paired-ends with an average length longer than 2 kb for each end were generated. The N50 scaffold lengths of the de novo assemblies of the yeast and S. italica Yugu1 genomes were significantly improved. Five large and five small structural rearrangements or assembly errors spanning tens of bp to tens of kb were identified in S. italica Yugu1, including deletions, inversions, duplications and translocations.
CONCLUSIONS: We developed a new method for long paired-end sequencing of large insert libraries, which can efficiently improve the quality of de novo genome assembly and identify large and small structural rearrangements or assembly errors.
Affiliation(s)
- Zhaozhao Dai
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
| | - Tong Li
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
| | - Jiadong Li
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
| | - Zhifei Han
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
| | - Yonglong Pan
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
| | - Sha Tang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 10081 China
| | - Xianmin Diao
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 10081 China
| | - Meizhong Luo
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
|
20
|
Wang A, Wang Z, Li Z, Li LM. BAUM: improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach. Bioinformatics 2019; 34:2019-2028. [PMID: 29346504 DOI: 10.1093/bioinformatics/bty020] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/29/2017] [Accepted: 01/12/2018] [Indexed: 11/13/2022] [Open Access]
Abstract
Motivation: It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using second generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though single-molecule real-time sequencing technology promises to overcome the uncertainty issue, its relatively high cost and error rate add a burden on budget or computation. Many long-read assemblers adopt the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach.
Results: Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; local OLC is then used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a close species or (ii) improve the results of existing assemblies that are obtained from short or long sequencing reads. Tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement in genome size and continuity. In addition, BAUM reconstructed a considerable number of repetitive regions that existing short read assemblers failed to assemble. We also propose statistical approaches to control the uncertainty in different steps of BAUM.
Availability and implementation: http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anqi Wang
- National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China
| | - Zhanyu Wang
- National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China
| | - Zheng Li
- National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China
| | - Lei M Li
- National Center of Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China.,Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
|
21
|
Abstract
As in any endeavor, the strategy applied to a genome project can mean the difference between success and failure. This is especially important when limited funding often means only a single approach may be tried at a given time. Although the advance of all areas of genomics and transcriptomics in recent years has led to an embarrassment of riches, methods in the field have not quite reached the turn-key production status for all species, despite being closer than ever. Here I contrast and compare the technical approaches to genome projects in the hope of enabling strategy choices with higher probabilities of success. Finally, I review the new technologies that are not yet widely distributed which are revolutionizing the future of genomics.
Affiliation(s)
- Stephen Richards
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
|
22
|
Vignal A, Boitard S, Thébault N, Dayo GK, Yapi-Gnaore V, Youssao Abdou Karim I, Berthouly-Salazar C, Pálinkás-Bodzsár N, Guémené D, Thibaud-Nissen F, Warren WC, Tixier-Boichard M, Rognon X. A guinea fowl genome assembly provides new evidence on evolution following domestication and selection in galliformes. Mol Ecol Resour 2019; 19:997-1014. [PMID: 30945415 PMCID: PMC6579635 DOI: 10.1111/1755-0998.13017] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/18/2018] [Revised: 03/19/2019] [Accepted: 03/25/2019] [Indexed: 01/25/2023]
Abstract
The helmeted guinea fowl Numida meleagris belongs to the order Galliformes. Its natural range includes a large part of sub‐Saharan Africa, from Senegal to Eritrea and from Chad to South Africa. Archaeozoological and artistic evidence suggest domestication of this species may have occurred about 2,000 years BP in Mali and Sudan, primarily as a food resource. Villagers also benefit from its capacity to give loud alarm calls in case of danger and from its ability to consume parasites such as ticks and to hunt snakes, which suggests that its domestication may have resulted from a commensal association process. Today, it is still farmed in Africa, mainly as a traditional village poultry, and is also bred more intensively in other countries, mainly France and Italy. The lack of available molecular genetic markers has limited the genetic studies conducted to date on guinea fowl. We present here a first‐generation whole‐genome sequence draft assembly used as a reference for a study by a Pool‐seq approach of wild and domestic populations from Europe and Africa. We show that the domestic populations share a higher genetic similarity with each other than they do with wild populations living in the same geographical area. Several genomic regions showing selection signatures putatively related to domestication or importation to Europe were detected, containing candidate genes, most notably EDNRB2, possibly explaining losses in plumage coloration phenotypes in domesticated populations.
Affiliation(s)
- Alain Vignal
- GenPhySE, INRA, INPT, INP-ENVT, Université de Toulouse, Castanet Tolosan, France
| | - Simon Boitard
- GenPhySE, INRA, INPT, INP-ENVT, Université de Toulouse, Castanet Tolosan, France
| | - Noémie Thébault
- GenPhySE, INRA, INPT, INP-ENVT, Université de Toulouse, Castanet Tolosan, France
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
| | - Wesley C Warren
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri.,Bond Life Sciences Center, University of Missouri, Columbia, Missouri
| | - Xavier Rognon
- GABI, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
|
23
|
Complete Genome Sequence for Asinibacterium sp. Strain OR53 and Draft Genome Sequence for Asinibacterium sp. Strain OR43, Two Bacteria Tolerant to Uranium. Microbiol Resour Announc 2019; 8(14):e01701-18. [PMID: 30948472 PMCID: PMC6449563 DOI: 10.1128/mra.01701-18] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022] [Open Access]
Abstract
Asinibacterium sp. strains OR43 and OR53 belong to the phylum Bacteroidetes and were isolated from subsurface sediments in Oak Ridge, TN. Both strains grow at elevated levels of heavy metals. Here, we present the closed genome sequence of Asinibacterium sp. strain OR53 and the draft genome sequence of Asinibacterium sp. strain OR43.
|
24
|
Spencer MD, Winglee K, Passaretti C, Earl AM, Manson AL, Mulder HP, Sautter RL, Fodor AA. Whole Genome Sequencing detects Inter-Facility Transmission of Carbapenem-resistant Klebsiella pneumoniae. J Infect 2018; 78:187-199. [PMID: 30503842 PMCID: PMC6408229 DOI: 10.1016/j.jinf.2018.11.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 09/07/2018] [Revised: 11/03/2018] [Accepted: 11/19/2018] [Indexed: 12/22/2022]
Abstract
OBJECTIVES: To identify transmission patterns of carbapenem-resistant Klebsiella pneumoniae infection during an outbreak at a large, tertiary care hospital and to detect whether the outbreak organisms spread to other facilities in the integrated healthcare network.
METHODS: We analyzed 71 K. pneumoniae whole genome sequences collected from clinical specimens before, during and after the outbreak and reviewed corresponding patient medical records. Sequence and patient data were used to model probable transmissions and assess factors associated with the outbreak.
RESULTS: We identified close genetic relationships among carbapenem-resistant K. pneumoniae isolates sampled during the study period. Transmission tree analysis combined with patient records uncovered extended periods of silent colonization in many study patients and transmission routes that were likely the result of asymptomatic patients transitioning between facilities.
CONCLUSIONS: Detecting how and where carbapenem-resistant K. pneumoniae infections spread is challenging in an environment of rising prevalence, asymptomatic carriage and mobility of patients. Whole genome sequencing improved the precision of investigating inter-facility transmissions. Our results emphasize that containment of carbapenem-resistant K. pneumoniae infections requires coordinated efforts between healthcare networks and settings of care that acknowledge and mitigate transmission risk conferred by undetected carriage and by patient transfers between facilities.
Affiliation(s)
- Melanie D Spencer
- Center for Outcomes Research and Evaluation, Atrium Health, Research Office Building, 1540 Garden Terrace, Charlotte, NC 28203, USA.
| | - Kathryn Winglee
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9331 Robert D. Snyder Road, Charlotte NC 28223, USA.
| | - Catherine Passaretti
- Departments of Internal Medicine and Infectious Disease, Atrium Health, 1616 Scott Avenue, Charlotte, NC 28203, USA.
| | - Ashlee M Earl
- Broad Institute of Harvard and Massachusetts Institute of Technology, 415 Main Street, Cambridge, MA 02142, USA.
| | - Abigail L Manson
- Broad Institute of Harvard and Massachusetts Institute of Technology, 415 Main Street, Cambridge, MA 02142, USA.
| | - Holly P Mulder
- Center for Outcomes Research and Evaluation, Atrium Health, Research Office Building, 1540 Garden Terrace, Charlotte, NC 28203, USA.
| | - Robert L Sautter
- Carolinas Pathology Group, P.O. Box 30637, Charlotte, NC 28230, USA.
| | - Anthony A Fodor
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9331 Robert D. Snyder Road, Charlotte NC 28223, USA.
|
25
|
Wu B, Li M, Liao X, Luo J, Wu F, Pan Y, Wang J. MEC: Misassembly Error Correction in contigs based on distribution of paired-end reads and statistics of GC-contents. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 17:847-857. [PMID: 30334805 DOI: 10.1109/tcbb.2018.2876855] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 06/08/2023]
Abstract
De novo assembly tools aim to reconstruct genomes from next-generation sequencing (NGS) data. However, these tools usually generate a large number of contigs containing many misassemblies, which are caused by repetitive regions, chimeric reads and sequencing errors. Detecting and correcting misassemblies in contigs can improve the accuracy of assembly results, which makes the task appealing yet challenging. In this study, a novel method, called MEC, is proposed to identify and correct misassemblies in contigs. Based on the insert size distribution of paired-end reads and the statistical analysis of GC content, MEC can accurately identify more misassemblies. We evaluate MEC with the NA50 and NGA50 metrics on four datasets, compare it with most of the available misassembly correction tools, and carry out experiments to analyze the influence of MEC on scaffolding results. The results show that MEC can reduce misassemblies effectively and yield quantitative improvements in scaffolding quality. MEC is publicly available at https://github.com/bioinfomaticsCSU/MEC.
|
26
|
Souvorov A, Agarwala R, Lipman DJ. SKESA: strategic k-mer extension for scrupulous assemblies. Genome Biol 2018; 19:153. [PMID: 30286803 PMCID: PMC6172800 DOI: 10.1186/s13059-018-1540-z] [Citation(s) in RCA: 331] [Impact Index Per Article: 55.2] [Received: 05/08/2018] [Accepted: 09/12/2018] [Indexed: 01/20/2023] [Open Access]
Abstract
SKESA is a de Bruijn graph-based de novo assembler designed for assembling reads of microbial genomes sequenced using Illumina. Comparison with SPAdes and MegaHit shows that SKESA produces assemblies that have high sequence quality and contiguity, handles low-level contamination in reads, is fast, and produces an identical assembly for the same input when assembled multiple times with the same or different compute resources. SKESA has been used for assembling over 272,000 read sets in the Sequence Read Archive at NCBI and for real-time pathogen detection. Source code for SKESA is freely available at https://github.com/ncbi/SKESA/releases.
Affiliation(s)
| | - Richa Agarwala
- NCBI/NLM/NIH/DHHS, 8600 Rockville Pike, Bethesda, 20894 MD USA
| | - David J. Lipman
- NCBI/NLM/NIH/DHHS, 8600 Rockville Pike, Bethesda, 20894 MD USA
- Impossible Foods, impossiblefoods.com, Redwood City, 94063 CA USA
|
27
|
Prabh N, Roeseler W, Witte H, Eberhardt G, Sommer RJ, Rödelsperger C. Deep taxon sampling reveals the evolutionary dynamics of novel gene families in Pristionchus nematodes. Genome Res 2018; 28:1664-1674. [PMID: 30232197 PMCID: PMC6211646 DOI: 10.1101/gr.234971.118] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 01/23/2018] [Accepted: 09/05/2018] [Indexed: 01/20/2023]
Abstract
The widespread identification of genes without detectable homology in related taxa is a hallmark of genome sequencing projects in animals, together with the abundance of gene duplications. Such genes have been called novel, young, taxon-restricted, or orphans, but little is known about the mechanisms accounting for their origin, age, and mode of evolution. Phylogenomic studies relying on deep and systematic taxon sampling and using the comparative method can provide insight into the evolutionary dynamics acting on novel genes. We used a phylogenomic approach for the nematode model organism Pristionchus pacificus and sequenced six additional Pristionchus and two outgroup species. This resulted in 10 genomes with a ladder-like phylogeny, sequenced in one laboratory using the same platform and analyzed by the same bioinformatic procedures. Our analysis revealed that 68%-81% of genes are assignable to orthologous gene families, the majority of which defined nine age classes with presence/absence patterns that can be explained by single evolutionary events. Contrasting different age classes, we find that older age classes are concentrated at chromosome centers, whereas novel gene families preferentially arise at the periphery, are weakly expressed, evolve rapidly, and have a high propensity of being lost. Over time, they increase in expression and become more constrained. Thus, the detailed phylogenetic resolution allowed a comprehensive characterization of the evolutionary dynamics of Pristionchus genomes indicating that distribution of age classes and their associated differences shape chromosomal divergence. This study establishes the Pristionchus system for future research on the mechanisms that drive the formation of novel genes.
Affiliation(s)
- Neel Prabh
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
| | - Waltraud Roeseler
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
| | - Hanh Witte
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
| | - Gabi Eberhardt
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
| | - Ralf J Sommer
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
| | - Christian Rödelsperger
- Department of Integrative Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
| |
|
28
|
Zhao H, Wang S, Wang J, Chen C, Hao S, Chen L, Fei B, Han K, Li R, Shi C, Sun H, Wang S, Xu H, Yang K, Xu X, Shan X, Shi J, Feng A, Fan G, Liu X, Zhao S, Zhang C, Gao Q, Gao Z, Jiang Z. The chromosome-level genome assemblies of two rattans (Calamus simplicifolius and Daemonorops jenkinsiana). Gigascience 2018; 7:5067873. [PMID: 30101322 PMCID: PMC6117794 DOI: 10.1093/gigascience/giy097] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 05/08/2018] [Revised: 06/03/2018] [Accepted: 07/26/2018] [Indexed: 01/22/2023] Open
Abstract
Background Calamus simplicifolius and Daemonorops jenkinsiana are two representative rattans, the most significant material sources for the rattan industry. However, the lack of reference genome sequences is a major obstacle for basic and applied biology on rattan. Findings We produced two chromosome-level genome assemblies of C. simplicifolius and D. jenkinsiana using Illumina, Pacific Biosciences, and Hi-C sequencing data. A total of ∼730 Gb and ∼682 Gb of raw data covered the predicted genome lengths (∼1.98 Gb of C. simplicifolius and ∼1.61 Gb of D. jenkinsiana) to ∼372 × and ∼426 × read depths, respectively. The two de novo genome assemblies, ∼1.94 Gb and ∼1.58 Gb, were generated with scaffold N50s of ∼160 Mb and ∼119 Mb in C. simplicifolius and D. jenkinsiana, respectively. The C. simplicifolius and D. jenkinsiana genomes were predicted to harbor 51,235 and 53,342 intact protein-coding gene models, respectively. Benchmarking Universal Single-Copy Orthologs evaluation demonstrated that genome completeness reached 96.4% and 91.3% in the C. simplicifolius and D. jenkinsiana genomes, respectively. Genome evolution showed that four Arecaceae plants clustered together, and the divergence time between the two rattans was ∼19.3 million years ago. Additionally, we identified 193 and 172 genes involved in the lignin biosynthesis pathway in the C. simplicifolius and D. jenkinsiana genomes, respectively. Conclusions We present the first de novo assemblies of two rattan genomes (C. simplicifolius and D. jenkinsiana). These data will not only provide a fundamental resource for functional genomics, particularly in promoting germplasm utilization for breeding, but also serve as reference genomes for comparative studies between and among different species.
Affiliation(s)
- Hansheng Zhao
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Songbo Wang
- BGI Genomics, BGI-Shenzhen, Building No. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, No. 7, Pengfei Road, Dapeng District, Shenzhen 518120, China
| | - Jiongliang Wang
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Chunhai Chen
- BGI Genomics, BGI-Shenzhen, Building No. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
| | - Shijie Hao
- BGI-Qingdao, No. 2877, Tuanjie Road, Sino-German Ecopark, Qingdao, Shandong 266555, China
| | - Lianfu Chen
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Benhua Fei
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Kai Han
- BGI-Qingdao, No. 2877, Tuanjie Road, Sino-German Ecopark, Qingdao, Shandong 266555, China
| | - Rongsheng Li
- Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangshanyi Rd, Tianhe District, Guangzhou 510000, China
| | - Chengcheng Shi
- BGI-Qingdao, No. 2877, Tuanjie Road, Sino-German Ecopark, Qingdao, Shandong 266555, China
| | - Huayu Sun
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Sining Wang
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Hao Xu
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Kebin Yang
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Xiurong Xu
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Xuemeng Shan
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Jingjing Shi
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Aiqin Feng
- BGI Genomics, BGI-Shenzhen, Building No. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
| | - Guangyi Fan
- BGI-Qingdao, No. 2877, Tuanjie Road, Sino-German Ecopark, Qingdao, Shandong 266555, China
| | - Xin Liu
- BGI-Qingdao, No. 2877, Tuanjie Road, Sino-German Ecopark, Qingdao, Shandong 266555, China
- BGI-Fuyang, Floor 3, Jinshan Building, Qinghe East Road, Yingzhou District, Fuyang 236009, China
| | - Shancen Zhao
- BGI Genomics, BGI-Shenzhen, Building No. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, No. 7, Pengfei Road, Dapeng District, Shenzhen 518120, China
| | - Chi Zhang
- BGI Genomics, BGI-Shenzhen, Building No. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, No. 7, Pengfei Road, Dapeng District, Shenzhen 518120, China
| | - Qiang Gao
- BGI Genomics, BGI-Shenzhen, Building No. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
| | - Zhimin Gao
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| | - Zehui Jiang
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, Institute of Gene Science for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Futongdong Rd, WangJing, Chaoyang District, Beijing 100102, China
| |
|
29
|
Yin D, Ji C, Ma X, Li H, Zhang W, Li S, Liu F, Zhao K, Li F, Li K, Ning L, He J, Wang Y, Zhao F, Xie Y, Zheng H, Zhang X, Zhang Y, Zhang J. Genome of an allotetraploid wild peanut Arachis monticola: a de novo assembly. Gigascience 2018; 7:5040258. [PMID: 29931126 PMCID: PMC6009596 DOI: 10.1093/gigascience/giy066] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 01/22/2018] [Revised: 03/13/2018] [Accepted: 05/24/2018] [Indexed: 12/16/2022] Open
Abstract
Arachis monticola (2n = 4x = 40) is the only allotetraploid wild peanut within the Arachis genus and section, with an AABB-type genome of ∼2.7 Gb in size. The AA-type subgenome is derived from diploid wild peanut Arachis duranensis, and the BB-type subgenome is derived from diploid wild peanut Arachis ipaensis. A. monticola is regarded either as the direct progenitor of the cultivated peanut or as an introgressive derivative between the cultivated peanut and wild species. The large polyploid genome structure and the enormous nearly identical regions of the genome make the assembly of chromosomal pseudomolecules very challenging. Here we report the first reference-quality assembly of the A. monticola genome, using a series of advanced technologies. The final whole genome of A. monticola is ∼2.62 Gb and has a contig N50 and scaffold N50 of 106.66 Kb and 124.92 Mb, respectively. The vast majority (91.83%) of the assembled sequence was anchored onto the 20 pseudo-chromosomes, and 96.07% of assemblies were accurately separated into AA- and BB-subgenomes. We demonstrated the efficiency of a current strategy for de novo assembly of a highly complex allotetraploid species, wild peanut (A. monticola), based on whole-genome shotgun sequencing, single-molecule real-time sequencing, high-throughput chromosome conformation capture technology, and BioNano optical genome maps. These combined technologies produced a reference-quality genome of the allotetraploid wild peanut, which is valuable for understanding peanut domestication and evolution within the Arachis genus and among legume crops.
Affiliation(s)
- Dongmei Yin
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Changmian Ji
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Xingli Ma
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Hang Li
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Wanke Zhang
- State Key Lab of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
| | - Song Li
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Fuyan Liu
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Kunkun Zhao
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Fapeng Li
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Ke Li
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Longlong Ning
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Jialin He
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Yuejun Wang
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Fei Zhao
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Yilin Xie
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Hongkun Zheng
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Xingguo Zhang
- College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
| | - Yijing Zhang
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Jinsong Zhang
- State Key Lab of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
| |
|
30
|
Kelly S, Ivens A, Mott GA, O'Neill E, Emms D, Macleod O, Voorheis P, Tyler K, Clark M, Matthews J, Matthews K, Carrington M. An Alternative Strategy for Trypanosome Survival in the Mammalian Bloodstream Revealed through Genome and Transcriptome Analysis of the Ubiquitous Bovine Parasite Trypanosoma (Megatrypanum) theileri. Genome Biol Evol 2018; 9:2093-2109. [PMID: 28903536 PMCID: PMC5737535 DOI: 10.1093/gbe/evx152] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Accepted: 08/12/2017] [Indexed: 12/19/2022] Open
Abstract
There are hundreds of Trypanosoma species that live in the blood and tissue spaces of their vertebrate hosts. The vast majority of these do not have the ornate system of antigenic variation that has evolved in the small number of African trypanosome species, but can still maintain long-term infections in the face of the vertebrate adaptive immune system. Trypanosoma theileri is a typical example, has a restricted host range of cattle and other Bovinae, and is only occasionally reported to cause patent disease although no systematic survey of the effect of infection on agricultural productivity has been performed. Here, a detailed genome sequence and a transcriptome analysis of gene expression in bloodstream form T. theileri have been performed. Analysis of the genome sequence and expression showed that T. theileri has a typical kinetoplastid genome structure and allowed a prediction that it is capable of meiotic exchange, gene silencing via RNA interference and, potentially, density-dependent growth control. In particular, the transcriptome analysis has allowed a comparison of two distinct trypanosome cell surfaces, T. brucei and T. theileri, that have each evolved to enable the maintenance of a long-term extracellular infection in cattle. The T. theileri cell surface can be modeled to contain a mixture of proteins encoded by four novel large and divergent gene families and by members of a major surface protease gene family. This surface composition is distinct from the uniform variant surface glycoprotein coat on African trypanosomes providing an insight into a second mechanism used by trypanosome species that proliferate in an extracellular milieu in vertebrate hosts to avoid the adaptive immune response.
Affiliation(s)
- Steven Kelly
- Department of Plant Sciences, University of Oxford, United Kingdom
| | - Alasdair Ivens
- Centre for Immunity, Infection and Evolution and Institute for Immunology and Infection Research, School of Biological Sciences, University of Edinburgh, United Kingdom
| | - G Adam Mott
- Centre for Immunity, Infection and Evolution and Institute for Immunology and Infection Research, School of Biological Sciences, University of Edinburgh, United Kingdom
| | - Ellis O'Neill
- Department of Plant Sciences, University of Oxford, United Kingdom
| | - David Emms
- Department of Plant Sciences, University of Oxford, United Kingdom
| | - Olivia Macleod
- Department of Biochemistry, University of Cambridge, United Kingdom
| | - Paul Voorheis
- School of Biochemistry and Immunology, Trinity College, Dublin, Ireland
| | - Kevin Tyler
- Norwich Medical School, University of East Anglia, Norwich Research Park, Norwich, Norfolk, United Kingdom
| | - Matthew Clark
- Earlham Institute, Norwich Research Park, Norwich, Norfolk, United Kingdom
| | - Jacqueline Matthews
- Moredun Research Institute, Pentlands Science Park, Bush Loan, Penicuik, Midlothian, United Kingdom
| | - Keith Matthews
- Centre for Immunity, Infection and Evolution and Institute for Immunology and Infection Research, School of Biological Sciences, University of Edinburgh, United Kingdom
| | - Mark Carrington
- Department of Biochemistry, University of Cambridge, United Kingdom
| |
|
31
|
Chemokine C-C motif ligand 33 is a key regulator of teleost fish barbel development. Proc Natl Acad Sci U S A 2018; 115:E5018-E5027. [PMID: 29760055 DOI: 10.1073/pnas.1718603115] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/18/2022] Open
Abstract
Barbels are important sensory organs in teleosts, reptiles, and amphibians. The majority of ∼4,000 catfish species, such as the channel catfish (Ictalurus punctatus), possess abundant whisker-like barbels. However, barbel-less catfish, such as the bottlenose catfish (Ageneiosus marmoratus), do exist. Barbeled catfish and barbel-less catfish are ideal natural models for determining the genomic basis of barbel development. In this work, we generated and annotated the genome sequences of the bottlenose catfish, conducted comparative and subtractive analyses using genome and transcriptome datasets, and identified differentially expressed genes during barbel regeneration. Here, we report chemokine C-C motif ligand 33 (ccl33) as a key regulator of barbel development and regeneration. It is present in barbeled fish but absent in barbel-less fish. The ccl33 genes are differentially expressed during barbel regeneration, with timing concordant with that of barbel regeneration. Knockout of ccl33 genes in the zebrafish (Danio rerio) resulted in various phenotypes, including complete loss of barbels, reduced barbel sizes, and curly barbels, suggesting that ccl33 is a key regulator of barbel development. Expression analysis indicated that paralogs of the ccl33 gene have both shared and specific expression patterns and are most notably expressed at high levels in various parts of the head, such as the eye, brain, and mouth areas, supporting its role in barbel development.
|
32
|
Li F, Harkess A. A guide to sequence your favorite plant genomes. APPLICATIONS IN PLANT SCIENCES 2018; 6:e1030. [PMID: 29732260 PMCID: PMC5895188 DOI: 10.1002/aps3.1030] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 09/20/2017] [Accepted: 11/29/2017] [Indexed: 05/12/2023]
Abstract
With the rapid development of sequencing technology and the plummeting cost, assembling whole genomes from non-model plants will soon become routine for plant systematists and evolutionary biologists. Here we summarize and compare several of the latest genome sequencing and assembly approaches, offering a practical guide on how to approach a genome project. We also highlight certain precautions that need to be taken before investing time and money into a genome project.
Affiliation(s)
- Fay‐Wei Li
- Boyce Thompson Institute, Ithaca, New York 14853, USA
- Plant Biology Section, Cornell University, Ithaca, New York 14853, USA
| | - Alex Harkess
- Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA
| |
|
33
|
Abstract
A high-quality, annotated genome assembly is the foundation for many downstream studies. However, obtaining such an assembly is a complex, reiterative process that requires the assimilation of high-quality data and combines different approaches and data types. While some software packages incorporating multiple steps of genome assembly are commercially available, they may not be flexible enough to be routinely applied to all organisms, particularly to nonmodel species such as pathogenic oomycetes and fungi. If researchers understand and apply the most appropriate, currently available tools for each step, it is possible to customize parameters and optimize results for their organism of study. Based on our experience of de novo assembly and annotation of several oomycete species, this chapter provides a modular workflow from processing of raw reads, to initial assembly generation, through optimization, chromosome-scale scaffolding and annotation, outlining input and output data as well as examples and alternative software used for each step. The accompanying Notes provide background information for each step as well as alternative options. The final result of this workflow could be an annotated, high-quality, validated, chromosome-scale assembly or a draft assembly of sufficient quality to meet specific needs of a project.
Affiliation(s)
- Kyle Fletcher
- The Genome Center, Genome and Biomedical Sciences Facility, University of California, Davis, CA, USA
| | - Richard Michelmore
- The Genome Center, Genome and Biomedical Sciences Facility, University of California, Davis, CA, USA.
| |
|
34
|
Li M, Wu B, Yan X, Luo J, Pan Y, Wu FX, Wang J. PECC: Correcting contigs based on paired-end read distribution. Comput Biol Chem 2017; 69:178-184. [DOI: 10.1016/j.compbiolchem.2017.03.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/27/2017] [Accepted: 03/27/2017] [Indexed: 11/26/2022]
|
35
|
Tracing Genetic Exchange and Biogeography of Cryptococcus neoformans var. grubii at the Global Population Level. Genetics 2017; 207:327-346. [PMID: 28679543 PMCID: PMC5586382 DOI: 10.1534/genetics.117.203836] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Received: 05/15/2017] [Accepted: 06/28/2017] [Indexed: 11/18/2022] Open
Abstract
Cryptococcus neoformans var. grubii is the causative agent of cryptococcal meningitis, a significant source of mortality in immunocompromised individuals, typically human immunodeficiency virus/AIDS patients from developing countries. Despite the worldwide emergence of this ubiquitous infection, little is known about the global molecular epidemiology of this fungal pathogen. Here we sequence the genomes of 188 diverse isolates and characterize the major subdivisions, their relative diversity, and the level of genetic exchange between them. While most isolates of C. neoformans var. grubii belong to one of three major lineages (VNI, VNII, and VNB), some haploid isolates show hybrid ancestry, including some that appear to have recently interbred, based on the detection of large blocks of each ancestry across each chromosome. Many isolates display evidence of aneuploidy, which was detected for all chromosomes. In diploid isolates of C. neoformans var. grubii (serotype AA) and of hybrids with C. neoformans var. neoformans (serotype AD), such aneuploidies have resulted in loss of heterozygosity, where a chromosomal region is represented by the genotype of only one parental isolate. Phylogenetic and population genomic analyses of isolates from Brazil reveal that the previously "African" VNB lineage occurs naturally in the South American environment. This suggests migration of the VNB lineage between Africa and South America prior to its diversification, supported by finding ancestral recombination events between isolates from different lineages and regions. The results provide evidence of substantial population structure, with all lineages showing multi-continental distributions, demonstrating the highly dispersive nature of this pathogen.
|
36
|
Li M, Liao Z, He Y, Wang J, Luo J, Pan Y. ISEA: Iterative Seed-Extension Algorithm for De Novo Assembly Using Paired-End Information and Insert Size Distribution. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:916-925. [PMID: 27076460 DOI: 10.1109/tcbb.2016.2550433] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]
Abstract
The purpose of de novo assembly is to report more contiguous, more complete, and less error-prone contigs. Thanks to the advent of next-generation sequencing (NGS) technologies, the cost of producing high-depth reads has been greatly reduced. However, due to the disadvantages of NGS, de novo assembly has to face the difficulties brought by repeat regions, error rate, and low sequencing coverage in some regions. Although many de novo algorithms have been proposed to solve these problems, de novo assembly remains a challenge. In this article, we developed an iterative seed-extension algorithm for de novo assembly, called ISEA. To avoid the negative impact induced by error rate, ISEA utilizes read overlap and paired-end information to correct erroneous reads before assembling. While extending seeds in a De Bruijn graph, ISEA uses an elaborately designed score function based on paired-end information and the distribution of insert size to solve the repeat region problem. By employing the distribution of insert size, the score function can also reduce the influence of erroneous reads. In scaffolding, ISEA adopts a relaxed strategy to join contigs that were terminated for low coverage during the extension. The performance of ISEA was compared with six previous popular assemblers on four real datasets. The experimental results demonstrate that ISEA can effectively obtain longer and more accurate scaffolds.
|
37
|
Genome Sequences of Salisediminibacterium haloalkalitolerans 10nlg, Bacillus lonarensis 25nlg, Bacillus caseinilyticus SP, Pelagirhabdus alkalitolerans S5, Salibacterium halotolerans S7 and Salipaludibacillus aurantiacus S9 six novel, Recently Described Compatible Solute Producing Bacteria. JOURNAL OF PURE AND APPLIED MICROBIOLOGY 2017. [DOI: 10.22207/jpam.11.2.26] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
|
38
|
Deng CH, Plummer KM, Jones DAB, Mesarich CH, Shiller J, Taranto AP, Robinson AJ, Kastner P, Hall NE, Templeton MD, Bowen JK. Comparative analysis of the predicted secretomes of Rosaceae scab pathogens Venturia inaequalis and V. pirina reveals expanded effector families and putative determinants of host range. BMC Genomics 2017; 18:339. [PMID: 28464870 PMCID: PMC5412055 DOI: 10.1186/s12864-017-3699-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 12/09/2016] [Accepted: 04/11/2017] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Fungal plant pathogens belonging to the genus Venturia cause damaging scab diseases of members of the Rosaceae. In terms of economic impact, the most important of these are V. inaequalis, which infects apple, and V. pirina, which is a pathogen of European pear. Given that Venturia fungi colonise the sub-cuticular space without penetrating plant cells, it is assumed that effectors that contribute to virulence and determination of host range will be secreted into this plant-pathogen interface. Thus the predicted secretomes of a range of isolates of Venturia with distinct host-ranges were interrogated to reveal putative proteins involved in virulence and pathogenicity. RESULTS Genomes of Venturia pirina (one European pear scab isolate) and Venturia inaequalis (three apple scab, and one loquat scab, isolates) were sequenced and the predicted secretomes of each isolate identified. RNA-Seq was conducted on the apple-specific V. inaequalis isolate Vi1 (in vitro and infected apple leaves) to highlight virulence and pathogenicity components of the secretome. Genes encoding over 600 small secreted proteins (candidate effectors) were identified, most of which are novel to Venturia, with expansion of putative effector families a feature of the genus. Numerous genes with similarity to Leptosphaeria maculans AvrLm6 and the Verticillium spp. Ave1 were identified. Candidates for avirulence effectors with cognate resistance genes involved in race-cultivar specificity were identified, as were putative proteins involved in host-species determination. Candidate effectors were found, on average, to be in regions of relatively low gene-density and in closer proximity to repeats (e.g. transposable elements), compared with core eukaryotic genes. CONCLUSIONS Comparative secretomics has revealed candidate effectors from Venturia fungal plant pathogens that attack pome fruit. Effectors that are putative determinants of host range were identified, including those that may be involved in race-cultivar specificity and those that may be involved in host-species specificity. Since many of the effector candidates are in close proximity to repetitive sequences, this may point to a possible mechanism for the effector gene family expansion observed and a route to diversification via transposition and repeat-induced point mutation.
Affiliation(s)
- Cecilia H. Deng
- The New Zealand Institute for Plant & Food Research Limited (PFR), Auckland, New Zealand
| | - Kim M. Plummer
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
- Plant Biosecurity Cooperative Research Centre, Bruce, ACT Australia
| | - Darcy A. B. Jones
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
- Present Address: The Centre for Crop and Disease Management, Curtin University, Bentley, Australia
| | - Carl H. Mesarich
- The New Zealand Institute for Plant & Food Research Limited (PFR), Auckland, New Zealand
- The School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Present Address: Institute of Agriculture & Environment, Massey University, Palmerston North, New Zealand
| | - Jason Shiller
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
- Present Address: INRA-Angers, Beaucouzé, Cedex, France
| | - Adam P. Taranto
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
- Plant Sciences Division, Research School of Biology, The Australian National University, Canberra, Australia
| | - Andrew J. Robinson
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
- Life Sciences Computation Centre, Victorian Life Sciences Computation Initiative (VLSCI), Victoria, Australia
| | - Patrick Kastner
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
| | - Nathan E. Hall
- Animal, Plant & Soil Sciences Department, AgriBio Centre for AgriBioscience, La Trobe University, Melbourne, Victoria Australia
- Life Sciences Computation Centre, Victorian Life Sciences Computation Initiative (VLSCI), Victoria, Australia
| | - Matthew D. Templeton
- The New Zealand Institute for Plant & Food Research Limited (PFR), Auckland, New Zealand
- The School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Joanna K. Bowen
- The New Zealand Institute for Plant & Food Research Limited (PFR), Auckland, New Zealand
|
39
|
Permanent Draft Genome Sequence of Desulfurococcus amylolyticus Strain Z-533T, a Peptide and Starch Degrader Isolated from Thermal Springs in the Kamchatka Peninsula and Kunashir Island, Russia. GENOME ANNOUNCEMENTS 2017; 5:5/15/e00078-17. [PMID: 28408663 PMCID: PMC5391401 DOI: 10.1128/genomea.00078-17] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
Desulfurococcus amylolyticus Z-533T, a hyperthermophilic crenarchaeon, ferments peptide and starch, generating acetate, isobutyrate, isovalerate, CO2, and hydrogen. Unlike D. amylolyticus Z-1312, it cannot use cellulose and is inhibited by hydrogen. The reported draft genome sequence of D. amylolyticus Z-533T will help to understand the molecular basis for these differences.
|
40
|
Zhao H, Sun H, Li L, Lou Y, Li R, Qi L, Gao Z. Transcriptome-based investigation of cirrus development and identifying microsatellite markers in rattan (Daemonorops jenkinsiana). Sci Rep 2017; 7:46107. [PMID: 28383053 PMCID: PMC5382692 DOI: 10.1038/srep46107] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2016] [Accepted: 03/08/2017] [Indexed: 11/09/2022] Open
Abstract
Rattan is an important group of regenerating non-wood climbing palms in tropical forests. The cirrus is an essential climbing organ and provides morphological evidence for evolutionary and taxonomic studies. However, limited data are available on the molecular mechanisms underlying the development of the cirrus. Thus, we performed in-depth transcriptomic sequencing analyses to characterize cirrus development at different developmental stages of Daemonorops jenkinsiana. The results showed that 404,875 transcripts were assembled, including 61,569 high-quality unigenes, of which approximately 76.16% were annotated and classified by seven authorized databases. Moreover, a comprehensive analysis of the gene expression profiles identified differentially expressed genes (DEGs) concentrated in developmental pathways, cell wall metabolism, and hook formation between the different stages of the cirri. Among them, 37 DEGs were validated by qRT-PCR. Furthermore, 14,693 transcriptome-based microsatellites were identified. Of the 168 designed SSR primer pairs, 153 were validated and 16 pairs were utilized for the polymorphic analysis of 25 rattan accessions. These findings can be used to interpret the molecular mechanisms of cirrus development, and the developed microsatellite markers provide valuable data for assisting rattan taxonomy and expanding the understanding of genomic study in rattan.
Affiliation(s)
- Hansheng Zhao
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, International Center for Bamboo and Rattan, Beijing 100102, China
| | - Huayu Sun
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, International Center for Bamboo and Rattan, Beijing 100102, China
| | - Lichao Li
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, International Center for Bamboo and Rattan, Beijing 100102, China
| | - Yongfeng Lou
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, International Center for Bamboo and Rattan, Beijing 100102, China
| | - Rongsheng Li
- Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou, 510000, China
| | - Lianghua Qi
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, International Center for Bamboo and Rattan, Beijing 100102, China
| | - Zhimin Gao
- State Forestry Administration Key Open Laboratory on the Science and Technology of Bamboo and Rattan, International Center for Bamboo and Rattan, Beijing 100102, China
|
41
|
Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ 2016; 4:e2584. [PMID: 27781170 PMCID: PMC5075697 DOI: 10.7717/peerj.2584] [Citation(s) in RCA: 4775] [Impact Index Per Article: 596.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2016] [Accepted: 09/17/2016] [Indexed: 12/16/2022] Open
Abstract
Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences; a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results: VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-end read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end read merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion: VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.
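The two-stage search described in this abstract (a shared-word prefilter followed by full dynamic-programming global alignment) can be illustrated with a minimal Python sketch. This is not VSEARCH code: the function names, the word length `k`, the linear gap penalty, and the scoring values are all assumptions chosen for illustration, and the vectorisation and multithreading mentioned in the abstract are omitted.

```python
def kmers(seq, k):
    """Return the set of distinct words (k-mers) of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_words(query, target, k):
    """Count distinct words shared by query and target (prefilter score)."""
    return len(kmers(query, k) & kmers(target, k))


def global_align_score(a, b, match=2, mismatch=-4, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not VSEARCH defaults.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def search(query, targets, k=4, max_candidates=2):
    """Rank targets by shared words, then align only the top candidates."""
    ranked = sorted(
        targets,
        key=lambda name: shared_words(query, targets[name], k),
        reverse=True,
    )
    candidates = ranked[:max_candidates]
    return max(candidates, key=lambda name: global_align_score(query, targets[name]))


# Hypothetical toy database: names and sequences are made up for the example.
targets = {
    "t1": "ACGTACGTACGT",
    "t2": "TTTTGGGGCCCC",
    "t3": "ACGTACGAACGT",
}
print(search("ACGTACGTACGT", targets))
```

The prefilter is cheap (set intersection), so the quadratic-time alignment runs only on the few targets most likely to be similar; that is the design trade-off the abstract describes.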
Affiliation(s)
- Torbjørn Rognes
- Department of Informatics, University of Oslo, Oslo, Norway; Department of Microbiology, Oslo University Hospital, Oslo, Norway
| | - Tomáš Flouri
- Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Ben Nichols
- School of Engineering, University of Glasgow, Glasgow, United Kingdom
| | - Christopher Quince
- School of Engineering, University of Glasgow, Glasgow, United Kingdom; Warwick Medical School, University of Warwick, Coventry, United Kingdom
| | - Frédéric Mahé
- Department of Ecology, University of Kaiserslautern, Kaiserslautern, Germany; UMR LSTM, CIRAD, Montpellier, France
|
42
|
Yang J, Liu D, Wang X, Ji C, Cheng F, Liu B, Hu Z, Chen S, Pental D, Ju Y, Yao P, Li X, Xie K, Zhang J, Wang J, Liu F, Ma W, Shopan J, Zheng H, Mackenzie SA, Zhang M. The genome sequence of allopolyploid Brassica juncea and analysis of differential homoeolog gene expression influencing selection. Nat Genet 2016; 48:1225-1232. [PMID: 27595476 DOI: 10.1038/ng.3657] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2016] [Accepted: 07/21/2016] [Indexed: 05/18/2023]
Abstract
The Brassica genus encompasses three diploid and three allopolyploid genomes, but a clear understanding of the evolution of agriculturally important traits via polyploidy is lacking. We assembled an allopolyploid Brassica juncea genome by shotgun and single-molecule reads integrated to genomic and genetic maps. We discovered that the A subgenomes of B. juncea and Brassica napus each had independent origins. Results suggested that A subgenomes of B. juncea were of monophyletic origin and evolved into vegetable-use and oil-use subvarieties. Homoeolog expression dominance occurs between subgenomes of allopolyploid B. juncea, in which differentially expressed genes display more selection potential than neutral genes. Homoeolog expression dominance in B. juncea has facilitated selection of glucosinolate and lipid metabolism genes in subvarieties used as vegetables and for oil production. These homoeolog expression dominance relationships among Brassicaceae genomes have contributed to selection response, predicting the directional effects of selection in a polyploid crop genome.
Affiliation(s)
- Jinghua Yang
- Laboratory of Germplasm Innovation and Molecular Breeding, Institute of Vegetable Science, Zhejiang University, Hangzhou, China
- Key Laboratory of Horticultural Plant Growth, Development and Quality Improvement, Ministry of Agriculture, Hangzhou, China
- Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, Hangzhou, China
| | - Dongyuan Liu
- Biomarker Technologies Corporation, Beijing, China
| | - Xiaowu Wang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Science, Beijing, China
| | - Changmian Ji
- Biomarker Technologies Corporation, Beijing, China
| | - Feng Cheng
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Science, Beijing, China
| | - Baoning Liu
- Biomarker Technologies Corporation, Beijing, China
| | - Zhongyuan Hu
- Laboratory of Germplasm Innovation and Molecular Breeding, Institute of Vegetable Science, Zhejiang University, Hangzhou, China
- Key Laboratory of Horticultural Plant Growth, Development and Quality Improvement, Ministry of Agriculture, Hangzhou, China
- Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, Hangzhou, China
| | - Sheng Chen
- School of Plant Biology (M084) and the UWA Institute of Agriculture, University of Western Australia, Perth, Western Australia, Australia
| | - Deepak Pental
- Center for Genetic Manipulation of Crop Plants, University of Delhi South Campus, New Delhi, India
| | - Youhui Ju
- Biomarker Technologies Corporation, Beijing, China
| | - Pu Yao
- Biomarker Technologies Corporation, Beijing, China
| | - Xuming Li
- Biomarker Technologies Corporation, Beijing, China
| | - Kun Xie
- Biomarker Technologies Corporation, Beijing, China
| | | | - Jianlin Wang
- College of Plant Science and Technology, Agricultural and Animal Husbandry College of Tibet University, Linzhi, China
| | - Fan Liu
- Beijing Vegetable Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Weiwei Ma
- Laboratory of Germplasm Innovation and Molecular Breeding, Institute of Vegetable Science, Zhejiang University, Hangzhou, China
| | - Jannat Shopan
- Laboratory of Germplasm Innovation and Molecular Breeding, Institute of Vegetable Science, Zhejiang University, Hangzhou, China
| | | | - Sally A Mackenzie
- Department of Agronomy and Horticulture, University of Nebraska, Lincoln, Nebraska, USA
| | - Mingfang Zhang
- Laboratory of Germplasm Innovation and Molecular Breeding, Institute of Vegetable Science, Zhejiang University, Hangzhou, China
- Key Laboratory of Horticultural Plant Growth, Development and Quality Improvement, Ministry of Agriculture, Hangzhou, China
- Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, Hangzhou, China
|
43
|
The genome sequence of allopolyploid Brassica juncea and analysis of differential homoeolog gene expression influencing selection. Nat Genet 2016; 48:1225-32. [PMID: 27595476 DOI: 10.1038/ng.3657] [Citation(s) in RCA: 284] [Impact Index Per Article: 35.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2016] [Accepted: 07/21/2016] [Indexed: 12/24/2022]
|
44
|
Chawla V, Kumar R, Shankar R. Identifying wrong assemblies in de novo short read primary sequence assembly contigs. J Biosci 2016; 41:455-74. [PMID: 27581937 DOI: 10.1007/s12038-016-9630-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
With the advent of short-read-based genome sequencing approaches, a large number of organisms are being sequenced all over the world. Most of these assemblies are produced using de novo short-read assemblers and related approaches. However, the contigs produced this way are prone to mis-assembly. So far, there is a conspicuous dearth of reliable tools to identify mis-assembled contigs. Mis-assemblies could result from incorrectly deleted or wrongly arranged genomic sequences. In the present work, various factors related to sequence, sequencing and assembly have been assessed for their role in causing mis-assembly by using different genome sequencing data. Finally, some mis-assembly detection tools have been evaluated for their ability to detect wrongly assembled primary contigs, suggesting considerable scope for improvement in this area. The present work also proposes a simple, novel unsupervised learning-based approach to identify mis-assemblies in contigs, which was found to perform reasonably well when compared with existing tools for reporting mis-assembled contigs. It was observed that the proposed methodology may work as a complementary system to the existing tools to enhance their accuracy.
Affiliation(s)
- Vandna Chawla
- Studio of Computational Biology and Bioinformatics, Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
|
45
|
Peña A, Busquets A, Gomila M, Mulet M, Gomila RM, Reddy TBK, Huntemann M, Pati A, Ivanova N, Markowitz V, García-Valdés E, Göker M, Woyke T, Klenk HP, Kyrpides N, Lalucat J. High quality draft genome sequences of Pseudomonas fulva DSM 17717(T), Pseudomonas parafulva DSM 17004(T) and Pseudomonas cremoricolorata DSM 17059(T) type strains. Stand Genomic Sci 2016; 11:55. [PMID: 27594974 PMCID: PMC5009691 DOI: 10.1186/s40793-016-0178-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2015] [Accepted: 08/16/2016] [Indexed: 01/10/2023] Open
Abstract
Pseudomonas has the highest number of species out of any genus of Gram-negative bacteria and is phylogenetically divided into several groups. The Pseudomonas putida phylogenetic branch includes at least 13 species of environmental and industrial interest, plant-associated bacteria, insect pathogens, and even some members that have been found in clinical specimens. In the context of the Genomic Encyclopedia of Bacteria and Archaea project, we present the permanent, high-quality draft genomes of the type strains of 3 taxonomically and ecologically closely related species in the Pseudomonas putida phylogenetic branch: Pseudomonas fulva DSM 17717T, Pseudomonas parafulva DSM 17004T and Pseudomonas cremoricolorata DSM 17059T. All three genomes are comparable in size (4.6–4.9 Mb), with 4,119–4,459 protein-coding genes. Average nucleotide identity based on BLAST comparisons and digital genome-to-genome distance calculations are in good agreement with experimental DNA-DNA hybridization results. The genome sequences presented here will be very helpful in elucidating the taxonomy, phylogeny and evolution of the Pseudomonas putida species complex.
Affiliation(s)
- Arantxa Peña
- Department of Biology-Microbiology, Universitat de les Illes Balears, Campus UIB, Crtra. Valldemossa km 7.5, 07122 Palma de Mallorca, Spain
| | - Antonio Busquets
- Department of Biology-Microbiology, Universitat de les Illes Balears, Campus UIB, Crtra. Valldemossa km 7.5, 07122 Palma de Mallorca, Spain
| | - Margarita Gomila
- Department of Biology-Microbiology, Universitat de les Illes Balears, Campus UIB, Crtra. Valldemossa km 7.5, 07122 Palma de Mallorca, Spain
| | - Magdalena Mulet
- Department of Biology-Microbiology, Universitat de les Illes Balears, Campus UIB, Crtra. Valldemossa km 7.5, 07122 Palma de Mallorca, Spain
| | - Rosa M Gomila
- Serveis Cientifico-Tècnics, Universitat de les Illes Balears, Palma de Mallorca, Spain
| | - T B K Reddy
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA
| | - Marcel Huntemann
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA
| | - Amrita Pati
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA
| | - Natalia Ivanova
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA
| | - Victor Markowitz
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA
| | - Elena García-Valdés
- Department of Biology-Microbiology, Universitat de les Illes Balears, Campus UIB, Crtra. Valldemossa km 7.5, 07122 Palma de Mallorca, Spain; Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Palma de Mallorca, Spain
| | - Markus Göker
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, 38124 Braunschweig, Germany
| | - Tanja Woyke
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA
| | - Hans-Peter Klenk
- School of Biology, Newcastle University, Newcastle upon Tyne, NE1 7RU UK
| | - Nikos Kyrpides
- DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598-1698 USA; Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Jorge Lalucat
- Department of Biology-Microbiology, Universitat de les Illes Balears, Campus UIB, Crtra. Valldemossa km 7.5, 07122 Palma de Mallorca, Spain; Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Palma de Mallorca, Spain
|
46
|
Akogwu I, Wang N, Zhang C, Gong P. A comparative study of k-spectrum-based error correction methods for next-generation sequencing data analysis. Hum Genomics 2016; 10 Suppl 2:20. [PMID: 27461106 PMCID: PMC4965716 DOI: 10.1186/s40246-016-0068-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open
Abstract
Background: Innumerable opportunities for new genomic research have been stimulated by advances in high-throughput next-generation sequencing (NGS). However, a pitfall of NGS data abundance is the difficulty of distinguishing true biological variants from sequencing errors during downstream analysis. Many error correction methods have been developed to correct erroneous NGS reads before further analysis, but independent evaluation of the impact of such dataset features as read length, genome size, and coverage depth on their performance is lacking. This comparative study aims to investigate the strengths, weaknesses, and limitations of some of the newest k-spectrum-based methods and to provide recommendations for users in selecting suitable methods with respect to specific NGS datasets. Methods: Six k-spectrum-based methods, i.e., Reptile, Musket, Bless, Bloocoo, Lighter, and Trowel, were compared using six simulated sets of paired-end Illumina sequencing data. These NGS datasets varied in coverage depth (10× to 120×), read length (36 to 100 bp), and genome size (4.6 to 143 MB). Error Correction Evaluation Toolkit (ECET) was employed to derive a suite of metrics (i.e., true positives, false positives, false negatives, recall, precision, gain, and F-score) for assessing the correction quality of each method. Results: Results from computational experiments indicate that Musket had the best overall performance across the spectra of examined variants reflected in the six datasets. The lowest accuracy of Musket (F-score = 0.81) occurred on a dataset with a medium read length (56 bp), a medium coverage (50×), and a small-sized genome (5.4 MB). The other five methods underperformed (F-score < 0.80) and/or failed to process one or more datasets. Conclusions: This study demonstrates that various factors such as coverage depth, read length, and genome size may influence performance of individual k-spectrum-based error correction methods. Thus, care must be taken in choosing appropriate methods for error correction of specific NGS datasets. Based on our comparative study, we recommend Musket as the top choice because of its consistently superior performance across all six testing datasets. Further extensive studies are warranted to assess these methods using experimental datasets generated by NGS platforms (e.g., 454, SOLiD, and Ion Torrent) under more diversified parameter settings (k-mer values and edit distances) and to compare them against other non-k-spectrum-based classes of error correction methods.
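The abstract names the evaluation metrics but does not give their formulas. The Python sketch below shows how they are commonly computed from per-base correction counts. It assumes the usual definitions from the error-correction literature, in particular gain = (TP − FP) / (TP + FN); the function name and the example counts are hypothetical and are not taken from the study.

```python
def correction_metrics(tp, fp, fn):
    """Compute read-error-correction quality metrics from raw counts.

    tp: erroneous bases correctly fixed
    fp: correct bases wrongly changed
    fn: erroneous bases left uncorrected
    The formulas follow commonly used definitions (an assumption here).
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Gain is negative when a tool introduces more errors than it removes.
    gain = (tp - fp) / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "gain": gain,
        "f_score": f_score,
    }


# Hypothetical counts, for illustration only.
metrics = correction_metrics(tp=80, fp=20, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike precision and recall, gain penalises false positives directly, so a tool that miscorrects heavily can score below zero even when its recall is high.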
Affiliation(s)
- Isaac Akogwu
- School of Computing, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Nan Wang
- School of Computing, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Chaoyang Zhang
- School of Computing, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Ping Gong
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
|
47
|
An unusual strategy for the anoxic biodegradation of phthalate. ISME JOURNAL 2016; 11:224-236. [PMID: 27392087 DOI: 10.1038/ismej.2016.91] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Received: 03/23/2016] [Revised: 05/23/2016] [Accepted: 05/31/2016] [Indexed: 11/09/2022]
Abstract
In the past two decades, the study of oxygen-independent degradation of widely abundant aromatic compounds in anaerobic bacteria has revealed numerous unprecedented enzymatic principles. Surprisingly, the organisms, metabolites and enzymes involved in the degradation of o-phthalate (1,2-dicarboxybenzene), mainly derived from phthalate esters that are annually produced at the million ton scale, are sparsely known. Here, we demonstrate a previously unknown capacity of complete phthalate degradation in established aromatic compound-degrading, denitrifying model organisms of the genera Thauera, Azoarcus and 'Aromatoleum'. Differential proteome analyses revealed phthalate-induced gene clusters involved in uptake and conversion of phthalate to the central intermediate benzoyl-CoA. Enzyme assays provided in vitro evidence for the formation of phthaloyl-CoA by a succinyl-CoA- and phthalate-specific CoA transferase, which is essential for the subsequent oxygen-sensitive decarboxylation to benzoyl-CoA. The extreme instability of the phthaloyl-CoA intermediate requires highly balanced CoA transferase and decarboxylase activities to avoid its cellular accumulation. Phylogenetic analysis revealed phthaloyl-CoA decarboxylase as a novel member of the UbiD-like, (de)carboxylase enzyme family. Homologs of the encoding gene form a phylogenetic cluster and are found in soil, freshwater and marine bacteria; an ongoing global distribution of a possibly only recently evolved degradation pathway is suggested.
|
48
|
Zhang J, Kudrna D, Mu T, Li W, Copetti D, Yu Y, Goicoechea JL, Lei Y, Wing RA. Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences. Bioinformatics 2016; 32:3058-3064. [PMID: 27318200 PMCID: PMC5048067 DOI: 10.1093/bioinformatics/btw370] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2016] [Accepted: 06/06/2016] [Indexed: 12/16/2022] Open
Abstract
Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. Results: With GPM, loaded datasets can be connected to each other via their logical relationships, which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. Availability and Implementation: The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jianwei Zhang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China; Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Dave Kudrna
- Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Ting Mu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
| | - Weiming Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
| | - Dario Copetti
- Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA; International Rice Research Institute, Genetic Resource Center, Los Baños, Laguna, Philippines
| | - Yeisoo Yu
- Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Jose Luis Goicoechea
- Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Yang Lei
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
| | - Rod A Wing
- Arizona Genomics Institute and BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA; International Rice Research Institute, Genetic Resource Center, Los Baños, Laguna, Philippines
|
49
|
Ma L, Chen Z, Huang DW, Kutty G, Ishihara M, Wang H, Abouelleil A, Bishop L, Davey E, Deng R, Deng X, Fan L, Fantoni G, Fitzgerald M, Gogineni E, Goldberg JM, Handley G, Hu X, Huber C, Jiao X, Jones K, Levin JZ, Liu Y, Macdonald P, Melnikov A, Raley C, Sassi M, Sherman BT, Song X, Sykes S, Tran B, Walsh L, Xia Y, Yang J, Young S, Zeng Q, Zheng X, Stephens R, Nusbaum C, Birren BW, Azadi P, Lempicki RA, Cuomo CA, Kovacs JA. Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts. Nat Commun 2016; 7:10740. [PMID: 26899007 PMCID: PMC4764891 DOI: 10.1038/ncomms10740] [Citation(s) in RCA: 117] [Impact Index Per Article: 14.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2015] [Accepted: 01/13/2016] [Indexed: 02/07/2023] Open
Abstract
Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses.
Affiliation(s)
- Liang Ma
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Zehua Chen
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Da Wei Huang
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Geetha Kutty
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Mayumi Ishihara
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, USA
- Honghui Wang
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Amr Abouelleil
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Lisa Bishop
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Emma Davey
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Rebecca Deng
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Xilong Deng
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Lin Fan
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Giovanna Fantoni
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Michael Fitzgerald
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Emile Gogineni
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Jonathan M. Goldberg
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Grace Handley
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Xiaojun Hu
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Charles Huber
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Xiaoli Jiao
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Kristine Jones
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Joshua Z. Levin
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Yueqin Liu
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Pendexter Macdonald
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Alexandre Melnikov
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Castle Raley
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Monica Sassi
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Brad T. Sherman
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Xiaohong Song
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Sean Sykes
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Bao Tran
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Laura Walsh
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Yun Xia
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
- Jun Yang
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Sarah Young
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Qiandong Zeng
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Xin Zheng
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Robert Stephens
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Chad Nusbaum
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Bruce W. Birren
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Parastoo Azadi
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, USA
- Richard A. Lempicki
- Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, USA
- Christina A. Cuomo
- Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA
- Joseph A. Kovacs
- Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA
|
50
|
Susanti D, Johnson EF, Lapidus A, Han J, Reddy TBK, Pilay M, Ivanova NN, Markowitz VM, Woyke T, Kyrpides NC, Mukhopadhyay B. Permanent draft genome sequence of Desulfurococcus mobilis type strain DSM 2161, a thermoacidophilic sulfur-reducing crenarchaeon isolated from acidic hot springs of Hveravellir, Iceland. Stand Genomic Sci 2016; 11:3. [PMID: 26767090 PMCID: PMC4711178 DOI: 10.1186/s40793-015-0128-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/06/2015] [Accepted: 12/30/2015] [Indexed: 11/10/2022] Open
Abstract
This report presents the permanent draft genome sequence of Desulfurococcus mobilis type strain DSM 2161, an obligate anaerobic hyperthermophilic crenarchaeon that was isolated from acidic hot springs in Hveravellir, Iceland. D. mobilis utilizes peptides as carbon and energy sources and reduces elemental sulfur to H2S. A metabolic reconstruction derived from the draft genome identified putative pathways for peptide degradation and sulfur respiration in this archaeon. The presence of several hydrogenase genes in the genome supports previous findings that H2 is produced during the growth of D. mobilis in the absence of sulfur. Interestingly, genes encoding glucose transport and utilization systems also exist in the D. mobilis genome, although this archaeon does not utilize carbohydrates for growth. The draft genome of D. mobilis provides an additional means for comparative genomic analysis of desulfurococci. In addition, our analysis of the Average Nucleotide Identity between D. mobilis and Desulfurococcus mucosus suggests that these two desulfurococci are different strains of the same species.
Affiliation(s)
- Dwi Susanti
- Department of Biochemistry, Virginia Tech, Blacksburg, VA 24061 USA
- Eric F. Johnson
- Biocomplexity Institute, Virginia Tech, Blacksburg, VA 24061 USA
- Alla Lapidus
- Centre for Algorithmic Biotechnology, St. Petersburg State University, St. Petersburg, Russia
- Algorithmic Biology Lab, St. Petersburg Academic University, St. Petersburg, Russia
- James Han
- US DOE Joint Genome Institute, Walnut Creek, California 94598 USA
- T. B. K. Reddy
- US DOE Joint Genome Institute, Walnut Creek, California 94598 USA
- Manoj Pilay
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California USA
- Victor M. Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California USA
- Tanja Woyke
- US DOE Joint Genome Institute, Walnut Creek, California 94598 USA
- Nikos C. Kyrpides
- US DOE Joint Genome Institute, Walnut Creek, California 94598 USA
- Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Biswarup Mukhopadhyay
- Department of Biochemistry, Virginia Tech, Blacksburg, VA 24061 USA
- Biocomplexity Institute, Virginia Tech, Blacksburg, VA 24061 USA
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061 USA
|