1
Unneberg P, Larsson M, Olsson A, Wallerman O, Petri A, Bunikis I, Vinnere Pettersson O, Papetti C, Gislason A, Glenner H, Cartes JE, Blanco-Bercial L, Eriksen E, Meyer B, Wallberg A. Ecological genomics in the Northern krill uncovers loci for local adaptation across ocean basins. Nat Commun 2024; 15:6297. [PMID: 39090106 PMCID: PMC11294593 DOI: 10.1038/s41467-024-50239-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 05/15/2024] [Indexed: 08/04/2024]
Abstract
Krill are vital as food for many marine animals but also impacted by global warming. To learn how they and other zooplankton may adapt to a warmer world we studied local adaptation in the widespread Northern krill (Meganyctiphanes norvegica). We assemble and characterize its large genome and compare genome-scale variation among 74 specimens from the colder Atlantic Ocean and warmer Mediterranean Sea. The 19 Gb genome likely evolved through proliferation of retrotransposons, now targeted for inactivation by extensive DNA methylation, and contains many duplicated genes associated with molting and vision. Analysis of 760 million SNPs indicates extensive homogenizing gene-flow among populations. Nevertheless, we detect signatures of adaptive divergence across hundreds of genes, implicated in photoreception, circadian regulation, reproduction and thermal tolerance, indicating polygenic adaptation to light and temperature. The top gene candidate for ecological adaptation was nrf-6, a lipid transporter with a Mediterranean variant that may contribute to early spring reproduction. Such variation could become increasingly important for fitness in Atlantic stocks. Our study underscores the widespread but uneven distribution of adaptive variation, necessitating characterization of genetic variation among natural zooplankton populations to understand their adaptive potential, predict risks and support ocean conservation in the face of climate change.
Affiliation(s)
- Per Unneberg
- Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Mårten Larsson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
- Anna Olsson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden
- Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden
- Anna Petri
- Uppsala Genome Center, Department of Immunology, Genetics and Pathology, Uppsala University, National Genomics Infrastructure hosted by SciLifeLab, Uppsala, Sweden
- Ignas Bunikis
- Uppsala Genome Center, Department of Immunology, Genetics and Pathology, Uppsala University, National Genomics Infrastructure hosted by SciLifeLab, Uppsala, Sweden
- Olga Vinnere Pettersson
- Uppsala Genome Center, Department of Immunology, Genetics and Pathology, Uppsala University, National Genomics Infrastructure hosted by SciLifeLab, Uppsala, Sweden
- Astthor Gislason
- Marine and Freshwater Research Institute, Pelagic Division, Reykjavik, Iceland
- Henrik Glenner
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- Center for Macroecology, Evolution and Climate Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Joan E Cartes
- Instituto de Ciencias del Mar (ICM-CSIC), Barcelona, Spain
- Bettina Meyer
- Section Polar Biological Oceanography, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
- Institute for Chemistry and Biology of the Marine Environment, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Helmholtz Institute for Functional Marine Biodiversity (HIFMB), University of Oldenburg, Oldenburg, Germany
- Andreas Wallberg
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden.
2
Parisot N, Vargas-Chávez C, Goubert C, Baa-Puyoulet P, Balmand S, Beranger L, Blanc C, Bonnamour A, Boulesteix M, Burlet N, Calevro F, Callaerts P, Chancy T, Charles H, Colella S, Da Silva Barbosa A, Dell'Aglio E, Di Genova A, Febvay G, Gabaldón T, Galvão Ferrarini M, Gerber A, Gillet B, Hubley R, Hughes S, Jacquin-Joly E, Maire J, Marcet-Houben M, Masson F, Meslin C, Montagné N, Moya A, Ribeiro de Vasconcelos AT, Richard G, Rosen J, Sagot MF, Smit AFA, Storer JM, Vincent-Monegat C, Vallier A, Vigneron A, Zaidman-Rémy A, Zamoum W, Vieira C, Rebollo R, Latorre A, Heddi A. The transposable element-rich genome of the cereal pest Sitophilus oryzae. BMC Biol 2021; 19:241. [PMID: 34749730 PMCID: PMC8576890 DOI: 10.1186/s12915-021-01158-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/08/2021] [Accepted: 09/27/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND: The rice weevil Sitophilus oryzae is one of the most important agricultural pests, causing extensive damage to cereal in fields and to stored grains. S. oryzae has an intracellular symbiotic relationship (endosymbiosis) with the Gram-negative bacterium Sodalis pierantonius and is a valuable model to decipher host-symbiont molecular interactions. RESULTS: We sequenced the Sitophilus oryzae genome using a combination of short and long reads to produce the best assembly for a Curculionidae species to date. We show that S. oryzae has undergone successive bursts of transposable element (TE) amplification, representing 72% of the genome. In addition, we show that many TE families are transcriptionally active, and changes in their expression are associated with insect endosymbiotic state. S. oryzae has undergone a high gene expansion rate, when compared to other beetles. Reconstruction of host-symbiont metabolic networks revealed that, despite its recent association with cereal weevils (30 kyear), S. pierantonius relies on the host for several amino acids and nucleotides to survive and to produce vitamins and essential amino acids required for insect development and cuticle biosynthesis. CONCLUSIONS: Here we present the genome of an agricultural pest beetle, which may act as a foundation for pest control. In addition, S. oryzae may be a useful model for endosymbiosis, and studying TE evolution and regulation, along with the impact of TEs on eukaryotic genomes.
Affiliation(s)
- Nicolas Parisot
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Carlos Vargas-Chávez
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Institute for Integrative Systems Biology (I2SySBio), Universitat de València and Spanish Research Council (CSIC), València, Spain
- Present Address: Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Clément Goubert
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Université Lyon, Villeurbanne, France
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, New York, 14853, USA
- Present Address: Human Genetics, McGill University, Montreal, QC, Canada
- Séverine Balmand
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Louis Beranger
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Caroline Blanc
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Aymeric Bonnamour
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Matthieu Boulesteix
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Université Lyon, Villeurbanne, France
- Nelly Burlet
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Université Lyon, Villeurbanne, France
- Federica Calevro
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Patrick Callaerts
- Department of Human Genetics, Laboratory of Behavioral and Developmental Genetics, KU Leuven, University of Leuven, B-3000, Leuven, Belgium
- Théo Chancy
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Hubert Charles
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- ERABLE European Team, INRIA, Rhône-Alpes, France
- Stefano Colella
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Present Address: LSTM, Laboratoire des Symbioses Tropicales et Méditerranéennes, IRD, CIRAD, INRAE, SupAgro, Univ Montpellier, Montpellier, France
- André Da Silva Barbosa
- INRAE, Sorbonne Université, CNRS, IRD, UPEC, Université de Paris, Institute of Ecology and Environmental Sciences of Paris, Versailles, France
- Elisa Dell'Aglio
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Alex Di Genova
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Université Lyon, Villeurbanne, France
- ERABLE European Team, INRIA, Rhône-Alpes, France
- Instituto de Ciencias de la Ingeniería, Universidad de O'Higgins, Rancagua, Chile
- Gérard Febvay
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Toni Gabaldón
- Life Sciences, Barcelona Supercomputing Centre (BSC-CNS), Barcelona, Spain
- Mechanisms of Disease, Institute for Research in Biomedicine (IRB), Barcelona, Spain
- Institut Catalan de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Alexandra Gerber
- Laboratório de Bioinformática, Laboratório Nacional de Computação Científica, Petrópolis, Brazil
- Benjamin Gillet
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS UMR 5242, Lyon, France
- Sandrine Hughes
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS UMR 5242, Lyon, France
- Emmanuelle Jacquin-Joly
- INRAE, Sorbonne Université, CNRS, IRD, UPEC, Université de Paris, Institute of Ecology and Environmental Sciences of Paris, Versailles, France
- Justin Maire
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Present Address: School of BioSciences, The University of Melbourne, Parkville, VIC, 3010, Australia
- Florent Masson
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Present Address: Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Camille Meslin
- INRAE, Sorbonne Université, CNRS, IRD, UPEC, Université de Paris, Institute of Ecology and Environmental Sciences of Paris, Versailles, France
- Nicolas Montagné
- INRAE, Sorbonne Université, CNRS, IRD, UPEC, Université de Paris, Institute of Ecology and Environmental Sciences of Paris, Versailles, France
- Andrés Moya
- Institute for Integrative Systems Biology (I2SySBio), Universitat de València and Spanish Research Council (CSIC), València, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of Valencian Community (FISABIO), València, Spain
- Gautier Richard
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653, Le Rheu, France
- Jeb Rosen
- Institute for Systems Biology, Seattle, WA, USA
- Marie-France Sagot
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Université Lyon, Villeurbanne, France
- ERABLE European Team, INRIA, Rhône-Alpes, France
- Agnès Vallier
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Aurélien Vigneron
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Present Address: Department of Evolutionary Ecology, Institute for Organismic and Molecular Evolution, Johannes Gutenberg University, 55128, Mainz, Germany
- Anna Zaidman-Rémy
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Waël Zamoum
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France
- Cristina Vieira
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Université Lyon, Villeurbanne, France.
- ERABLE European Team, INRIA, Rhône-Alpes, France.
- Rita Rebollo
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France.
- Amparo Latorre
- Institute for Integrative Systems Biology (I2SySBio), Universitat de València and Spanish Research Council (CSIC), València, Spain.
- Foundation for the Promotion of Sanitary and Biomedical Research of Valencian Community (FISABIO), València, Spain.
- Abdelaziz Heddi
- Univ Lyon, INSA Lyon, INRAE, BF2I, UMR 203, 69621 Villeurbanne, France.
3
Makita Y, Suzuki S, Fushimi K, Shimada S, Suehisa A, Hirata M, Kuriyama T, Kurihara Y, Hamasaki H, Okubo-Kurihara E, Yoshitake K, Watanabe T, Sakuta M, Gojobori T, Sakami T, Narikawa R, Yamaguchi H, Kawachi M, Matsui M. Identification of a dual orange/far-red and blue light photoreceptor from an oceanic green picoplankton. Nat Commun 2021; 12:3593. [PMID: 34135337 PMCID: PMC8209157 DOI: 10.1038/s41467-021-23741-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/05/2020] [Accepted: 05/11/2021] [Indexed: 11/09/2022]
Abstract
Photoreceptors are conserved from green algae to land plants and regulate various developmental stages. In the ocean, blue light penetrates deeper than red light, and blue-light sensing is key to adapting to marine environments. Here, a search for blue-light photoreceptors in the marine metagenome uncovers a chimeric gene composed of a phytochrome and a cryptochrome (Dualchrome1, DUC1) in a prasinophyte, Pycnococcus provasolii. DUC1 detects light within the orange/far-red and blue spectra, and acts as a dual photoreceptor. Analyses of its genome reveal the possible mechanisms of light adaptation. Genes for the light-harvesting complex (LHC) are duplicated and transcriptionally regulated under monochromatic orange/blue light, suggesting P. provasolii has acquired environmental adaptability to a wide range of light spectra and intensities.
Affiliation(s)
- Yuko Makita
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Shigekatsu Suzuki
- Biodiversity Division, National Institute for Environmental Studies, Tsukuba, Japan
- Keiji Fushimi
- Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Research Institute of Green Science and Technology, Shizuoka University, Shizuoka, Japan
- Core Research for Evolutional Science and Technology, Japan Science and Technology Agency, Saitama, Japan
- Setsuko Shimada
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Aya Suehisa
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Manami Hirata
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Tomoko Kuriyama
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Yukio Kurihara
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Hidefumi Hamasaki
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Yokohama City University, Kihara Institute for Biological Research, Yokohama, Japan
- Emiko Okubo-Kurihara
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Kazutoshi Yoshitake
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
- Tsuyoshi Watanabe
- Fisheries Resources Institute, Japan Fisheries Research and Education Agency, Kushiro, Hokkaido, Japan
- Masaaki Sakuta
- Department of Biological Sciences, Ochanomizu University, Tokyo, Japan
- Takashi Gojobori
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Tomoko Sakami
- Fisheries Resources Institute, Japan Fisheries Research and Education Agency, Minami-ise, Mie, Japan
- Rei Narikawa
- Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Research Institute of Green Science and Technology, Shizuoka University, Shizuoka, Japan
- Core Research for Evolutional Science and Technology, Japan Science and Technology Agency, Saitama, Japan
- Department of Biological Sciences, Graduate School of Science, Tokyo Metropolitan University, Tokyo, Japan
- Haruyo Yamaguchi
- Biodiversity Division, National Institute for Environmental Studies, Tsukuba, Japan
- Masanobu Kawachi
- Biodiversity Division, National Institute for Environmental Studies, Tsukuba, Japan
- Minami Matsui
- Synthetic Genomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan.
- Yokohama City University, Kihara Institute for Biological Research, Yokohama, Japan.
4
Allio R, Tilak MK, Scornavacca C, Avenant NL, Kitchener AC, Corre E, Nabholz B, Delsuc F. High-quality carnivoran genomes from roadkill samples enable comparative species delineation in aardwolf and bat-eared fox. eLife 2021; 10:e63167. [PMID: 33599612 PMCID: PMC7963486 DOI: 10.7554/elife.63167] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/16/2020] [Accepted: 02/16/2021] [Indexed: 12/26/2022]
Abstract
In a context of ongoing biodiversity erosion, obtaining genomic resources from wildlife is essential for conservation. The thousands of yearly mammalian roadkill provide a useful source material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus), both having subspecies with similar disjunct distributions in Eastern and Southern Africa. First, we obtained reference genomes with high contiguity and gene completeness by combining Nanopore long reads and Illumina short reads. Then, we showed that the two subspecies of aardwolf might warrant species status (P. cristatus and P. septentrionalis) by comparing their genome-wide genetic differentiation to pairs of well-defined species across Carnivora with a new Genetic Differentiation index (GDI) based on only a few resequenced individuals. Finally, we obtained a genome-scale Carnivora phylogeny including the new aardwolf species.
Affiliation(s)
- Rémi Allio
- Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Marie-Ka Tilak
- Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Celine Scornavacca
- Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Nico L Avenant
- National Museum and Centre for Environmental Management, University of the Free State, Bloemfontein, South Africa
- Andrew C Kitchener
- Department of Natural Sciences, National Museums Scotland, Edinburgh, United Kingdom
- Erwan Corre
- CNRS, Sorbonne Université, CNRS, ABiMS, Station Biologique de Roscoff, Roscoff, France
- Benoit Nabholz
- Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Institut Universitaire de France (IUF), Paris, France
- Frédéric Delsuc
- Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
5
Qin M, Wu S, Li A, Zhao F, Feng H, Ding L, Ruan J. LRScaf: improving draft genomes using long noisy reads. BMC Genomics 2019; 20:955. [PMID: 31818249 PMCID: PMC6902338 DOI: 10.1186/s12864-019-6337-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 03/04/2019] [Accepted: 11/26/2019] [Indexed: 12/15/2022]
Abstract
Background: The advent of third-generation sequencing (TGS) technologies opens the door to improved genome assembly. Long reads are promising for enhancing the quality of fragmented draft assemblies constructed from next-generation sequencing (NGS) technologies. To date, a few algorithms capable of improving draft assemblies have been released, including SSPACE-LongRead, OPERA-LG, SMIS, npScarf, DBG2OLC, Unicycler, and LINKS. Hybrid assembly on large genomes remains challenging, however. Results: We develop a scalable and computationally efficient scaffolder, Long Reads Scaffolder (LRScaf, https://github.com/shingocat/lrscaf), that is capable of significantly boosting assembly contiguity using long reads. In this study, we summarise a comprehensive performance assessment for state-of-the-art scaffolders and LRScaf on seven organisms, i.e., E. coli, S. cerevisiae, A. thaliana, O. sativa, S. pennellii, Z. mays, and H. sapiens. LRScaf significantly improves the contiguity of draft assemblies, e.g., increasing the NGA50 value of CHM1 from 127.1 kbp to 9.4 Mbp using a 20-fold coverage PacBio dataset and the NGA50 value of NA12878 from 115.3 kbp to 12.9 Mbp using a 35-fold coverage Nanopore dataset. In addition, LRScaf generates the best NGA50 contiguity on A. thaliana, S. pennellii, Z. mays, and H. sapiens. Moreover, LRScaf has the shortest run time compared with other scaffolders, and the peak RAM of LRScaf remains practical for large genomes (e.g., 20.3 and 62.6 GB on CHM1 and NA12878, respectively). Conclusions: The new algorithm, LRScaf, yields the best or, at least, moderate scaffold contiguity and accuracy in the shortest run time compared with other scaffolding algorithms. Furthermore, LRScaf provides a cost-effective way to improve contiguity of draft assemblies on large genomes.
Affiliation(s)
- Mao Qin
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China
- Shigang Wu
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China
- Alun Li
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China
- Fengli Zhao
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China
- Hu Feng
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China
- Lulu Ding
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China
- Jue Ruan
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 7, Pengfei Road, Dapeng District, Shenzhen, 518120, Guangdong, China.
6
Linking De Novo Assembly Results with Long DNA Reads Using the dnaasm-link Application. BIOMED RESEARCH INTERNATIONAL 2019; 2019:7847064. [PMID: 31111066 PMCID: PMC6487145 DOI: 10.1155/2019/7847064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/01/2019] [Revised: 03/25/2019] [Accepted: 03/27/2019] [Indexed: 12/14/2022]
Abstract
Third-generation sequencing techniques, which yield much longer DNA reads than next-generation sequencing technologies, are becoming increasingly popular. There are many possibilities for combining data from next-generation and third-generation sequencing. Herein, we present a new application called dnaasm-link for linking contigs, the result of de novo assembly of second-generation sequencing data, with long DNA reads. Our tool includes an integrated module to fill gaps with a suitable fragment of an appropriate long DNA read, which improves the consistency of the resulting DNA sequences. This feature is very important, in particular for complex DNA regions. Our implementation is found to outperform other state-of-the-art tools in terms of speed and memory requirements, which may enable its usage for organisms with a large genome, something which is not possible in existing applications. The presented application has many advantages: (i) it significantly optimizes memory and reduces computation time; (ii) it fills gaps with an appropriate fragment of a specified long DNA read; (iii) it reduces the number of spanned and unspanned gaps in existing genome drafts. The application is freely available to all users under GNU Library or Lesser General Public License version 3.0 (LGPLv3). The demo application, Docker image, and source code can be downloaded from the project homepage.
7
Di Genova A, Ruz GA, Sagot MF, Maass A. Fast-SG: an alignment-free algorithm for hybrid assembly. Gigascience 2018; 7:4993155. [PMID: 29741627 PMCID: PMC6007556 DOI: 10.1093/gigascience/giy048] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/08/2017] [Revised: 03/01/2018] [Accepted: 04/19/2018] [Indexed: 12/01/2022]
Abstract
Background: Long-read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long-read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods that combine short- and long-read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes. Results: Here, we propose a new method, called Fast-SG, that uses a new ultrafast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. Fast-SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short-read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how Fast-SG outperforms the state-of-the-art short-read aligners when building the scaffolding graph and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using Fast-SG with shallow long-read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878). Conclusions: Fast-SG opens a door to achieve accurate hybrid long-range reconstructions of large genomes with low effort, high portability, and low cost.
Affiliation(s)
- Alex Di Genova
- Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibáñez, Santiago, Chile
- Mathomics Bioinformatics Laboratory, Center for Mathematical Modeling, University of Chile, Av. Beauchef 851, 7th floor, Santiago, Chile
- Inria Grenoble Rhône-Alpes, 655, Avenue de l’Europe, 38334 Montbonnot, France
- CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Fondap Center for Genome Regulation, Av. Blanco Encalada 2085, 3rd floor, Santiago, Chile
- Gonzalo A Ruz
- Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibáñez, Santiago, Chile
- Center of Applied Ecology and Sustainability (CAPES), Santiago, Chile
- Marie-France Sagot
- Inria Grenoble Rhône-Alpes, 655, Avenue de l’Europe, 38334 Montbonnot, France
- CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Alejandro Maass
- Mathomics Bioinformatics Laboratory, Center for Mathematical Modeling, University of Chile, Av. Beauchef 851, 7th floor, Santiago, Chile
- Fondap Center for Genome Regulation, Av. Blanco Encalada 2085, 3rd floor, Santiago, Chile
- Department of Mathematical Engineering, University of Chile, Av. Beauchef 851, 5th floor, Santiago, Chile