Reference Citation Analysis

For: Sedlazeck FJ, Lee H, Darby CA, Schatz MC. Piercing the dark matter: bioinformatics of long-range sequencing and mapping. Nat Rev Genet 2018;19:329-346. [PMID: 29599501 DOI: 10.1038/s41576-018-0003-4] [Citation(s) in RCA: 291] [Impact Index Per Article: 58.2] [Indexed: 01/11/2023]

Number

Cited by Other Article(s)

151

Methods to Study Translated Pseudogenes: Recombinant Expression and Complementation, Targeted Proteomics, and RNA Profiling. Methods Mol Biol 2021. [PMID: 34165719 DOI: 10.1007/978-1-0716-1503-4_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 08/19/2024]

152

Tunjić-Cvitanić M, Pasantes JJ, García-Souto D, Cvitanić T, Plohl M, Šatović-Vukšić E. Satellitome Analysis of the Pacific Oyster Crassostrea gigas Reveals New Pattern of Satellite DNA Organization, Highly Scattered across the Genome. Int J Mol Sci 2021;22:6798. [PMID: 34202698 PMCID: PMC8268682 DOI: 10.3390/ijms22136798] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/09/2021] [Revised: 06/18/2021] [Accepted: 06/19/2021] [Indexed: 12/22/2022]

153

Tvedte ES, Gasser M, Sparklin BC, Michalski J, Hjelmen CE, Johnston JS, Zhao X, Bromley R, Tallon LJ, Sadzewicz L, Rasko DA, Dunning Hotopp JC. Comparison of long-read sequencing technologies in interrogating bacteria and fly genomes. G3 (Bethesda) 2021;11:jkab083. [PMID: 33768248 PMCID: PMC8495745 DOI: 10.1093/g3journal/jkab083] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/09/2021] [Accepted: 03/07/2021] [Indexed: 12/14/2022]

154

Suh A, Dion-Côté AM. New Perspectives on the Evolution of Within-Individual Genome Variation and Germline/Soma Distinction. Genome Biol Evol 2021;13:evab095. [PMID: 33963843 PMCID: PMC8245192 DOI: 10.1093/gbe/evab095] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 05/07/2021] [Indexed: 12/19/2022]

155

Guiglielmoni N, Houtain A, Derzelle A, Van Doninck K, Flot JF. Overcoming uncollapsed haplotypes in long-read assemblies of non-model organisms. BMC Bioinformatics 2021;22:303. [PMID: 34090340 PMCID: PMC8178825 DOI: 10.1186/s12859-021-04118-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/18/2021] [Accepted: 04/02/2021] [Indexed: 12/21/2022]

Abstract

Background

Long-read sequencing is revolutionizing genome assembly: as PacBio and Nanopore technologies become technically easier to use and more affordable, long-read assemblers flourish and are starting to deliver chromosome-level assemblies. However, these long reads are usually error-prone, making the generation of a haploid reference out of a diploid genome a difficult enterprise. Failure to properly collapse haplotypes results in fragmented and structurally incorrect assemblies and wreaks havoc on orthology inference pipelines, yet this serious issue is rarely acknowledged or addressed in genomic projects, and an independent, comparative benchmark of the capacity of assemblers and post-processing tools to properly collapse or purge haplotypes is still lacking.

Results

We tested different assembly strategies on the genome of the rotifer Adineta vaga, a non-model organism for which high coverages of both PacBio and Nanopore reads were available. The assemblers we tested (Canu, Flye, NextDenovo, Ra, Raven, Shasta and wtdbg2) exhibited strikingly different behaviors when dealing with highly heterozygous regions, resulting in variable amounts of uncollapsed haplotypes. Filtering reads generally improved haploid assemblies, and we also benchmarked three post-processing tools aimed at detecting and purging uncollapsed haplotypes in long-read assemblies: HaploMerger2, purge_haplotigs and purge_dups.

Conclusions

We provide a thorough evaluation of popular assemblers on a non-model eukaryote genome with variable levels of heterozygosity. Our study highlights several strategies using pre- and post-processing approaches to generate haploid assemblies with high continuity and completeness. This benchmark will help users to improve haploid assemblies of non-model organisms and to evaluate the quality of their own assemblies.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04118-3.

156

Ono Y, Asai K, Hamada M. PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores. Bioinformatics 2021;37:589-595. [PMID: 32976553 PMCID: PMC8097687 DOI: 10.1093/bioinformatics/btaa835] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 06/22/2020] [Revised: 08/20/2020] [Accepted: 09/11/2020] [Indexed: 12/21/2022]

157

Quan C, Li Y, Liu X, Wang Y, Ping J, Lu Y, Zhou G. Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression. Genome Biol 2021;22:159. [PMID: 34034800 PMCID: PMC8146648 DOI: 10.1186/s13059-021-02382-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/23/2020] [Accepted: 05/14/2021] [Indexed: 01/09/2023]

Affiliation(s)

Cheng Quan Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
Yuanfeng Li Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
Xinyi Liu Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
Yahui Wang Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
Jie Ping Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
Yiming Lu Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China; Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
Gangqiao Zhou Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China; Hebei University, Baoding, Hebei Province 071002 People’s Republic of China; Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166 People’s Republic of China; Medical College of Guizhou University, Guiyang, Guizhou Province 550025 People’s Republic of China

158

Hadi K, Yao X, Behr JM, Deshpande A, Xanthopoulakis C, Tian H, Kudman S, Rosiene J, Darmofal M, DeRose J, Mortensen R, Adney EM, Shaiber A, Gajic Z, Sigouros M, Eng K, Wala JA, Wrzeszczyński KO, Arora K, Shah M, Emde AK, Felice V, Frank MO, Darnell RB, Ghandi M, Huang F, Dewhurst S, Maciejowski J, de Lange T, Setton J, Riaz N, Reis-Filho JS, Powell S, Knowles DA, Reznik E, Mishra B, Beroukhim R, Zody MC, Robine N, Oman KM, Sanchez CA, Kuhner MK, Smith LP, Galipeau PC, Paulson TG, Reid BJ, Li X, Wilkes D, Sboner A, Mosquera JM, Elemento O, Imielinski M. Distinct Classes of Complex Structural Variation Uncovered across Thousands of Cancer Genome Graphs. Cell 2020;183:197-210.e32. [PMID: 33007263 DOI: 10.1016/j.cell.2020.08.006] [Citation(s) in RCA: 127] [Impact Index Per Article: 42.3] [Received: 11/09/2019] [Revised: 04/08/2020] [Accepted: 08/03/2020] [Indexed: 12/12/2022]

Affiliation(s)

Kevin Hadi Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA
Xiaotong Yao Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA; Tri-institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Julie M Behr Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA; Tri-institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Aditya Deshpande Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA; Tri-institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Charalampos Xanthopoulakis Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Huasong Tian Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA
Sarah Kudman Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Joel Rosiene Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA
Madison Darmofal Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA; Tri-institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Joseph DeRose New York Genome Center, New York, NY 10013, USA
Rick Mortensen New York Genome Center, New York, NY 10013, USA
Emily M Adney Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA
Alon Shaiber Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Zoran Gajic New York Genome Center, New York, NY 10013, USA
Michael Sigouros Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Kenneth Eng Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
Jeremiah A Wala Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Departments of Medical Oncology and Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; School of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
Kazimierz O Wrzeszczyński New York Genome Center, New York, NY 10013, USA
Kanika Arora New York Genome Center, New York, NY 10013, USA
Minita Shah New York Genome Center, New York, NY 10013, USA
Anne-Katrin Emde New York Genome Center, New York, NY 10013, USA
Vanessa Felice New York Genome Center, New York, NY 10013, USA
Mayu O Frank New York Genome Center, New York, NY 10013, USA; Laboratory of Molecular Neuro-Oncology and Howard Hughes Medical Institute, The Rockefeller University, New York, NY 10065, USA
Robert B Darnell New York Genome Center, New York, NY 10013, USA; Laboratory of Molecular Neuro-Oncology and Howard Hughes Medical Institute, The Rockefeller University, New York, NY 10065, USA
Mahmoud Ghandi Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
Franklin Huang Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; School of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
Sally Dewhurst Laboratory of Cell Biology and Genetics, The Rockefeller University, New York, NY 10065, USA
John Maciejowski Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
Titia de Lange Laboratory of Cell Biology and Genetics, The Rockefeller University, New York, NY 10065, USA
Jeremy Setton Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
Nadeem Riaz Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
Jorge S Reis-Filho Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
Simon Powell Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
David A Knowles New York Genome Center, New York, NY 10013, USA; Department of Computer Science, Columbia University, New York, NY 10027, USA
Ed Reznik Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
Bud Mishra Departments of Computer Science, Mathematics and Cell Biology, Courant Institute and NYU School of Medicine, New York University, New York, NY 10012, USA
Rameen Beroukhim Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Departments of Medical Oncology and Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
Michael C Zody New York Genome Center, New York, NY 10013, USA
Nicolas Robine New York Genome Center, New York, NY 10013, USA
Kenji M Oman Divisions of Human Biology and Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Carissa A Sanchez Divisions of Human Biology and Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Mary K Kuhner Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
Lucian P Smith Department of Bioengineering, University of Washington, Seattle, WA 98195, USA
Patricia C Galipeau Divisions of Human Biology and Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Thomas G Paulson Divisions of Human Biology and Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Brian J Reid Divisions of Human Biology and Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
Xiaohong Li Divisions of Human Biology and Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
David Wilkes Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Andrea Sboner Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
Juan Miguel Mosquera Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Olivier Elemento Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
Marcin Imielinski Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY 10021, USA; New York Genome Center, New York, NY 10013, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA.

159

Savara J, Novosád T, Gajdoš P, Kriegová E. Comparison of structural variants detected by optical mapping with long-read next-generation sequencing. Bioinformatics 2021;37:3398-3404. [PMID: 33983367 DOI: 10.1093/bioinformatics/btab359] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/11/2021] [Revised: 04/21/2021] [Accepted: 05/08/2021] [Indexed: 12/29/2022]

160

Liu Y, Jiang T, Su J, Liu B, Zang T, Wang Y. SKSV: ultrafast structural variation detection from circular consensus sequencing reads. Bioinformatics 2021;37:3647-3649. [PMID: 33963826 DOI: 10.1093/bioinformatics/btab341] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/22/2021] [Revised: 04/29/2021] [Accepted: 05/04/2021] [Indexed: 01/23/2023]

161

Lopes M, Louzada S, Gama-Carvalho M, Chaves R. Genomic Tackling of Human Satellite DNA: Breaking Barriers through Time. Int J Mol Sci 2021;22:4707. [PMID: 33946766 PMCID: PMC8125562 DOI: 10.3390/ijms22094707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 04/24/2021] [Accepted: 04/27/2021] [Indexed: 12/12/2022]

162

Nattestad M, Aboukhalil R, Chin CS, Schatz MC. Ribbon: intuitive visualization for complex genomic variation. Bioinformatics 2021;37:413-415. [PMID: 32766814 DOI: 10.1093/bioinformatics/btaa680] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 01/20/2020] [Revised: 06/15/2020] [Accepted: 07/21/2020] [Indexed: 01/08/2023]

163

Kronenberg ZN, Rhie A, Koren S, Concepcion GT, Peluso P, Munson KM, Porubsky D, Kuhn K, Mueller KA, Low WY, Hiendleder S, Fedrigo O, Liachko I, Hall RJ, Phillippy AM, Eichler EE, Williams JL, Smith TPL, Jarvis ED, Sullivan ST, Kingan SB. Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C. Nat Commun 2021;12:1935. [PMID: 33911078 PMCID: PMC8081726 DOI: 10.1038/s41467-020-20536-y] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 05/13/2020] [Accepted: 11/12/2020] [Indexed: 01/27/2023]

Affiliation(s)

Zev N Kronenberg Phase Genomics, Seattle, WA, USA; Pacific Biosciences, Menlo Park, CA, USA
Arang Rhie Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
Sergey Koren Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
Gregory T Concepcion Pacific Biosciences, Menlo Park, CA, USA
Paul Peluso Pacific Biosciences, Menlo Park, CA, USA
Katherine M Munson Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
David Porubsky Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
Kristen Kuhn US Meat Animal Research Center, ARS USDA, Clay Center, NE, USA
Kathryn A Mueller Phase Genomics, Seattle, WA, USA
Wai Yee Low Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy, SA, Australia
Stefan Hiendleder Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy, SA, Australia
Olivier Fedrigo Vertebrate Genomes Laboratory, The Rockefeller University, New York, NY, USA
Ivan Liachko Phase Genomics, Seattle, WA, USA
Richard J Hall Pacific Biosciences, Menlo Park, CA, USA
Adam M Phillippy Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
Evan E Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
John L Williams Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy, SA, Australia; Dipartimento di Scienze Animali, della Nutrizione e degli Alimenti, Università Cattolica del Sacro Cuore, 29122, Piacenza, Italy
Timothy P L Smith US Meat Animal Research Center, ARS USDA, Clay Center, NE, USA
Erich D Jarvis Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA; Howard Hughes Medical Institute, Chevy Chase, MD, USA
Shawn T Sullivan Phase Genomics, Seattle, WA, USA
Sarah B Kingan Pacific Biosciences, Menlo Park, CA, USA

164

Wahlster L, Verboon JM, Ludwig LS, Black SC, Luo W, Garg K, Voit RA, Collins RL, Garimella K, Costello M, Chao KR, Goodrich JK, DiTroia SP, O'Donnell-Luria A, Talkowski ME, Michelson AD, Cantor AB, Sankaran VG. Familial thrombocytopenia due to a complex structural variant resulting in a WAC-ANKRD26 fusion transcript. J Exp Med 2021;218:211998. [PMID: 33857290 PMCID: PMC8056752 DOI: 10.1084/jem.20210444] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/22/2021] [Revised: 03/09/2021] [Accepted: 03/11/2021] [Indexed: 12/11/2022]

Affiliation(s)

Lara Wahlster Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Jeffrey M Verboon Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Leif S Ludwig Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Susan C Black Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Wendy Luo Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Kopal Garg Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Richard A Voit Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Ryan L Collins Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA; Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA
Kiran Garimella Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Maura Costello Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Katherine R Chao Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Julia K Goodrich Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Stephanie P DiTroia Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Anne O'Donnell-Luria Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA
Michael E Talkowski Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA; Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA
Alan D Michelson Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA
Alan B Cantor Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA
Vijay G Sankaran Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA

165

Garg S. Computational methods for chromosome-scale haplotype reconstruction. Genome Biol 2021;22:101. [PMID: 33845884 PMCID: PMC8040228 DOI: 10.1186/s13059-021-02328-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 01/10/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022]

166

Pauper M, Kucuk E, Wenger AM, Chakraborty S, Baybayan P, Kwint M, van der Sanden B, Nelen MR, Derks R, Brunner HG, Hoischen A, Vissers LELM, Gilissen C. Long-read trio sequencing of individuals with unsolved intellectual disability. Eur J Hum Genet 2021;29:637-648. [PMID: 33257779 PMCID: PMC8115091 DOI: 10.1038/s41431-020-00770-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 07/15/2020] [Accepted: 10/27/2020] [Indexed: 02/06/2023]

Affiliation(s)

Marc Pauper Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
Erdi Kucuk Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
Aaron M Wenger Pacific Biosciences, Menlo Park, CA, USA
Shreyasee Chakraborty Pacific Biosciences, Menlo Park, CA, USA
Primo Baybayan Pacific Biosciences, Menlo Park, CA, USA
Michael Kwint Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
Bart van der Sanden Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 HR, Nijmegen, The Netherlands
Marcel R Nelen Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
Ronny Derks Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
Han G Brunner Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands; Department of Clinical Genetics, Maastricht University Medical Center, Maastricht, The Netherlands
Alexander Hoischen Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands; Department of Internal Medicine, Center for Infectious Diseases (RCI), Radboud University Medical Center, Nijmegen, The Netherlands
Lisenka E L M Vissers Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 HR, Nijmegen, The Netherlands
Christian Gilissen Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands; Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands

167

Di Genova A, Buena-Atienza E, Ossowski S, Sagot MF. Efficient hybrid de novo assembly of human genomes with WENGAN. Nat Biotechnol 2021;39:422-430. [PMID: 33318652 PMCID: PMC8041623 DOI: 10.1038/s41587-020-00747-w] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/04/2019] [Revised: 10/08/2020] [Accepted: 10/21/2020] [Indexed: 12/12/2022]

168

Kovaka S, Fan Y, Ni B, Timp W, Schatz MC. Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED. Nat Biotechnol 2021;39:431-441. [PMID: 33257863 PMCID: PMC8567335 DOI: 10.1038/s41587-020-0731-9] [Citation(s) in RCA: 118] [Impact Index Per Article: 39.3] [Received: 02/07/2020] [Accepted: 10/07/2020] [Indexed: 02/07/2023]

169

Blom MPK. Opportunities and challenges for high-quality biodiversity tissue archives in the age of long-read sequencing. Mol Ecol 2021;30:5935-5948. [PMID: 33786900 DOI: 10.1111/mec.15909] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/31/2020] [Revised: 03/06/2021] [Accepted: 03/22/2021] [Indexed: 12/11/2022]

170

Mc Cartney AM, Mahmoud M, Jochum M, Agustinho DP, Zorman B, Al Khleifat A, Dabbaghie F, Kesharwani RK, Smolka M, Dawood M, Albin D, Aliyev E, Almabrazi H, Arslan A, Balaji A, Behera S, Billingsley K, Cameron DL, Daw J, Dawson ET, De Coster W, Du H, Dunn C, Esteban R, Jolly A, Kalra D, Liao C, Liu Y, Lu TY, Havrilla JM, Khayat MM, Marin M, Monlong J, Price S, Gener AR, Ren J, Sagayaradj S, Sapoval N, Sinner C, Soto DC, Soylev A, Subramaniyan A, Syed N, Tadimeti N, Tater P, Vats P, Vaughn J, Walker K, Wang G, Zeng Q, Zhang S, Zhao T, Kille B, Biederstedt E, Chaisson M, English A, Kronenberg Z, Treangen TJ, Hefferon T, Chin CS, Busby B, Sedlazeck FJ. An international virtual hackathon to build tools for the analysis of structural variants within species ranging from coronaviruses to vertebrates. F1000Res 2021;10:246. [PMID: 34621504 PMCID: PMC8479851 DOI: 10.12688/f1000research.51477.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 08/23/2021] [Indexed: 11/20/2022] Open

171

Mc Cartney AM, Mahmoud M, Jochum M, Agustinho DP, Zorman B, Al Khleifat A, Dabbaghie F, Kesharwani RK, Smolka M, Dawood M, Albin D, Aliyev E, Almabrazi H, Arslan A, Balaji A, Behera S, Billingsley K, Cameron DL, Daw J, Dawson ET, De Coster W, Du H, Dunn C, Esteban R, Jolly A, Kalra D, Liao C, Liu Y, Lu TY, Havrilla JM, Khayat MM, Marin M, Monlong J, Price S, Gener AR, Ren J, Sagayaradj S, Sapoval N, Sinner C, Soto DC, Soylev A, Subramaniyan A, Syed N, Tadimeti N, Tater P, Vats P, Vaughn J, Walker K, Wang G, Zeng Q, Zhang S, Zhao T, Kille B, Biederstedt E, Chaisson M, English A, Kronenberg Z, Treangen TJ, Hefferon T, Chin CS, Busby B, Sedlazeck FJ. An international virtual hackathon to build tools for the analysis of structural variants within species ranging from coronaviruses to vertebrates. F1000Res 2021;10:246. [PMID: 34621504 PMCID: PMC8479851 DOI: 10.12688/f1000research.51477.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/04/2021] [Indexed: 11/08/2023] Open

172

Gusic M, Prokisch H. Genetic basis of mitochondrial diseases. FEBS Lett 2021;595:1132-1158. [PMID: 33655490 DOI: 10.1002/1873-3468.14068] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2021] [Revised: 02/17/2021] [Accepted: 02/18/2021] [Indexed: 12/13/2022]

173

van Belzen IAEM, Schönhuth A, Kemmeren P, Hehir-Kwa JY. Structural variant detection in cancer genomes: computational challenges and perspectives for precision oncology. NPJ Precis Oncol 2021;5:15. [PMID: 33654267 PMCID: PMC7925608 DOI: 10.1038/s41698-021-00155-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2020] [Accepted: 01/12/2021] [Indexed: 01/31/2023] Open

174

Padgitt-Cobb LK, Kingan SB, Wells J, Elser J, Kronmiller B, Moore D, Concepcion G, Peluso P, Rank D, Jaiswal P, Henning J, Hendrix DA. A draft phased assembly of the diploid Cascade hop (Humulus lupulus) genome. Plant Genome 2021;14:e20072. [PMID: 33605092 DOI: 10.1002/tpg2.20072] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/01/2020] [Accepted: 10/03/2020] [Indexed: 05/25/2023]

175

Luo J, Wei Y, Lyu M, Wu Z, Liu X, Luo H, Yan C. A comprehensive review of scaffolding methods in genome assembly. Brief Bioinform 2021;22:6149347. [PMID: 33634311 DOI: 10.1093/bib/bbab033] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2020] [Revised: 01/21/2021] [Accepted: 01/22/2021] [Indexed: 12/20/2022] Open

176

Liu X, Andrews MV, Skinner JP, Johanson TM, Chong MMW. A comparison of alternative mRNA splicing in the CD4 and CD8 T cell lineages. Mol Immunol 2021;133:53-62. [PMID: 33631555 DOI: 10.1016/j.molimm.2021.02.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2020] [Revised: 01/05/2021] [Accepted: 02/08/2021] [Indexed: 12/14/2022]

177

Wang P, Meng F, Moore BM, Shiu SH. Impact of short-read sequencing on the misassembly of a plant genome. BMC Genomics 2021;22:99. [PMID: 33530937 PMCID: PMC7852129 DOI: 10.1186/s12864-021-07397-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Accepted: 01/19/2021] [Indexed: 12/16/2022] Open

178

Biological computation and computational biology: survey, challenges, and discussion. Artif Intell Rev 2021. [DOI: 10.1007/s10462-020-09951-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

179

Wang Q, Liu J, Janssen JM, Le Bouteiller M, Frock RL, Gonçalves MAFV. Precise and broad scope genome editing based on high-specificity Cas9 nickases. Nucleic Acids Res 2021;49:1173-1198. [PMID: 33398349 PMCID: PMC7826261 DOI: 10.1093/nar/gkaa1236] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2020] [Revised: 12/04/2020] [Accepted: 12/08/2020] [Indexed: 12/19/2022] Open

180

Whibley A, Kelley JL, Narum SR. The changing face of genome assemblies: Guidance on achieving high-quality reference genomes. Mol Ecol Resour 2021;21:641-652. [PMID: 33326691 DOI: 10.1111/1755-0998.13312] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2020] [Revised: 12/08/2020] [Accepted: 12/11/2020] [Indexed: 12/20/2022]

Abstract

The quality of genome assemblies has improved rapidly in recent years due to continual advances in sequencing technology, assembly approaches, and quality control. In the field of molecular ecology, this has led to the development of exceptional quality genome assemblies that will be important long-term resources for broader studies into ecological, conservation, evolutionary, and population genomics of naturally occurring species. Moreover, the extent to which a single reference genome represents the diversity within a species varies: pan-genomes will become increasingly important ecological genomics resources, particularly in systems found to have considerable presence-absence variation in their functional content. Here, we highlight advances in technology that have raised the bar for genome assembly and provide guidance on standards to achieve exceptional quality reference genomes. Key recommendations include the following: (a) Genome assemblies should include long-read sequencing except in rare cases where it is effectively impossible to acquire adequately preserved samples needed for high molecular weight DNA standards. (b) At least one scaffolding approach, such as Hi-C or optical mapping, should be included with genome assembly. (c) Genome assemblies should be carefully evaluated; this may involve utilising short read data for genome polishing, error correction, k-mer analyses, and estimating the percent of reads that map back to an assembly. Finally, a genome assembly is most valuable if all data and methods are made publicly available and the utility of a genome for further studies is verified through examples. While these recommendations are based on current technology, we anticipate that future advances will push the field further and the molecular ecology community should continue to adopt new approaches that attain the highest quality genome assemblies.

181

Eschenbrenner CJ, Feurtey A, Stukenbrock EH. Population Genomics of Fungal Plant Pathogens and the Analyses of Rapidly Evolving Genome Compartments. Methods Mol Biol 2021;2090:337-355. [PMID: 31975174 DOI: 10.1007/978-1-0716-0199-0_14] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]

182

Du H, Diao C, Zhao P, Zhou L, Liu JF. Integrated hybrid de novo assembly technologies to obtain high-quality pig genome using short and long reads. Brief Bioinform 2021;22:6082823. [PMID: 33429431 DOI: 10.1093/bib/bbaa399] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2020] [Revised: 11/20/2020] [Accepted: 12/08/2020] [Indexed: 11/12/2022] Open

183

Morisse P, Marchet C, Limasset A, Lecroq T, Lefebvre A. Scalable long read self-correction and assembly polishing with multiple sequence alignment. Sci Rep 2021;11:761. [PMID: 33436980 PMCID: PMC7804095 DOI: 10.1038/s41598-020-80757-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2020] [Accepted: 12/22/2020] [Indexed: 11/09/2022] Open

184

Holley G, Beyter D, Ingimundardottir H, Møller PL, Kristmundsdottir S, Eggertsson HP, Halldorsson BV. Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly. Genome Biol 2021;22:28. [PMID: 33419473 PMCID: PMC7792008 DOI: 10.1186/s13059-020-02244-4] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2020] [Accepted: 12/15/2020] [Indexed: 12/20/2022] Open

185

Peona V, Blom MPK, Xu L, Burri R, Sullivan S, Bunikis I, Liachko I, Haryoko T, Jønsson KA, Zhou Q, Irestedt M, Suh A. Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise. Mol Ecol Resour 2021;21:263-286. [PMID: 32937018 PMCID: PMC7757076 DOI: 10.1111/1755-0998.13252] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2020] [Revised: 08/21/2020] [Accepted: 08/26/2020] [Indexed: 01/09/2023]

Affiliation(s)

Valentina Peona Department of Ecology and Genetics - Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden; Department of Organismal Biology - Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
Mozes P. K. Blom Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden; Museum für Naturkunde, Leibniz Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
Luohao Xu Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
Reto Burri Department of Population Ecology, Institute of Ecology and Evolution, Friedrich-Schiller-University Jena, Jena, Germany
Shawn Sullivan Phase Genomics, Seattle, WA, USA
Ignas Bunikis Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala Genome Center, Uppsala University, Uppsala, Sweden
Ivan Liachko Phase Genomics, Seattle, WA, USA
Tri Haryoko Research Centre for Biology, Museum Zoologicum Bogoriense, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
Knud A. Jønsson Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
Qi Zhou Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria; MOE Laboratory of Biosystems Homeostasis & Protection, Life Sciences Institute, Zhejiang University, Hangzhou, China; Center for Reproductive Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
Martin Irestedt Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Alexander Suh Department of Ecology and Genetics - Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden; Department of Organismal Biology - Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden; School of Biological Sciences - Organisms and the Environment, University of East Anglia, Norwich, UK

186

Zhang H, Jain C, Aluru S. A comprehensive evaluation of long read error correction methods. BMC Genomics 2020;21:889. [PMID: 33349243 PMCID: PMC7751105 DOI: 10.1186/s12864-020-07227-0] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2020] [Accepted: 11/12/2020] [Indexed: 01/07/2023] Open

Abstract

BACKGROUND

Third-generation single molecule sequencing technologies can sequence long reads, which is advancing the frontiers of genomics research. However, their high error rates prohibit accurate and efficient downstream analysis. This difficulty has motivated the development of many long read error correction tools, which tackle this problem through sampling redundancy and/or leveraging accurate short reads of the same biological samples. Existing studies to assess these tools use simulated data sets, and are not sufficiently comprehensive in the range of software covered or diversity of evaluation measures used.

RESULTS

In this paper, we present a categorization and review of long read error correction methods, and provide a comprehensive evaluation of the corresponding long read error correction tools. Leveraging recent real sequencing data, we establish benchmark data sets and set up evaluation criteria for a comparative assessment which includes quality of error correction as well as run-time and memory usage. We study how trimming and long read sequencing depth affect error correction in terms of length distribution and genome coverage post-correction, and the impact of error correction performance on an important application of long reads, genome assembly. We provide guidelines for practitioners for choosing among the available error correction tools and identify directions for future research.

CONCLUSIONS

Despite the high error rate of long reads, the state-of-the-art correction tools can achieve high correction quality. When short reads are available, the best hybrid methods outperform non-hybrid methods in terms of correction quality and computing resource usage. When choosing tools, practitioners should be careful with the few correction tools that discard reads, and should check the effect of error correction tools on downstream analysis. Our evaluation code is available as open-source at https://github.com/haowenz/LRECE.

187

Heller D, Vingron M. SVIM-asm: Structural variant detection from haploid and diploid genome assemblies. Bioinformatics 2020;36:5519-5521. [PMID: 33346817 PMCID: PMC8016491 DOI: 10.1093/bioinformatics/btaa1034] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2020] [Revised: 11/16/2020] [Accepted: 12/12/2020] [Indexed: 12/21/2022] Open

188

Bennett EP, Petersen BL, Johansen IE, Niu Y, Yang Z, Chamberlain CA, Met Ö, Wandall HH, Frödin M. INDEL detection, the 'Achilles heel' of precise genome editing: a survey of methods for accurate profiling of gene editing induced indels. Nucleic Acids Res 2020;48:11958-11981. [PMID: 33170255 PMCID: PMC7708060 DOI: 10.1093/nar/gkaa975] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2019] [Revised: 10/05/2020] [Accepted: 10/15/2020] [Indexed: 12/11/2022] Open

Abstract

Advances in genome editing technologies have enabled manipulation of genomes at the single base level. These technologies are based on programmable nucleases (PNs) that include meganucleases, zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated 9 (Cas9) nucleases and have given researchers the ability to delete, insert or replace genomic DNA in cells, tissues and whole organisms. The great flexibility in re-designing the genomic target specificity of PNs has vastly expanded the scope of gene editing applications in life science, and shows great promise for development of the next generation gene therapies. PN technologies share the principle of inducing a DNA double-strand break (DSB) at a user-specified site in the genome, followed by cellular repair of the induced DSB. PN-elicited DSBs are mainly repaired by the non-homologous end joining (NHEJ) and the microhomology-mediated end joining (MMEJ) pathways, which can elicit a variety of small insertion or deletion (indel) mutations. If indels are elicited in a protein coding sequence and shift the reading frame, targeted gene knock out (KO) can readily be achieved using either of the available PNs. Despite the ease by which gene inactivation in principle can be achieved, in practice, successful KO is not only determined by the efficiency of NHEJ and MMEJ repair; it also depends on the design and properties of the PN utilized, delivery format chosen, the preferred indel repair outcomes at the targeted site, the chromatin state of the target site and the relative activities of the repair pathways in the edited cells. These variables preclude accurate prediction of the nature and frequency of PN induced indels. 
A key step of any gene KO experiment therefore becomes the detection, characterization and quantification of the indel(s) induced at the targeted genomic site in cells, tissues or whole organisms. In this survey, we briefly review naturally occurring indels and their detection. Next, we review the methods that have been developed for detection of PN-induced indels. We briefly outline the experimental steps and describe the pros and cons of the various methods to help users decide a suitable method for their editing application. We highlight recent advances that enable accurate and sensitive quantification of indel events in cells regardless of their genome complexity, turning a complex pool of different indel events into informative indel profiles. Finally, we review what has been learned about PN-elicited indel formation through the use of the new methods and how this insight is helping to further advance the genome editing field.

189

Fatima N, Petri A, Gyllensten U, Feuk L, Ameur A. Evaluation of Single-Molecule Sequencing Technologies for Structural Variant Detection in Two Swedish Human Genomes. Genes (Basel) 2020;11:E1444. [PMID: 33266238 PMCID: PMC7760597 DOI: 10.3390/genes11121444] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Revised: 11/24/2020] [Accepted: 11/26/2020] [Indexed: 01/23/2023] Open

190

Koebley SR, Mikheikin A, Leslie K, Guest D, McConnell-Wells W, Lehman JH, Al Juhaishi T, Zhang X, Roberts CH, Picco L, Toor A, Chesney A, Reed J. Digital Polymerase Chain Reaction Paired with High-Speed Atomic Force Microscopy for Quantitation and Length Analysis of DNA Length Polymorphisms. ACS Nano 2020;14:15385-15393. [PMID: 33169971 DOI: 10.1021/acsnano.0c05897] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

191

Short and long-read ultra-deep sequencing profiles emerging heterogeneity across five platform Escherichia coli strains. Metab Eng 2020;65:197-206. [PMID: 33242648 DOI: 10.1016/j.ymben.2020.11.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2020] [Revised: 10/26/2020] [Accepted: 11/12/2020] [Indexed: 11/24/2022]

Abstract

Reprogramming organisms for large-scale bioproduction counters their evolutionary objectives of fast growth and often leads to mutational collapse of the engineered production pathways during cultivation. Yet, the mutational susceptibility of academic and industrial Escherichia coli bioproduction host strains is poorly understood. In this study, we apply 2nd and 3rd generation deep sequencing to profile simultaneous modes of genetic heterogeneity that decimate engineered biosynthetic production in five popular E. coli hosts BL21(DE3), TOP10, MG1655, W, and W3110 producing 2,3-butanediol and mevalonic acid. Combining short-read and long-read sequencing, we detect strain and sequence-specific mutational modes including single nucleotide polymorphism, inversion, and mobile element transposition, as well as complex structural variations that disrupt the integrity of the engineered biosynthetic pathway. Our analysis suggests that organism engineers should avoid chassis strains hosting active insertion sequence (IS) subfamilies such as IS1 and IS10 present in popular E. coli TOP10. We also recommend monitoring for increased mutagenicity in the pathway transcription initiation regions and recombinogenic repeats. Together, short and long sequencing reads identified latent low-frequency mutation events such as a short detrimental inversion within a pathway gene, driven by 8-bp short inverted repeats. This demonstrates the power of combining ultra-deep DNA sequencing technologies to profile genetic heterogeneities of engineered constructs and explore the markedly different mutational landscapes of common E. coli host strains. The observed multitude of evolving variants underlines the usefulness of early mutational profiling for new synthetic pathways designed to sustain in organisms over long cultivation scales.

192

Murphy WJ, Foley NM, Bredemeyer KR, Gatesy J, Springer MS. Phylogenomics and the Genetic Architecture of the Placental Mammal Radiation. Annu Rev Anim Biosci 2020;9:29-53. [PMID: 33228377 DOI: 10.1146/annurev-animal-061220-023149] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

193

Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions. PLoS Comput Biol 2020;16:e1008397. [PMID: 33226985 PMCID: PMC7721175 DOI: 10.1371/journal.pcbi.1008397] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2020] [Revised: 12/07/2020] [Accepted: 09/24/2020] [Indexed: 11/19/2022] Open

Abstract

Genetic diseases are driven by aberrations of the human genome. Identification of such aberrations including structural variations (SVs) is key to our understanding. Conventional short-reads whole genome sequencing (cWGS) can identify SVs to base-pair resolution, but utilizes only short-range information and suffers from high false discovery rate (FDR). Linked-reads sequencing (10XWGS) utilizes long-range information by linkage of short-reads originating from the same large DNA molecule. This can mitigate alignment-based artefacts especially in repetitive regions and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, we performed a comprehensive analysis of different types and sizes of SVs predicted by both the technologies and validated with an independent PCR based approach. The SVs commonly identified by both the technologies were highly specific, while validation rate dropped for uncommon events. A particularly high FDR was observed for SVs only found by 10XWGS. To improve FDR and sensitivity, statistical models for both the technologies were trained. Using our approach, we characterized SVs from the MCF7 cell line and a primary breast cancer tumor with high precision. This approach improves SV prediction and can therefore help in understanding the underlying genetics in various diseases.

Cancer and many other diseases are often driven by structural rearrangements in the patients. Their precise identification is necessary to understand the evolution of the disease and to find a cure for it. In this study, we have compared two sequencing technologies for the identification of structural variations, i.e., Illumina’s short-reads and 10X Genomics linked-reads sequencing. Short-reads sequencing is already known to have a high false discovery rate for structural variations, while an unbiased performance evaluation of linked-reads sequencing is missing. Hence, we evaluate the performance of these two technologies using computational and PCR based methodologies. Moreover, we also present a statistical approach to increase their performance, supporting better detection of structural variations and thus further research into disease biology.

194

Lee N, Park MJ, Song W, Jeon K, Jeong S. Currently Applied Molecular Assays for Identifying ESR1 Mutations in Patients with Advanced Breast Cancer. Int J Mol Sci 2020;21:ijms21228807. [PMID: 33233830 PMCID: PMC7699999 DOI: 10.3390/ijms21228807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2020] [Revised: 11/17/2020] [Accepted: 11/19/2020] [Indexed: 12/11/2022] Open

195

Rubin MA, Bristow RG, Thienger PD, Dive C, Imielinski M. Impact of Lineage Plasticity to and from a Neuroendocrine Phenotype on Progression and Response in Prostate and Lung Cancers. Mol Cell 2020;80:562-577. [PMID: 33217316 PMCID: PMC8399907 DOI: 10.1016/j.molcel.2020.10.033] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2020] [Revised: 09/06/2020] [Accepted: 10/22/2020] [Indexed: 02/07/2023]

196

Benaud N, Edwards RJ, Amos TG, D'Agostino PM, Gutiérrez-Chávez C, Montgomery K, Nicetic I, Ferrari BC. Antarctic desert soil bacteria exhibit high novel natural product potential, evaluated through long-read genome sequencing and comparative genomics. Environ Microbiol 2020;23:3646-3664. [PMID: 33140504 DOI: 10.1111/1462-2920.15300] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2020] [Accepted: 10/29/2020] [Indexed: 11/30/2022]

197

Kadota M, Nishimura O, Miura H, Tanaka K, Hiratani I, Kuraku S. Multifaceted Hi-C benchmarking: what makes a difference in chromosome-scale genome scaffolding? Gigascience 2020;9:5695848. [PMID: 31919520 PMCID: PMC6952475 DOI: 10.1093/gigascience/giz158] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2019] [Revised: 10/23/2019] [Accepted: 12/02/2019] [Indexed: 12/28/2022] Open

198

Logsdon GA, Vollger MR, Eichler EE. Long-read human genome sequencing and its applications. Nat Rev Genet 2020;21:597-614. [PMID: 32504078 PMCID: PMC7877196 DOI: 10.1038/s41576-020-0236-x] [Citation(s) in RCA: 457] [Impact Index Per Article: 114.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/31/2020] [Indexed: 12/27/2022]

199

Implications of germline copy-number variations in psychiatric disorders: review of large-scale genetic studies. J Hum Genet 2020;66:25-37. [PMID: 32958875 DOI: 10.1038/s10038-020-00838-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2020] [Revised: 08/28/2020] [Accepted: 09/01/2020] [Indexed: 02/07/2023]

200

Penouilh-Suzette C, Fourré S, Besnard G, Godiard L, Pecrix Y. A simple method for high molecular-weight genomic DNA extraction suitable for long-read sequencing from spores of an obligate biotroph oomycete. J Microbiol Methods 2020;178:106054. [PMID: 32926900 DOI: 10.1016/j.mimet.2020.106054] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2020] [Revised: 08/09/2020] [Accepted: 09/07/2020] [Indexed: 10/23/2022]

Abstract

Long-read sequencing technologies are having a major impact on our approaches to studying non-model organisms and microbial communities. By significantly reducing the cost and facilitating the genome assembly pipelines, any laboratory can now develop its own genomics program regardless of the complexity of the genome studied. The most crucial current challenge is to develop efficient protocols for extracting genomic DNA (gDNA) with high quality and integrity adapted to the organism of interest. This can be particularly complex for obligate pathogens that must maintain intimate interactions inside infected host tissues. Here we propose a simple and cost-effective method for high molecular weight gDNA extraction from spores of Plasmopara halstedii, an obligate biotroph oomycete pathogen responsible for downy mildew in sunflower. We optimized the yield, the quality and the integrity of the extracted gDNA by fine-tuning three critical parameters: the grinding, the lysis temperature and the lysis duration. We obtained gDNA with a fragment size distribution reaching a peak ranging from 79 to 145 kb. More than half of the extracted gDNA consisted of DNA fragments larger than 42 kb, with 23% of fragments larger than 100 kb. We then demonstrated the relevance of this protocol for long-read sequencing using PacBio RSII technology. With this protocol, we were able to obtain a mean read length of 9.3 kb, a max read length of 71 kb and an N50 of 13.3 kb. The development of such DNA extraction protocols is an essential prerequisite for fully exploiting technologies requiring high molecular weight gDNA (e.g. long-read sequencing or optical mapping). These technological advances will help generate data to answer questions such as the role of newly duplicated gene clusters, repeated regions and genomic structural variations, or to define the number of chromosomes, which remains undefined in many species of pathogenic fungi and oomycetes.
