401
Chiu R, Rajan-Babu IS, Friedman JM, Birol I. Straglr: discovering and genotyping tandem repeat expansions using whole genome long-read sequences. Genome Biol 2021; 22:224. [PMID: 34389037 PMCID: PMC8361843 DOI: 10.1186/s13059-021-02447-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 02/03/2021] [Accepted: 07/26/2021] [Indexed: 12/11/2022]
Abstract
Tandem repeat (TR) expansion is the underlying cause of over 40 neurological disorders. Long-read sequencing offers an exciting avenue over conventional technologies for detecting TR expansions. Here, we present Straglr, a robust software tool for both targeted genotyping and novel expansion detection from long-read alignments. We benchmark Straglr using various simulations, targeted genotyping data of cell lines carrying expansions of known diseases, and whole genome sequencing data with chromosome-scale assembly. Our results suggest that Straglr may be useful for investigating disease-associated TR expansions using long-read sequencing.
Affiliation(s)
- Readman Chiu
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 4S6, Canada
- Indhu-Shree Rajan-Babu
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical and Molecular Genetics, King's College London, Strand, London, WC2R 2LS, UK
- Jan M Friedman
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 4S6, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
402
Tedersoo L, Albertsen M, Anslan S, Callahan B. Perspectives and Benefits of High-Throughput Long-Read Sequencing in Microbial Ecology. Appl Environ Microbiol 2021; 87:e0062621. [PMID: 34132589 PMCID: PMC8357291 DOI: 10.1128/aem.00626-21] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Indexed: 12/11/2022]
Abstract
Short-read, high-throughput sequencing (HTS) methods have yielded numerous important insights into microbial ecology and function. Yet, in many instances short-read HTS techniques are suboptimal, for example, by providing insufficient phylogenetic resolution or low integrity of assembled genomes. Single-molecule and synthetic long-read (SLR) HTS methods have successfully ameliorated these limitations. In addition, nanopore sequencing has generated a number of unique analysis opportunities, such as rapid molecular diagnostics and direct RNA sequencing, and both Pacific Biosciences (PacBio) and nanopore sequencing support detection of epigenetic modifications. Although initially suffering from relatively low sequence quality, recent advances have greatly improved the accuracy of long-read sequencing technologies. In spite of great technological progress in recent years, the long-read HTS methods (PacBio and nanopore sequencing) are still relatively costly, require large amounts of high-quality starting material, and commonly need specific solutions in various analysis steps. Despite these challenges, long-read sequencing technologies offer high-quality, cutting-edge alternatives for testing hypotheses about microbiome structure and functioning as well as assembly of eukaryote genomes from complex environmental DNA samples.
Affiliation(s)
- Leho Tedersoo
- Mycology and Microbiology Center, University of Tartu, Tartu, Estonia
- Mads Albertsen
- Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Sten Anslan
- Mycology and Microbiology Center, University of Tartu, Tartu, Estonia
- Braunschweig University of Technology, Zoological Institute, Braunschweig, Germany
- Benjamin Callahan
- Department of Population Health and Pathobiology, College of Veterinary Medicine and Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
403
Wang T, Loo CE, Kohli RM. Enzymatic approaches for profiling cytosine methylation and hydroxymethylation. Mol Metab 2021; 57:101314. [PMID: 34375743 PMCID: PMC8829811 DOI: 10.1016/j.molmet.2021.101314] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/27/2021] [Revised: 07/26/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022]
Abstract
Background: In mammals, modifications to cytosine bases, particularly in cytosine-guanine (CpG) dinucleotide contexts, play a major role in shaping the epigenome. The canonical epigenetic mark is 5-methylcytosine (5mC), but oxidized versions of 5mC, including 5-hydroxymethylcytosine (5hmC), are now known to be important players in epigenomic dynamics. Understanding the functional role of these modifications in gene regulation, normal development, and pathological conditions requires the ability to localize these modifications in genomic DNA. The classical approach for sequencing cytosine modifications has involved differential deamination via the chemical sodium bisulfite; however, bisulfite is destructive, limiting its utility in important biological or clinical settings where detection of low frequency populations is critical. Additionally, bisulfite fails to resolve 5mC from 5hmC.
Scope of review: To summarize how enzymatic rather than chemical approaches can be leveraged to localize and resolve different cytosine modifications in a non-destructive manner.
Major conclusions: Nature offers a suite of enzymes with biological roles in cytosine modification in organisms spanning from bacteriophages to mammals. These enzymatic activities include methylation by DNA methyltransferases, oxidation of 5mC by TET family enzymes, hypermodification of 5hmC by glucosyltransferases, and the generation of transition mutations from cytosine to uracil by DNA deaminases. Here, we describe how insights into the natural reactivities of these DNA-modifying enzymes can be leveraged to convert them into powerful biotechnological tools. Application of these enzymes in sequencing can be accomplished by relying on their natural activity, exploiting their ability to discriminate between cytosine modification states, reacting them with functionalized substrate analogs to introduce chemical handles, or engineering the DNA-modifying enzymes to take on new reactivities. We describe how these enzymatic reactions have been combined and permuted to localize DNA modifications with high specificity and without the destructive limitations posed by chemical methods for epigenetic sequencing.
- Chemical sequencing methods damage DNA and can confound cytosine modifications.
- DNA modifying enzymes offer non-destructive and selective biotechnological tools.
- DNA deaminases, methyltransferases, oxygenases and glucosyltransferases can be used.
- Permuting enzymes with various activities can reveal distinct cytosine states.
- Engineered enzymes utilizing unnatural co-substrates expand sequencing scope.
Affiliation(s)
- Tong Wang
- Graduate Group in Biochemistry and Molecular Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Christian E Loo
- Graduate Group in Biochemistry and Molecular Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Rahul M Kohli
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
404
Traylor M, Malik R, Gesierich B, Dichgans M. The BS variant of C4 protects against age-related loss of white matter microstructural integrity. Brain 2021; 145:295-304. [PMID: 34358307 DOI: 10.1093/brain/awab261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2021] [Revised: 05/12/2021] [Accepted: 06/14/2021] [Indexed: 11/13/2022]
Abstract
Age-related loss of white matter microstructural integrity is a major determinant of cognitive decline, dementia, and gait disorders. However, the mechanisms and molecular pathways that contribute to this loss of integrity remain elusive. We performed a GWAS of white matter microstructural integrity as quantified by diffusion MRI metrics (mean diffusivity, MD; and fractional anisotropy, FA) in up to 31,128 individuals from UK Biobank (age 45-81 years) based on a 2 degrees of freedom (2df) test of single nucleotide polymorphism (SNP) and SNP × age effects. We identified 18 loci that were associated at genome-wide significance with either MD (N = 16) or FA (N = 6). Among the top loci was a region on chromosome 6 encoding the human major histocompatibility complex (MHC). Variants in the MHC region were strongly associated with both MD (best SNP: 6:28866209_TTTTG_T, beta(SE) = -0.069(0.009); 2df p = 6.5 × 10⁻¹⁵) and FA (best SNP: rs3129787, beta(SE) = -0.056(0.008); 2df p = 3.5 × 10⁻¹²). Of the imputed HLA alleles and complement component 4 (C4) structural haplotype variants in the human MHC, the strongest association was with the C4-BS variant (for MD: beta(SE) = -0.070(0.010); p = 2.7 × 10⁻¹¹; for FA: beta(SE) = -0.054(0.011); p = 1.6 × 10⁻⁷). After conditioning on C4-BS, no associations with HLA alleles remained significant. The protective influence of C4-BS was stronger in older subjects (age ≥ 65; interaction p = 0.0019 (MD), p = 0.015 (FA)) and in subjects without a history of smoking (interaction p = 0.00093 (MD), p = 0.021 (FA)). Taken together, our findings demonstrate a role of the complement system and of gene-environment interactions in age-related loss of white matter microstructural integrity.
Affiliation(s)
- Matthew Traylor
- Clinical Pharmacology, William Harvey Research Institute, Queen Mary University of London, London, UK; The Barts Heart Centre and NIHR Barts Biomedical Research Centre-Barts Health NHS Trust, The William Harvey Research Institute, Queen Mary University London, London, UK
- Rainer Malik
- Institute for Stroke and Dementia Research (ISD), University Hospital, LMU Munich, Munich, Germany
- Benno Gesierich
- Institute for Stroke and Dementia Research (ISD), University Hospital, LMU Munich, Munich, Germany
- Martin Dichgans
- Institute for Stroke and Dementia Research (ISD), University Hospital, LMU Munich, Munich, Germany; German Centre for Neurodegenerative Diseases (DZNE, Munich), Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
405
Miller DE, Sulovari A, Wang T, Loucks H, Hoekzema K, Munson KM, Lewis AP, Fuerte EPA, Paschal CR, Walsh T, Thies J, Bennett JT, Glass I, Dipple KM, Patterson K, Bonkowski ES, Nelson Z, Squire A, Sikes M, Beckman E, Bennett RL, Earl D, Lee W, Allikmets R, Perlman SJ, Chow P, Hing AV, Wenger TL, Adam MP, Sun A, Lam C, Chang I, Zou X, Austin SL, Huggins E, Safi A, Iyengar AK, Reddy TE, Majoros WH, Allen AS, Crawford GE, Kishnani PS, King MC, Cherry T, Chong JX, Bamshad MJ, Nickerson DA, Mefford HC, Doherty D, Eichler EE. Targeted long-read sequencing identifies missing disease-causing variation. Am J Hum Genet 2021; 108:1436-1449. [PMID: 34216551 PMCID: PMC8387463 DOI: 10.1016/j.ajhg.2021.06.006] [Citation(s) in RCA: 106] [Impact Index Per Article: 35.3] [Received: 04/06/2021] [Accepted: 06/07/2021] [Indexed: 12/28/2022]
Abstract
Despite widespread clinical genetic testing, many individuals with suspected genetic conditions lack a precise diagnosis, limiting their opportunity to take advantage of state-of-the-art treatments. In some cases, testing reveals difficult-to-evaluate structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. We performed targeted long-read sequencing (T-LRS) using adaptive sampling on the Oxford Nanopore platform on 40 individuals, 10 of whom lacked a complete molecular diagnosis. We computationally targeted up to 151 Mbp of sequence per individual and searched for pathogenic substitutions, structural variants, and methylation differences using a single data source. We detected all genomic aberrations (including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences) identified by prior clinical testing. In 8/8 individuals with complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, leading to changes in clinical management in one case. In ten individuals with suspected Mendelian conditions lacking a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in six and variants of uncertain significance in two others. T-LRS accurately identifies pathogenic structural variants, resolves complex rearrangements, and identifies Mendelian variants not detected by other technologies. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority genes and regions or complex clinical testing results.
Affiliation(s)
- Danny E Miller
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Tianyun Wang
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Hailey Loucks
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Edith P Almanza Fuerte
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Catherine R Paschal
- Department of Laboratories, Seattle Children's Hospital, Seattle, WA 98105, USA; Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA
- Tom Walsh
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Jenny Thies
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- James T Bennett
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Department of Laboratories, Seattle Children's Hospital, Seattle, WA 98105, USA; Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA 98101, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
- Ian Glass
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Katrina M Dipple
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA; Center for Clinical and Translational Research, Seattle Children's Research Institute, Seattle, WA 98101, USA
- Karynne Patterson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Emily S Bonkowski
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Zoe Nelson
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Audrey Squire
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Megan Sikes
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Erika Beckman
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Robin L Bennett
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Dawn Earl
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Winston Lee
- Department of Genetics and Development, Columbia University, New York, NY 10032, USA; Department of Ophthalmology, Columbia University, New York, NY 10032, USA
- Rando Allikmets
- Department of Ophthalmology, Columbia University, New York, NY 10032, USA; Department of Pathology and Cell Biology, Columbia University, New York, NY 10032, USA
- Seth J Perlman
- Department of Neurology, Seattle Children's Hospital, University of Washington, Seattle, WA 98105, USA
- Penny Chow
- Department of Pediatrics, Division of Craniofacial Medicine, University of Washington, Seattle, WA 98195, USA
- Anne V Hing
- Department of Pediatrics, Division of Craniofacial Medicine, University of Washington, Seattle, WA 98195, USA
- Tara L Wenger
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Margaret P Adam
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Angela Sun
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Center for Clinical and Translational Research, Seattle Children's Research Institute, Seattle, WA 98101, USA
- Christina Lam
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA; Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, WA 98101, USA
- Irene Chang
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Xue Zou
- Program in Computational Biology & Bioinformatics, Duke University, Durham, NC 27710, USA
- Stephanie L Austin
- Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, NC 27708, USA
- Erin Huggins
- Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, NC 27708, USA
- Alexias Safi
- Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, NC 27708, USA
- Apoorva K Iyengar
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA; University Program in Genetics and Genomics, Duke University, Durham, NC 27708, USA
- Timothy E Reddy
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- William H Majoros
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Andrew S Allen
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Gregory E Crawford
- Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, NC 27708, USA
- Priya S Kishnani
- Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, NC 27708, USA
- Mary-Claire King
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Tim Cherry
- Center for Developmental Biology and Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA 98101, USA
- Jessica X Chong
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
- Michael J Bamshad
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
- Deborah A Nickerson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
- Heather C Mefford
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
- Dan Doherty
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA; Department of Pediatrics, Division of Developmental Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA 98105, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
406
Mochida K, Tamaki S. Transcription Factor-Based Genetic Engineering in Microalgae. Plants 2021; 10:1602. [PMID: 34451646 PMCID: PMC8399792 DOI: 10.3390/plants10081602] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/25/2021] [Revised: 07/16/2021] [Accepted: 07/30/2021] [Indexed: 11/16/2022]
Abstract
Sequence-specific DNA-binding transcription factors (TFs) are key components of gene regulatory networks. Advances in high-throughput sequencing have facilitated the rapid acquisition of whole genome assembly and TF repertoires in microalgal species. In this review, we summarize recent advances in gene discovery and functional analyses, especially for transcription factors in microalgal species. Specifically, we provide examples of the genome-scale identification of transcription factors in genome-sequenced microalgal species and showcase their application in the discovery of regulators involved in various cellular functions. Herein, we highlight TF-based genetic engineering as a promising framework for designing microalgal strains for microalgal-based bioproduction.
Affiliation(s)
- Keiichi Mochida
- RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama 230-0045, Japan
- Kihara Institute for Biological Research, Yokohama City University, Totsuka-ku, Yokohama 244-0813, Japan
- RIKEN Baton Zone Program, Tsurumi-ku, Yokohama 230-0045, Japan
- School of Information and Data Sciences, Nagasaki University, Bunkyo-machi, Nagasaki 852-8521, Japan
- Correspondence: Tel.: +81-045-503-9111
- Shun Tamaki
- RIKEN Baton Zone Program, Tsurumi-ku, Yokohama 230-0045, Japan
407
Wang C, Lv H, Ling X, Li H, Diao F, Dai J, Du J, Chen T, Xi Q, Zhao Y, Zhou K, Xu B, Han X, Liu X, Peng M, Chen C, Tao S, Huang L, Liu C, Wen M, Jiang Y, Jiang T, Lu C, Wu W, Wu D, Chen M, Lin Y, Guo X, Huo R, Liu J, Ma H, Jin G, Xia Y, Sha J, Shen H, Hu Z. Association of assisted reproductive technology, germline de novo mutations and congenital heart defects in a prospective birth cohort study. Cell Res 2021; 31:919-928. [PMID: 34108666 PMCID: PMC8324888 DOI: 10.1038/s41422-021-00521-w] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/14/2021] [Accepted: 05/17/2021] [Indexed: 01/05/2023]
Abstract
Emerging evidence suggests that children conceived through assisted reproductive technology (ART) have a higher risk of congenital heart defects (CHDs) even when there is no family history. De novo mutation (DNM) is a well-known cause of sporadic congenital diseases; however, whether ART procedures increase the number of germline DNMs (gDNMs) has not yet been well studied. Here, we performed whole-genome sequencing of 1137 individuals from 160 families conceived through ART and 205 families conceived spontaneously. Children conceived via ART carried 4.59 more gDNMs than children conceived spontaneously, including 3.32 paternal and 1.26 maternal DNMs, after correcting for parental age at conception, cigarette smoking, alcohol drinking, and exercise behaviors. Paternal DNMs in offspring conceived via ART are characterized by C>T substitutions at CpG sites, which potentially affect protein-coding genes and are significantly associated with the increased risk of CHD. In addition, the accumulation of non-coding functional mutations was independently associated with CHD, and 87.9% of the mutations originated from the father. Among ART offspring, infertility of the father was associated with elevated paternal DNMs; usage of both recombinant and urinary follicle-stimulating hormone and high-dosage human chorionic gonadotropin trigger was associated with an increase of maternal DNMs. In sum, the increased gDNMs in offspring conceived by ART originated primarily from fathers, indicating that ART itself may not be a major reason for the accumulation of gDNMs. Our findings emphasize the importance of evaluating the germline status of the fathers in families with the use of ART.
Affiliation(s)
- Cheng Wang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu, China
- Hong Lv
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Xiufeng Ling
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Reproduction, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Hospital, Nanjing, Jiangsu, China
- Hong Li
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Reproductive Genetic Center, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Feiyang Diao
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Clinical Center of Reproductive Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing Medical University, Nanjing, Jiangsu, China
- Juncheng Dai
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Jiangbo Du
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Ting Chen
- Scientific Education Section, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Hospital, Nanjing, Jiangsu, China
- Qi Xi
- Department of Obstetrics, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Yang Zhao
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Kun Zhou
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Bo Xu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Xiumei Han
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Xiaoyu Liu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Meijuan Peng
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Congcong Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Shiyao Tao
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Lei Huang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Cong Liu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Mingyang Wen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Yangqian Jiang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Tao Jiang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Chuncheng Lu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Wei Wu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Di Wu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Minjian Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Yuan Lin
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Department of Maternal, Child and Adolescent Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Xuejiang Guo
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Ran Huo
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Jiayin Liu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Clinical Center of Reproductive Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Hongxia Ma
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Guangfu Jin
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Yankai Xia
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Jiahao Sha
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Hongbing Shen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China.
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China.
- State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China.
| | - Zhibin Hu
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China.
- Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China.
- State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China.
| |
|
408
|
Course MM, Sulovari A, Gudsnuk K, Eichler EE, Valdmanis PN. Characterizing nucleotide variation and expansion dynamics in human-specific variable number tandem repeats. Genome Res 2021; 31:1313-1324. [PMID: 34244228 PMCID: PMC8327921 DOI: 10.1101/gr.275560.121] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/25/2021] [Accepted: 06/25/2021] [Indexed: 12/14/2022]
Abstract
There are more than 55,000 variable number tandem repeats (VNTRs) in the human genome, notable for both their striking polymorphism and mutability. Despite their role in human evolution and genomic variation, they have yet to be studied collectively and in detail, partially owing to their large size, variability, and predominant location in noncoding regions. Here, we examine 467 VNTRs that are human-specific expansions, unique to one location in the genome, and not associated with retrotransposons. We leverage publicly available long-read genomes, including from the Human Genome Structural Variant Consortium, to ascertain the exact nucleotide composition of these VNTRs and compare their composition of alleles. We then confirm repeat unit composition in more than 3000 short-read samples from the 1000 Genomes Project. Our analysis reveals that these VNTRs contain highly structured repeat motif organization, modified by frequent deletion and duplication events. Although overall VNTR compositions tend to remain similar between 1000 Genomes Project superpopulations, we describe a notable exception with substantial differences in repeat composition (in PCBP3), as well as several VNTRs that are significantly different in length between superpopulations (in ART1, PROP1, DYNC2I1, and LOC102723906). We also observe that most of these VNTRs are expanded in archaic human genomes, yet remain stable in length between single generations. Collectively, our findings indicate that repeat motif variability, repeat composition, and repeat length are all informative modalities to consider when characterizing VNTRs and their contribution to genomic variation.
Affiliation(s)
- Meredith M Course
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Arvis Sulovari
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Kathryn Gudsnuk
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
| | - Paul N Valdmanis
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| |
|
409
|
Sakamoto Y, Zaha S, Suzuki Y, Seki M, Suzuki A. Application of long-read sequencing to the detection of structural variants in human cancer genomes. Comput Struct Biotechnol J 2021; 19:4207-4216. [PMID: 34527193 PMCID: PMC8350331 DOI: 10.1016/j.csbj.2021.07.030] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/06/2021] [Revised: 07/20/2021] [Accepted: 07/25/2021] [Indexed: 01/02/2023] Open
Abstract
In recent years, the so-called long-read sequencing technology has had a substantial impact on various aspects of genome sciences. Here, we introduce recent studies of cancerous structural variants (SVs) using long-read sequencing technologies, namely Pacific Biosciences (PacBio) sequencers, Oxford Nanopore Technologies (ONT) sequencers, and linked-read methods. By taking advantage of long-read lengths, these technologies have enabled the precise detection of SVs, including long insertions by transposable elements, such as LINE-1. In addition to SV detection, the epigenome status (including DNA methylation and haplotype information) surrounding SV loci has also been unveiled by long-read sequencing technologies, to identify the effects of SVs. Among the various research fields in which long-read sequencing has been applied, cancer genomics has shown the most remarkable advances. In fact, many studies are beginning to shed light on the detection of SVs and the elucidation of their complex structures in various types of cancer. In the particular case of cancers, we summarize the technical limitations of the application of this technology to the analysis of clinical samples. We will introduce recent achievements from this viewpoint. However, a similar approach will be started for other applications in the near future. Therefore, by complementing the current short-read sequencing analysis, long-read sequencing should reveal the complex nature of human genomes in their healthy and disease states, which will open a new opportunity for a better understanding of disease development and for a novel strategy for drug development.
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Suzuko Zaha
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| |
|
410
|
Wang L, Liu W, Tang JW, Wang JJ, Liu QH, Wen PB, Wang MM, Pan YC, Gu B, Zhang X. Applications of Raman Spectroscopy in Bacterial Infections: Principles, Advantages, and Shortcomings. Front Microbiol 2021; 12:683580. [PMID: 34349740 PMCID: PMC8327204 DOI: 10.3389/fmicb.2021.683580] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 03/21/2021] [Accepted: 06/17/2021] [Indexed: 12/13/2022] Open
Abstract
Infectious diseases caused by bacterial pathogens are important public issues. In addition, due to the overuse of antibiotics, many multidrug-resistant bacterial pathogens have been widely encountered in clinical settings. Thus, the fast identification of bacterial pathogens and profiling of antibiotic resistance could greatly facilitate the precise treatment of infectious diseases. So far, many conventional and molecular methods, both manual and automated, have been developed for in vitro diagnostics and have been proven to be accurate, reliable, and time efficient. Although Raman spectroscopy (RS) is an established technique in various fields such as geochemistry and material science, it is still considered an emerging tool in the research and diagnosis of infectious diseases. Based on current studies, it is too early to claim that RS may provide practical guidelines for microbiologists and clinicians because there is still a gap between basic research and clinical implementation. However, due to the promising prospects of label-free detection and noninvasive identification of bacterial infections and antibiotic resistance in several single steps, it is necessary to have an overview of the technique in terms of its strong points and shortcomings. Thus, in this review, we went through recent studies of RS in the field of infectious diseases, highlighting the application potential of the technique and also the current challenges that prevent its real-world application.
Affiliation(s)
- Liang Wang
- Institute Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, China
| | - Wei Liu
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Jia-Wei Tang
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Jun-Jiao Wang
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Qing-Hua Liu
- State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Taipa, China
| | - Peng-Bo Wen
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Meng-Meng Wang
- Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, School of Pharmacy, Xuzhou Medical University, Xuzhou, China
| | - Ya-Cheng Pan
- School of Life Sciences, Xuzhou Medical University, Xuzhou, China
| | - Bing Gu
- Laboratory Medicine, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Xiao Zhang
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| |
|
411
|
Jain R, Habermann BH, Mignot T. Complete Genome Assembly of Myxococcus xanthus Strain DZ2 Using Long High-Fidelity (HiFi) Reads Generated with PacBio Technology. Microbiol Resour Announc 2021; 10:e0053021. [PMID: 34264106 PMCID: PMC8281077 DOI: 10.1128/mra.00530-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/27/2021] [Accepted: 06/18/2021] [Indexed: 01/28/2023] Open
Abstract
Myxococcus xanthus is a Gram-negative social bacterium belonging to the order Myxococcales of the class Deltaproteobacteria. It is a facultative social predator found in soils across the globe and is thought to be crucial for the microbial ecosystem. Here, we report a complete high-quality reference genome of the M. xanthus strain DZ2.
Affiliation(s)
- Rikesh Jain
- Aix-Marseille Université, CNRS, Institut de Microbiologie de la Méditerranée (UMR 7283), Turing Center for Living Systems, Marseille, France
- Aix-Marseille Université, CNRS, Institut de Biologie du Développement de Marseille (UMR 7288), Turing Center for Living Systems, Marseille, France
| | - Bianca H. Habermann
- Aix-Marseille Université, CNRS, Institut de Biologie du Développement de Marseille (UMR 7288), Turing Center for Living Systems, Marseille, France
| | - Tâm Mignot
- Aix-Marseille Université, CNRS, Institut de Microbiologie de la Méditerranée (UMR 7283), Turing Center for Living Systems, Marseille, France
| |
|
412
|
Jones A, Torkel C, Stanley D, Nasim J, Borevitz J, Schwessinger B. High-molecular weight DNA extraction, clean-up and size selection for long-read sequencing. PLoS One 2021; 16:e0253830. [PMID: 34264958 PMCID: PMC8282028 DOI: 10.1371/journal.pone.0253830] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/01/2021] [Accepted: 06/09/2021] [Indexed: 12/01/2022] Open
Abstract
Rapid advancements in long-read sequencing technologies have transformed read lengths from bp to Mbp, which has enabled chromosome-scale genome assemblies. However, read lengths are now becoming limited by the extraction of pure high-molecular weight DNA suitable for long-read sequencing, which is particularly challenging in plants and fungi. To overcome this, we present a protocol collection: high-molecular weight DNA extraction, clean-up and size selection for long-read sequencing. We optimised a gentle magnetic-bead-based high-molecular weight DNA extraction, which is presented here in detail. The protocol circumvents spin columns and high-speed centrifugation to limit DNA fragmentation. The protocol is scalable based on tissue input and can be used on many species of plants, fungi, reptiles and bacteria. It is also cost effective compared to kit-based protocols and hence applicable at scale in low resource settings. An optional sorbitol wash is listed and is highly recommended for plant and fungal tissues. To further remove any remaining contaminants such as phenols and polysaccharides, optional DNA clean-up and size selection strategies are given. This protocol collection is suitable for all common long-read sequencing platforms, such as technologies offered by PacBio and Oxford Nanopore. Using these protocols, sequencing on the Oxford Nanopore MinION can achieve read length N50 values of 30-50 kb, with reads exceeding 200 kb and outputs ranging from 15-30 Gbp. This has been routinely achieved with various plant, fungal, animal and bacterial samples.
Affiliation(s)
- Ashley Jones
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Cynthia Torkel
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - David Stanley
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Diversity Arrays Technology, Bruce, Australian Capital Territory, Australia
| | - Jamila Nasim
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Soil Carbon Co., Orange, New South Wales, Australia
| | - Justin Borevitz
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Benjamin Schwessinger
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| |
|
413
|
Rosenbaum JN, Berry AB, Church AJ, Crooks K, Gagan JR, López-Terrada D, Pfeifer JD, Rennert H, Schrijver I, Snow AN, Wu D, Ewalt MD. A Curriculum for Genomic Education of Molecular Genetic Pathology Fellows: A Report of the Association for Molecular Pathology Training and Education Committee. J Mol Diagn 2021; 23:1218-1240. [PMID: 34245921 DOI: 10.1016/j.jmoldx.2021.07.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/11/2019] [Revised: 06/16/2021] [Accepted: 07/01/2021] [Indexed: 12/19/2022] Open
Abstract
Molecular genetic pathology (MGP) is a subspecialty of pathology and medical genetics and genomics. Genomic testing, which we define as that which generates large data sets and interrogates large segments of the genome in a single assay, is increasingly recognized as essential for optimal patient care through precision medicine. The most common genomic testing technologies in clinical laboratories are next-generation sequencing and microarray. It is essential to train in these methods and to consider the data generated in the context of the diagnosis, medical history, and other clinical findings of individual patients. Accordingly, updating the MGP fellowship curriculum to include genomics is timely, important, and challenging. At the completion of training, an MGP fellow should be capable of independently interpreting and signing out results of a wide range of genomic assays and, given the appropriate context and institutional support, of developing and validating new assays in compliance with applicable regulations. The Genomics Task Force of the MGP Program Directors, a working group of the Association for Molecular Pathology Training and Education Committee, has developed a genomics curriculum framework and recommendations specific to the MGP fellowship. These recommendations are presented for consideration and implementation by MGP fellowship programs with the understanding that MGP programs exist in a diversity of clinical practice environments with a spectrum of available resources.
Affiliation(s)
- Jason N Rosenbaum
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Anna B Berry
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Swedish Cancer Institute and Institute of Systems Biology, Seattle, Washington
| | - Alanna J Church
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Boston Children's Hospital, Boston, Massachusetts
| | - Kristy Crooks
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, University of Colorado Anschutz Medical Campus, Aurora, Colorado
| | - Jeffrey R Gagan
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Dolores López-Terrada
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Baylor College of Medicine, Houston, Texas
| | - John D Pfeifer
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Washington University School of Medicine, St. Louis, Missouri
| | - Hanna Rennert
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, New York
| | - Iris Schrijver
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Stanford University School of Medicine, Stanford, California
| | - Anthony N Snow
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, University of Iowa Hospitals and Clinics, Iowa City, Iowa
| | - David Wu
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington
| | - Mark D Ewalt
- Molecular Genetic Pathology Fellow Training in Genomics Task Force of the Training and Education Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York.
| |
|
414
|
A simple approach for effective shearing and reliable concentration measurement of ultra-high-molecular-weight DNA. Biotechniques 2021; 71:439-444. [PMID: 34232102 DOI: 10.2144/btn-2021-0051] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022] Open
Abstract
Pipetting and concentration measurement of viscous ultra-high-molecular-weight (UHMW) DNA samples is challenging and often highly imprecise. Effective guidelines for handling UHMW samples are missing in the field. Herein, a simple and low-cost workflow is presented that enables accurate pipetting and reliable concentration measurement. Central to the workflow is the shearing of representative small aliquots of UHMW DNA samples to a fragment size <150 kb by vortexing them for 1 min with a glass bead in a round-bottomed 2-ml tube. Additionally, a solution is provided for accurate quantitation of high-molecular-weight DNA with fluorometric (Qubit [Thermo Fisher Scientific, MA, USA]) methods by using an appropriate genomic DNA standard, resulting in values that match spectrophotometric (Nanodrop [Thermo Fisher Scientific]) optical density readings.
|
415
|
Reyes CJ, Laabs BH, Schaake S, Lüth T, Ardicoglu R, Rakovic A, Grütz K, Alvarez-Fischer D, Jamora RD, Rosales RL, Weyers I, König IR, Brüggemann N, Klein C, Dobricic V, Westenberger A, Trinh J. Brain Regional Differences in Hexanucleotide Repeat Length in X-Linked Dystonia-Parkinsonism Using Nanopore Sequencing. NEUROLOGY-GENETICS 2021; 7:e608. [PMID: 34250228 PMCID: PMC8265576 DOI: 10.1212/nxg.0000000000000608] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/08/2021] [Accepted: 06/03/2021] [Indexed: 12/14/2022]
Abstract
Objective: Our study investigated the presence of regional differences in hexanucleotide repeat number in postmortem brain tissues of 2 patients with X-linked dystonia-parkinsonism (XDP), a combined dystonia-parkinsonism syndrome modified by a (CCCTCT)n repeat within the causal SINE-VNTR-Alu retrotransposon insertion in the TAF1 gene. Methods: Genomic DNA was extracted from blood and postmortem brain samples, including the basal ganglia and cortex from both patients and from the cerebellum, midbrain, and pituitary gland from 1 patient. Repeat sizing was performed using fragment analysis, small-pool PCR-based Southern blotting, and Oxford nanopore sequencing. Results: The basal ganglia (p < 0.001) and cerebellum (p < 0.001) showed higher median repeat numbers and higher degrees of repeat instability compared with blood. Conclusions: Somatic repeat instability may predominate in brain regions selectively affected in XDP, thereby hinting at its potential role in disease manifestation and modification.
Affiliation(s)
- Charles Jourdan Reyes
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Björn-Hergen Laabs
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Susen Schaake
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Theresa Lüth
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Raphaela Ardicoglu
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Aleksandar Rakovic
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Karen Grütz
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Daniel Alvarez-Fischer
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Roland Dominic Jamora
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Raymond L Rosales
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Imke Weyers
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Inke R König
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Norbert Brüggemann
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Christine Klein
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Valerija Dobricic
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Ana Westenberger
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| | - Joanne Trinh
- Institute of Neurogenetics (C.J.R., S.S., T.L., R.A., A.R., K.G., D.A.-F., N.B., C.K., V.D., A.W., J.T.), University of Lübeck, and Institute of Medical Biometry and Statistics (B.-H.L., I.R.K.), University of Lübeck, Germany; Department of Neurosciences (R.D.J.), College of Medicine-Philippine General Hospital, University of the Philippines Manila; Department of Neurology and Psychiatry (R.L.R.), University of Santo Tomas Hospital, Manila, Philippines; Institute of Anatomy (I.W.), Department of Neurology (N.B.), and Lübeck Interdisciplinary Platform for Genome Analytics (V.D.), University of Lübeck, Germany
| |
|
416
|
Jayakodi M, Schreiber M, Stein N, Mascher M. Building pan-genome infrastructures for crop plants and their use in association genetics. DNA Res 2021; 28:6117190. [PMID: 33484244 PMCID: PMC7934568 DOI: 10.1093/dnares/dsaa030] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 11/27/2020] [Indexed: 12/20/2022]
Abstract
Pan-genomic studies aim at representing the entire sequence diversity within a species to provide useful resources for evolutionary studies, functional genomics and breeding of cultivated plants. Cost reductions in high-throughput sequencing and advances in sequence assembly algorithms have made it possible to create multiple reference genomes along with a catalogue of all forms of genetic variations in plant species with large and complex or polyploid genomes. In this review, we summarize the current approaches to building pan-genomes as an in silico representation of plant sequence diversity and outline relevant methods for their effective utilization in linking structural with phenotypic variation. We propose as future research avenues (i) transcriptomic and epigenomic studies across multiple reference genomes and (ii) the development of user-friendly and feature-rich pan-genome browsers.
Affiliation(s)
- Murukarthick Jayakodi
- Department of Genebank, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Mona Schreiber
- Department of Genebank, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Nils Stein
- Department of Genebank, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany; Center for Integrated Breeding Research (CiBreed), Georg-August-University Göttingen, Göttingen, Germany
| | - Martin Mascher
- Department of Genebank, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Saxony, Germany
| |
|
417
|
Li H, Dawood M, Khayat MM, Farek JR, Jhangiani SN, Khan ZM, Mitani T, Coban-Akdemir Z, Lupski JR, Venner E, Posey JE, Sabo A, Gibbs RA. Exome variant discrepancies due to reference-genome differences. Am J Hum Genet 2021; 108:1239-1250. [PMID: 34129815 PMCID: PMC8322936 DOI: 10.1016/j.ajhg.2021.05.011] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 01/29/2021] [Accepted: 05/19/2021] [Indexed: 12/15/2022]
Abstract
Despite release of the GRCh38 human reference genome more than seven years ago, GRCh37 remains more widely used by most research and clinical laboratories. To date, no study has quantified the impact of utilizing different reference assemblies for the identification of variants associated with rare and common diseases from large-scale exome-sequencing data. By calling variants on both the GRCh37 and GRCh38 references, we identified single-nucleotide variants (SNVs) and insertion-deletions (indels) in 1,572 exomes from participants with Mendelian diseases and their family members. We found that a total of 1.5% of SNVs and 2.0% of indels were discordant when different references were used. Notably, 76.6% of the discordant variants were clustered within discrete discordant reference patches (DISCREPs) comprising only 0.9% of loci targeted by exome sequencing. These DISCREPs were enriched for genomic elements including segmental duplications, fix patch sequences, and loci known to contain alternate haplotypes. We identified 206 genes significantly enriched for discordant variants, most of which were in DISCREPs and caused by multi-mapped reads on the reference assembly that lacked the variant call. Among these 206 genes, eight are implicated in known Mendelian diseases and 53 are associated with common phenotypes from genome-wide association studies. In addition, variant interpretations could also be influenced by the reference after lifting-over variant loci to another assembly. Overall, we identified genes and genomic loci affected by reference assembly choice, including genes associated with Mendelian disorders and complex human diseases that require careful evaluation in both research and clinical applications.
Affiliation(s)
- He Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Moez Dawood
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - Michael M Khayat
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Jesse R Farek
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Shalini N Jhangiani
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Ziad M Khan
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Tadahiro Mitani
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Zeynep Coban-Akdemir
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - James R Lupski
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Department of Pediatrics, Texas Children's Hospital, Houston, TX 77030, USA
| | - Eric Venner
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Jennifer E Posey
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Aniko Sabo
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA.
| |
|
418
|
Lin B, Hui J, Mao H. Nanopore Technology and Its Applications in Gene Sequencing. Biosensors (Basel) 2021; 11:bios11070214. [PMID: 34208844 PMCID: PMC8301755 DOI: 10.3390/bios11070214] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 05/31/2021] [Revised: 06/22/2021] [Accepted: 06/25/2021] [Indexed: 12/14/2022]
Abstract
In recent years, nanopore technology has become increasingly important in the field of life science and biomedical research. By embedding a nano-scale hole in a thin membrane and measuring the electrochemical signal, nanopore technology can be used to investigate the nucleic acids and other biomacromolecules. One of the most successful applications of nanopore technology, the Oxford Nanopore Technology, marks the beginning of the fourth generation of gene sequencing technology. In this review, the operational principle and the technology for signal processing of the nanopore gene sequencing are documented. Moreover, this review focuses on the applications using nanopore gene sequencing technology, including the diagnosis of cancer, detection of viruses and other microbes, and the assembly of genomes. These applications show that nanopore technology is promising in the field of biological and biomedical sensing.
Affiliation(s)
- Bo Lin
- State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jianan Hui
- State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
| | - Hongju Mao
- State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Correspondence: Tel.: +86-21-62511070-8707
| |
|
419
|
Woo EG, Tayebi N, Sidransky E. Next-Generation Sequencing Analysis of GBA1: The Challenge of Detecting Complex Recombinant Alleles. Front Genet 2021; 12:684067. [PMID: 34234814 PMCID: PMC8255797 DOI: 10.3389/fgene.2021.684067] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/22/2021] [Accepted: 05/27/2021] [Indexed: 01/23/2023]
Affiliation(s)
- Elizabeth G Woo
- Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, United States
| | - Nahid Tayebi
- Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, United States
| | - Ellen Sidransky
- Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, United States
| |
|
420
|
Tvedte ES, Gasser M, Sparklin BC, Michalski J, Hjelmen CE, Johnston JS, Zhao X, Bromley R, Tallon LJ, Sadzewicz L, Rasko DA, Dunning Hotopp JC. Comparison of long-read sequencing technologies in interrogating bacteria and fly genomes. G3 (Bethesda) 2021; 11:jkab083. [PMID: 33768248 PMCID: PMC8495745 DOI: 10.1093/g3journal/jkab083] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/09/2021] [Accepted: 03/07/2021] [Indexed: 12/14/2022]
Abstract
The newest generation of DNA sequencing technology is highlighted by the ability to generate sequence reads hundreds of kilobases in length. Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) have pioneered competitive long read platforms, with more recent work focused on improving sequencing throughput and per-base accuracy. We used whole-genome sequencing data produced by three PacBio protocols (Sequel II CLR, Sequel II HiFi, RS II) and two ONT protocols (Rapid Sequencing and Ligation Sequencing) to compare assemblies of the bacteria Escherichia coli and the fruit fly Drosophila ananassae. In both organisms tested, Sequel II assemblies had the highest consensus accuracy, even after accounting for differences in sequencing throughput. ONT and PacBio CLR had the longest reads sequenced compared to PacBio RS II and HiFi, and genome contiguity was highest when assembling these datasets. ONT Rapid Sequencing libraries had the fewest chimeric reads in addition to superior quantification of E. coli plasmids versus ligation-based libraries. The quality of assemblies can be enhanced by adopting hybrid approaches using Illumina libraries for bacterial genome assembly or polishing eukaryotic genome assemblies, and an ONT-Illumina hybrid approach would be more cost-effective for many users. Genome-wide DNA methylation could be detected using both technologies, however ONT libraries enabled the identification of a broader range of known E. coli methyltransferase recognition motifs in addition to undocumented D. ananassae motifs. The ideal choice of long read technology may depend on several factors including the question or hypothesis under examination. No single technology outperformed others in all metrics examined.
Affiliation(s)
- Eric S Tvedte
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Mark Gasser
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Benjamin C Sparklin
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Jane Michalski
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Carl E Hjelmen
- Department of Biology, Texas A&M University, College Station, TX 77843, USA
| | - J Spencer Johnston
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
| | - Xuechu Zhao
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Robin Bromley
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Luke J Tallon
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Lisa Sadzewicz
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - David A Rasko
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Julie C Dunning Hotopp
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Greenebaum Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| |
|
421
|
Guo M, Li S, Zhou Y, Li M, Wen Z. Comparative Analysis for the Performance of Long-Read-Based Structural Variation Detection Pipelines in Tandem Repeat Regions. Front Pharmacol 2021; 12:658072. [PMID: 34163355 PMCID: PMC8215501 DOI: 10.3389/fphar.2021.658072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/25/2021] [Accepted: 05/14/2021] [Indexed: 12/04/2022]
Abstract
There has been growing recognition of the vital links between structural variations (SVs) and diverse diseases. Research suggests that, with much longer DNA fragments and abundant contextual information, long-read technologies have advantages in SV detection even in complex repetitive regions. So far, several pipelines for calling SVs from long-read sequencing data have been proposed and used in human genome research. However, the performance of these pipelines still lacks deep exploration and adequate comparison. In this study, we comprehensively evaluated the performance of three commonly used long-read SV detection pipelines, namely PBSV, Sniffles and PBHoney, with particular attention to their performance in detecting SVs in tandem repeat regions (TRRs). Using a robust benchmark for germline SV detection as the gold standard, we thoroughly estimated the precision, recall and F1 score of insertions and deletions detected by the pipelines. Our results revealed that all these pipelines clearly performed better outside TRRs than within TRRs. The F1 scores of Sniffles in and outside TRRs were 0.60 and 0.76, respectively. The performance of PBSV was similar to that of Sniffles, and was generally higher than that of PBHoney. In conclusion, our findings can be beneficial for choosing the appropriate pipelines in real practice and are a good complement to the application of long-read sequencing technologies in the research of rare diseases.
Affiliation(s)
- Mingkun Guo
- College of Chemistry, Sichuan University, Chengdu, China
| | - Shihai Li
- College of Chemistry, Sichuan University, Chengdu, China
| | - Yifan Zhou
- College of Chemistry, Sichuan University, Chengdu, China
| | - Menglong Li
- College of Chemistry, Sichuan University, Chengdu, China
| | - Zhining Wen
- College of Chemistry, Sichuan University, Chengdu, China; Medical Big Data Center, Sichuan University, Chengdu, China
| |
|
422
|
Khorsand P, Denti L, Bonizzoni P, Chikhi R, Hormozdiari F. Comparative genome analysis using sample-specific string detection in accurate long reads. Bioinform Adv 2021; 1:vbab005. [PMID: 36700094 PMCID: PMC9710709 DOI: 10.1093/bioadv/vbab005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/28/2023]
Abstract
Motivation: Comparative genome analysis of two or more whole-genome sequenced (WGS) samples is at the core of most applications in genomics. These include the discovery of genomic differences segregating in populations, case-control analysis in common diseases and diagnosing rare disorders. With the current progress of accurate long-read sequencing technologies (e.g. circular consensus sequencing from PacBio sequencers), we can dive into studying repeat regions of the genome (e.g. segmental duplications) and hard-to-detect variants (e.g. complex structural variants). Results: We propose a novel framework for comparative genome analysis through the discovery of strings that are specific to one genome ('samples-specific' strings). We have developed a novel, accurate and efficient computational method for the discovery of sample-specific strings between two groups of WGS samples. The proposed approach will give us the ability to perform comparative genome analysis without the need to map the reads and is not hindered by shortcomings of the reference genome and mapping algorithms. We show that the proposed approach is capable of accurately finding sample-specific strings representing nearly all variation (>98%) reported across pairs or trios of WGS samples using accurate long reads (e.g. PacBio HiFi data). Availability and implementation: Data, code and instructions for reproducing the results presented in this manuscript are publicly available at https://github.com/Parsoa/PingPong. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
| | - Luca Denti
- Department of Computational Biology, Institut Pasteur, Paris 75015, France
| | - Paola Bonizzoni
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milano, 20126, Italy. To whom correspondence should be addressed.
| | - Rayan Chikhi
- Department of Computational Biology, Institut Pasteur, Paris 75015, France. To whom correspondence should be addressed.
| | - Fereydoun Hormozdiari
- Genome Center, UC Davis, Davis, CA 95616, USA; UC Davis MIND Institute, Sacramento, CA 95817, USA; Department of Biochemistry and Molecular Medicine, Sacramento, UC Davis, Sacramento, CA 95817, USA. To whom correspondence should be addressed.
| |
|
423
|
Sommers P, Chatterjee A, Varsani A, Trubl G. Integrating Viral Metagenomics into an Ecological Framework. Annu Rev Virol 2021; 8:133-158. [PMID: 34033501 DOI: 10.1146/annurev-virology-010421-053015] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 11/09/2022]
Abstract
Viral metagenomics has expanded our knowledge of the ecology of uncultured viruses, within both environmental (e.g., terrestrial and aquatic) and host-associated (e.g., plants and animals, including humans) contexts. Here, we emphasize the implementation of an ecological framework in viral metagenomic studies to address questions in virology rarely considered ecological, which can change our perception of viruses and how they interact with their surroundings. An ecological framework explicitly considers diverse variants of viruses in populations that make up communities of interacting viruses, with ecosystem-level effects. It provides a structure for the study of the diversity, distributions, dynamics, and interactions of viruses with one another, hosts, and the ecosystem, including interactions with abiotic factors. An ecological framework in viral metagenomics stands poised to broadly expand our knowledge in basic and applied virology. We highlight specific fundamental research needs to capitalize on its potential and advance the field. Expected final online publication date for the Annual Review of Virology, Volume 8 is September 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Pacifica Sommers
- Department of Ecology and Evolutionary Biology, University of Colorado at Boulder, Boulder, Colorado 80309, USA. These authors contributed equally to this article
| | - Anushila Chatterjee
- Department of Ecology and Evolutionary Biology, University of Colorado at Boulder, Boulder, Colorado 80309, USA. These authors contributed equally to this article
| | - Arvind Varsani
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, Arizona 85287, USA; Structural Biology Research Unit, Department of Integrative Biomedical Sciences, University of Cape Town, Observatory 7925, South Africa
| | - Gareth Trubl
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
| |
|
424
|
Sakamoto Y, Zaha S, Nagasawa S, Miyake S, Kojima Y, Suzuki A, Suzuki Y, Seki M. Long-read whole-genome methylation patterning using enzymatic base conversion and nanopore sequencing. Nucleic Acids Res 2021; 49:e81. [PMID: 34019650 PMCID: PMC8373077 DOI: 10.1093/nar/gkab397] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/05/2021] [Revised: 04/09/2021] [Accepted: 04/30/2021] [Indexed: 12/14/2022]
Abstract
Long-read whole-genome sequencing analysis of DNA methylation would provide useful information on the chromosomal context of gene expression regulation. Here we describe the development of a method that improves the read length generated by using the bisulfite-sequencing-based approach. In this method, we combined recently developed enzymatic base conversion, where an unmethylated cytosine (C) should be converted to thymine (T), with nanopore sequencing. After methylation-sensitive base conversion, the sequencing library was constructed using long-range polymerase chain reaction. This type of analysis is possible using a minimum of 1 ng genomic DNA, and an N50 read length of 3.4–7.6 kb is achieved. To analyze the produced data, which contained a substantial number of base mismatches due to sequence conversion and an inaccurate base read of the nanopore sequencing, a new analytical pipeline was constructed. To demonstrate the performance of long-read methylation sequencing, breast cancer cell lines and clinical specimens were subjected to analysis, which revealed the chromosomal methylation context of key cancer-related genes, allele-specific methylated genes, and repetitive or deletion regions. This method should convert the intractable specimens for which the amount of available genomic DNA is limited to the tractable targets.
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Suzuko Zaha
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Satoi Nagasawa
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Shuhei Miyake
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Yasuyuki Kojima
- Division of Breast and Endocrine Surgery, Department of Surgery, St. Marianna University School of Medicine, Kawasaki, Kanagawa, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| |
|
425
|
Abstract
The first gapless, telomere-to-telomere sequence of a human autosome, chromosome 8, is complete. Sequencing and assembly of the corresponding centromere in the chimpanzee, orangutan and macaque reveals details of its rapid evolution over the past 25 million years.
Affiliation(s)
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
426
|
Kim K, Kim M, Kim Y, Lee D, Jung I. Hi-C as a molecular rangefinder to examine genomic rearrangements. Semin Cell Dev Biol 2021; 121:161-170. [PMID: 33992531 DOI: 10.1016/j.semcdb.2021.04.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/15/2021] [Revised: 04/26/2021] [Accepted: 04/27/2021] [Indexed: 11/16/2022]
Abstract
The mammalian genome is highly packed into the nucleus. Over the past decade, the development of Hi-C has contributed significantly to our understanding of the three-dimensional (3D) chromatin structure, uncovering the principles and functions of higher-order chromatin organizations. Recent studies have repositioned its property in spatial proximity measurement to address challenging problems in genome analyses including genome assembly, haplotype phasing, and the detection of genomic rearrangements. In particular, the power of Hi-C in detecting large-scale structural variations (SVs) in the cancer genome has been demonstrated, which is challenging to be addressed solely with short-read-based whole-genome sequencing analyses. In this review, we first provide a comprehensive view of Hi-C as an intuitive and effective SV detection tool. Then, we introduce recently developed bioinformatics tools utilizing Hi-C to investigate genomic rearrangements. Finally, we discuss the potential application of single-cell Hi-C to address the heterogeneity of genomic rearrangements and sub-population identification in the cancer genome.
Affiliation(s)
- Kyukwang Kim
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Mooyoung Kim
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Yubin Kim
- Department of Life Science, University of Seoul, Seoul 02504, Republic of Korea
| | - Dongsung Lee
- Department of Life Science, University of Seoul, Seoul 02504, Republic of Korea.
| | - Inkyung Jung
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea.
| |
|
427
|
Purugganan MD, Jackson SA. Advancing crop genomics from lab to field. Nat Genet 2021; 53:595-601. [PMID: 33958781 DOI: 10.1038/s41588-021-00866-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 10/27/2020] [Accepted: 03/22/2021] [Indexed: 01/23/2023]
Abstract
Crop genomics remains a key element in ensuring scientific progress to secure global food security. It has been two decades since the sequence of the first plant genome, that of Arabidopsis thaliana, was released, and soon after that the draft sequencing of the rice genome was completed. Since then, the genomes of more than 100 crops have been sequenced, plant genome research has expanded across multiple fronts and the next few years promise to bring further advances spurred by the advent of new technologies and approaches. We are likely to see continued innovations in crop genome sequencing, genetic mapping and the acquisition of multiple levels of biological data. There will be exciting opportunities to integrate genome-scale information across multiple scales of biological organization, leading to advances in our mechanistic understanding of crop biological processes, which will, in turn, provide greater impetus for translation of laboratory results to the field.
Affiliation(s)
- Michael D Purugganan
- Center for Genomics and Systems Biology, New York University, New York, NY, USA.
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates.
| | | |
|
428
|
Mao Y, Catacchio CR, Hillier LW, Porubsky D, Li R, Sulovari A, Fernandes JD, Montinaro F, Gordon DS, Storer JM, Haukness M, Fiddes IT, Murali SC, Dishuck PC, Hsieh P, Harvey WT, Audano PA, Mercuri L, Piccolo I, Antonacci F, Munson KM, Lewis AP, Baker C, Underwood JG, Hoekzema K, Huang TH, Sorensen M, Walker JA, Hoffman J, Thibaud-Nissen F, Salama SR, Pang AWC, Lee J, Hastie AR, Paten B, Batzer MA, Diekhans M, Ventura M, Eichler EE. A high-quality bonobo genome refines the analysis of hominid evolution. Nature 2021; 594:77-81. [PMID: 33953399 PMCID: PMC8172381 DOI: 10.1038/s41586-021-03519-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 08/21/2020] [Accepted: 04/07/2021] [Indexed: 12/17/2022]
Abstract
The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation [1,2]. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes [1,3–5] and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome. A high-quality bonobo genome assembly provides insights into incomplete lineage sorting in hominids and its relevance to gene evolution and the genetic relationship among living hominids.
Affiliation(s)
- Yafei Mao
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - LaDeana W Hillier
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Ruiyang Li
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Jason D Fernandes
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Francesco Montinaro
- Department of Biology, University of Bari, Bari, Italy
- Estonian Biocentre, Institute of Genomics, Tartu, Estonia
| | - David S Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | | | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ian T Fiddes
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Shwetha Canchi Murali
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - PingHsun Hsieh
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | | | | | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Tzu-Hsueh Huang
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Melanie Sorensen
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Jerilyn A Walker
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Jinna Hoffman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Sofie R Salama
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | | | - Joyce Lee
- Bionano Genomics, San Diego, CA, USA
| | | | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mark A Batzer
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mario Ventura
- Department of Biology, University of Bari, Bari, Italy.
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
| |
|
429
|
Affiliation(s)
- Zhao Zhang
- MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston McGovern Medical School, Houston, TX, USA
| | - Leng Han
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston McGovern Medical School, Houston, TX, USA.
- Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, TX, USA.
| |
|
430
|
Reply: ATP10B variants in Parkinson's disease-a large cohort study in Chinese mainland population. Acta Neuropathol 2021; 141:807-808. [PMID: 33599815 PMCID: PMC8043888 DOI: 10.1007/s00401-021-02281-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/03/2021] [Revised: 02/03/2021] [Accepted: 02/03/2021] [Indexed: 11/01/2022]
|
431
|
Lopes M, Louzada S, Gama-Carvalho M, Chaves R. Genomic Tackling of Human Satellite DNA: Breaking Barriers through Time. Int J Mol Sci 2021; 22:4707. [PMID: 33946766 PMCID: PMC8125562 DOI: 10.3390/ijms22094707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 04/24/2021] [Accepted: 04/27/2021] [Indexed: 12/12/2022] Open
Abstract
(Peri)centromeric repetitive sequences and, more specifically, satellite DNA (satDNA) sequences, constitute a major human genomic component. SatDNA sequences can vary on a large number of features, including nucleotide composition, complexity, and abundance. Several satDNA families have been identified and characterized in the human genome through time, albeit at different speeds. Human satDNA families present a high degree of sub-variability, leading to the definition of various subfamilies with different organization and clustered localization. Evolution of satDNA analysis has enabled the progressive characterization of satDNA features. Despite recent advances in the sequencing of centromeric arrays, comprehensive genomic studies to assess their variability are still required to provide accurate and proportional representation of satDNA (peri)centromeric/acrocentric short arm sequences. Approaches combining multiple techniques have been successfully applied and seem to be the path to follow for generating integrated knowledge in the promising field of human satDNA biology.
Affiliation(s)
- Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| | - Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| | - Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| | - Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| |
|
432
|
Giles HH, Hegde MR, Lyon E, Stanley CM, Kerr ID, Garlapow ME, Eggington JM. The Science and Art of Clinical Genetic Variant Classification and Its Impact on Test Accuracy. Annu Rev Genomics Hum Genet 2021; 22:285-307. [PMID: 33900788 DOI: 10.1146/annurev-genom-121620-082709] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
Clinical genetic variant classification science is a growing subspecialty of clinical genetics and genomics. The field's continued improvement is essential for the success of precision medicine in both germline (hereditary) and somatic (oncology) contexts. This review focuses on variant classification for DNA next-generation sequencing tests. We first summarize current limitations in variant discovery and definition, and then describe the current five- and four-tier classification systems outlined in dominant standards and guideline publications for germline and somatic tests, respectively. We then discuss measures of variant classification discordance and the field's bias for positive results, as well as considerations for panel size and population screening in the context of estimates of positive predictive value that incorporate estimated variant classification imperfections. Finally, we share opinions on the current state of variant classification from some of the authors of the most widely used standards and guideline publications and from other domain experts.
Affiliation(s)
- Hunter H Giles
- Center for Genomic Interpretation, Sandy, Utah 84092, USA
| | - Madhuri R Hegde
- PerkinElmer Genomics, Waltham, Massachusetts 02450, USA
- Department of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Elaine Lyon
- HudsonAlpha Clinical Services Lab, Huntsville, Alabama 35806, USA
| | - Christine M Stanley
- C2i Genomics, Cambridge, Massachusetts 02139, USA
- Variantyx, Framingham, Massachusetts 01701, USA
| | | | | | | |
|
433
|
Jeffet J, Margalit S, Michaeli Y, Ebenstein Y. Single-molecule optical genome mapping in nanochannels: multidisciplinarity at the nanoscale. Essays Biochem 2021; 65:51-66. [PMID: 33739394 PMCID: PMC8056043 DOI: 10.1042/ebc20200021] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/27/2021] [Revised: 02/24/2021] [Accepted: 02/26/2021] [Indexed: 12/12/2022]
Abstract
The human genome contains multiple layers of information that extend beyond the genetic sequence. In fact, identical genetics do not necessarily yield identical phenotypes as evident for the case of two different cell types in the human body. The great variation in structure and function displayed by cells with identical genetic background is attributed to additional genomic information content. This includes large-scale genetic aberrations, as well as diverse epigenetic patterns that are crucial for regulating specific cell functions. These genetic and epigenetic patterns operate in concert in order to maintain specific cellular functions in health and disease. Single-molecule optical genome mapping is a high-throughput genome analysis method that is based on imaging long chromosomal fragments stretched in nanochannel arrays. The access to long DNA molecules coupled with fluorescent tagging of various genomic information presents a unique opportunity to study genetic and epigenetic patterns in the genome at a single-molecule level over large genomic distances. Optical mapping entwines synergistically chemical, physical, and computational advancements, to uncover invaluable biological insights, inaccessible by sequencing technologies. Here we describe the method's basic principles of operation, and review the various available mechanisms to fluorescently tag genomic information. We present some of the recent biological and clinical impact enabled by optical mapping and present recent approaches for increasing the method's resolution and accuracy. Finally, we discuss how multiple layers of genomic information may be mapped simultaneously on the same DNA molecule, thus paving the way for characterizing multiple genomic observables on individual DNA molecules.
Affiliation(s)
- Jonathan Jeffet
- Raymond and Beverly Sackler Faculty of Exact Sciences, Center for Nanoscience and Nanotechnology, Center for Light Matter Interaction, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Sapir Margalit
- Raymond and Beverly Sackler Faculty of Exact Sciences, Center for Nanoscience and Nanotechnology, Center for Light Matter Interaction, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Yael Michaeli
- Raymond and Beverly Sackler Faculty of Exact Sciences, Center for Nanoscience and Nanotechnology, Center for Light Matter Interaction, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Yuval Ebenstein
- Raymond and Beverly Sackler Faculty of Exact Sciences, Center for Nanoscience and Nanotechnology, Center for Light Matter Interaction, Tel Aviv University, Tel Aviv 6997801, Israel
| |
|
434
|
Garg S. Computational methods for chromosome-scale haplotype reconstruction. Genome Biol 2021; 22:101. [PMID: 33845884 PMCID: PMC8040228 DOI: 10.1186/s13059-021-02328-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 01/10/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022] Open
Abstract
High-quality chromosome-scale haplotype sequences of diploid genomes, polyploid genomes, and metagenomes provide important insights into genetic variation associated with disease and biodiversity. However, whole-genome short read sequencing does not yield haplotype information spanning whole chromosomes directly. Computational assembly of shorter haplotype fragments is required for haplotype reconstruction, which can be challenging owing to limited fragment lengths and high haplotype and repeat variability across genomes. Recent advancements in long-read and chromosome-scale sequencing technologies, alongside computational innovations, are improving the reconstruction of haplotypes at the level of whole chromosomes. Here, we review and discuss recent methodological progress and perspectives in these areas.
Affiliation(s)
- Shilpa Garg
- Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| |
|
435
|
Hiatt SM, Lawlor JM, Handley LH, Ramaker RC, Rogers BB, Partridge EC, Boston LB, Williams M, Plott CB, Jenkins J, Gray DE, Holt JM, Bowling KM, Bebin EM, Grimwood J, Schmutz J, Cooper GM. Long-read genome sequencing for the molecular diagnosis of neurodevelopmental disorders. HGG ADVANCES 2021; 2:100023. [PMID: 33937879 PMCID: PMC8087252 DOI: 10.1016/j.xhgg.2021.100023] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/17/2020] [Accepted: 01/07/2021] [Indexed: 02/07/2023] Open
Abstract
Exome and genome sequencing have proven to be effective tools for the diagnosis of neurodevelopmental disorders (NDDs), but large fractions of NDDs cannot be attributed to currently detectable genetic variation. This is likely, at least in part, a result of the fact that many genetic variants are difficult or impossible to detect through typical short-read sequencing approaches. Here, we describe a genomic analysis using Pacific Biosciences circular consensus sequencing (CCS) reads, which are both long (>10 kb) and accurate (>99% bp accuracy). We used CCS on six proband-parent trios with NDDs that were unexplained despite extensive testing, including genome sequencing with short reads. We identified variants and created de novo assemblies in each trio, with global metrics indicating these datasets are more accurate and comprehensive than those provided by short-read data. In one proband, we identified a likely pathogenic (LP), de novo L1-mediated insertion in CDKL5 that results in duplication of exon 3, leading to a frameshift. In a second proband, we identified multiple large de novo structural variants, including insertion-translocations affecting DGKB and MLLT3, which we show disrupt MLLT3 transcript levels. We consider this extensive structural variation likely pathogenic. The breadth and quality of variant detection, coupled to finding variants of clinical and research interest in two of six probands with unexplained NDDs, support the hypothesis that long-read genome sequencing can substantially improve rare disease genetic discovery rates.
Affiliation(s)
- Susan M. Hiatt
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | | | - Lori H. Handley
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Ryne C. Ramaker
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Brianne B. Rogers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Department of Genetics, University of Alabama at Birmingham, Birmingham, AL 35924, USA
| | | | - Lori Beth Boston
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Melissa Williams
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | | | - Jerry Jenkins
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - David E. Gray
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - James M. Holt
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Kevin M. Bowling
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - E. Martina Bebin
- Department of Neurology, University of Alabama at Birmingham, Birmingham, AL 35924, USA
| | - Jane Grimwood
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Jeremy Schmutz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | | |
|
436
|
Seixas FA, Edelman NB, Mallet J. Synteny-Based Genome Assembly for 16 Species of Heliconius Butterflies, and an Assessment of Structural Variation across the Genus. Genome Biol Evol 2021; 13:6207971. [PMID: 33792688 PMCID: PMC8290116 DOI: 10.1093/gbe/evab069] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 03/29/2021] [Indexed: 12/11/2022] Open
Abstract
Heliconius butterflies (Lepidoptera: Nymphalidae) are a group of 48 neotropical species widely studied in evolutionary research. Despite the wealth of genomic data generated in past years, chromosomal level genome assemblies currently exist for only two species, Heliconius melpomene and Heliconius erato, each a representative of one of the two major clades of the genus. Here, we use these reference genomes to improve the contiguity of previously published draft genome assemblies of 16 Heliconius species. Using a reference-assisted scaffolding approach, we place and order the scaffolds of these genomes onto chromosomes, resulting in 95.7-99.9% of their genomes anchored to chromosomes. Genome sizes are somewhat variable among species (270-422 Mb) and in one small group of species (Heliconius hecale, Heliconius elevatus, and Heliconius pardalinus) expansions in genome size are driven mainly by repetitive sequences that map to four small regions in the H. melpomene reference genome. Genes from these repeat regions show an increase in exon copy number, an absence of internal stop codons, evidence of constraint on nonsynonymous changes, and increased expression, all of which suggest that at least some of the extra copies are functional. Finally, we conducted a systematic search for inversions and identified five moderately large inversions fixed between the two major Heliconius clades. We infer that one of these inversions was transferred by introgression between the lineages leading to the erato/sara and burneyi/doris clades. These reference-guided assemblies represent a major improvement in Heliconius genomic resources that enable further genetic and evolutionary discoveries in this genus.
Affiliation(s)
- Fernando A Seixas
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
| | - Nathaniel B Edelman
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
- Yale Institute for Biospheric Studies, Yale University, New Haven, Connecticut, USA
| | - James Mallet
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
| |
|
437
|
Watson CM, Crinnion LA, Lindsay H, Mitchell R, Camm N, Robinson R, Joyce C, Tanteles GA, Halloran DJO, Pena SDJ, Carr IM, Bonthron DT. Assessing the utility of long-read nanopore sequencing for rapid and efficient characterization of mobile element insertions. J Transl Med 2021; 101:442-449. [PMID: 32989232 DOI: 10.1038/s41374-020-00489-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/13/2020] [Revised: 09/09/2020] [Accepted: 09/10/2020] [Indexed: 12/16/2022] Open
Abstract
Short-read next generation sequencing (NGS) has become the predominant first-line technique used to diagnose patients with rare genetic conditions. Inherent limitations of short-read technology, notably for the detection and characterization of complex insertion-containing variants, are offset by the ability to concurrently screen many disease genes. "Third-generation" long-read sequencers are increasingly being deployed as an orthogonal adjunct technology, but their full potential for molecular genetic diagnosis has yet to be exploited. Here, we describe three diagnostic cases in which pathogenic mobile element insertions were refractory to characterization by short-read sequencing. To validate the accuracy of the long-read technology, we first used Sanger sequencing to confirm the integration sites and derive curated benchmark sequences of the variant-containing alleles. Long-read nanopore sequencing was then performed on locus-specific amplicons. Pairwise comparison between these data and the previously determined benchmark alleles revealed 100% identity of the variant-containing sequences. We demonstrate a number of technical advantages over existing wet-laboratory approaches, including in silico size selection of a mixed pool of amplification products, and the relative ease with which an automated informatics workflow can be established. Our findings add to a growing body of literature describing the diagnostic utility of long-read sequencing.
Affiliation(s)
- Christopher M Watson
- Yorkshire and North East Genomic Laboratory Hub, Central Lab, St. James's University Hospital, Leeds, LS9 7TF, UK.
- Leeds Institute of Medical Research, University of Leeds, St. James's University Hospital, Leeds, LS9 7TF, UK.
| | - Laura A Crinnion
- Yorkshire and North East Genomic Laboratory Hub, Central Lab, St. James's University Hospital, Leeds, LS9 7TF, UK
- Leeds Institute of Medical Research, University of Leeds, St. James's University Hospital, Leeds, LS9 7TF, UK
| | - Helen Lindsay
- Yorkshire and North East Genomic Laboratory Hub, Central Lab, St. James's University Hospital, Leeds, LS9 7TF, UK
| | - Rowena Mitchell
- Yorkshire and North East Genomic Laboratory Hub, Central Lab, St. James's University Hospital, Leeds, LS9 7TF, UK
| | - Nick Camm
- Yorkshire and North East Genomic Laboratory Hub, Central Lab, St. James's University Hospital, Leeds, LS9 7TF, UK
| | - Rachel Robinson
- Yorkshire and North East Genomic Laboratory Hub, Central Lab, St. James's University Hospital, Leeds, LS9 7TF, UK
| | - Caroline Joyce
- Department of Endocrinology, Cork University Hospital, Wilton, Cork, Ireland
| | - George A Tanteles
- Department of Clinical Genetics, The Cyprus Institute of Neurology and Genetics, 6 International Airport Avenue, PO Box 23462, CY1683, Nicosia, Cyprus
| | | | | | - Ian M Carr
- Leeds Institute of Medical Research, University of Leeds, St. James's University Hospital, Leeds, LS9 7TF, UK
| | - David T Bonthron
- Leeds Institute of Medical Research, University of Leeds, St. James's University Hospital, Leeds, LS9 7TF, UK
| |
|
439
|
Blom MPK. Opportunities and challenges for high-quality biodiversity tissue archives in the age of long-read sequencing. Mol Ecol 2021; 30:5935-5948. [PMID: 33786900 DOI: 10.1111/mec.15909] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/31/2020] [Revised: 03/06/2021] [Accepted: 03/22/2021] [Indexed: 12/11/2022]
Abstract
The technological ability to characterize genetic variation at a genome-wide scale provides an unprecedented opportunity to study the genetic underpinnings and evolutionary mechanisms that promote and sustain biodiversity. The transition from short- to long-read sequencing is particularly promising and allows a more holistic view on any changes in genetic diversity across time and space. Long-read sequencing has tremendous potential but sequencing success strongly depends on the long-range integrity of DNA molecules and therefore on the availability of high-quality tissue samples. With the scope of genomic experiments expanding and wild populations simultaneously disappearing at an unprecedented rate, access to high-quality samples may soon be a major concern for many projects. The need for high-quality biodiversity tissue archives is therefore urgent but sampling and preserving high-quality samples is not a trivial exercise. In this review, I will briefly outline how long-read sequencing can benefit the study of molecular ecology, how this will substantially increase the demand for high-quality tissues and why it is challenging to preserve DNA integrity. I will then provide an overview of preservation approaches and end with a call for support to acknowledge the efforts needed to assemble high-quality tissue archives. In doing so, I hope to simultaneously motivate field biologists to expand sampling practices and molecular biologists to develop (cost) efficient guidelines for the sampling and long-term storage of tissues. A concerted, interdisciplinary, effort is needed to catalogue the genetic variation underlying contemporary biodiversity and will eventually provide a critical resource for future studies.
Affiliation(s)
- Mozes P K Blom
- Leibniz Institut für Evolutions- und Biodiversitätsforschung, Museum für Naturkunde, Berlin, Germany
| |
|
440
|
Hu T, Chitnis N, Monos D, Dinh A. Next-generation sequencing technologies: An overview. Hum Immunol 2021; 82:801-811. [PMID: 33745759 DOI: 10.1016/j.humimm.2021.02.012] [Citation(s) in RCA: 258] [Impact Index Per Article: 86.0] [Received: 12/14/2020] [Revised: 02/18/2021] [Accepted: 02/23/2021] [Indexed: 12/14/2022]
Abstract
Since the days of Sanger sequencing, next-generation sequencing technologies have significantly evolved to provide increased data output, efficiencies, and applications. These next generations of technologies can be categorized based on read length. This review provides an overview of these technologies as two paradigms: short-read, or "second-generation," technologies, and long-read, or "third-generation," technologies. Herein, short-read sequencing approaches are represented by the most prevalent technologies, Illumina and Ion Torrent, and long-read sequencing approaches are represented by Pacific Biosciences and Oxford Nanopore technologies. All technologies are reviewed along with reported advantages and disadvantages. Until recently, short-read sequencing was thought to provide high accuracy limited by read-length, while long-read technologies afforded much longer read-lengths at the expense of accuracy. Emerging developments for third-generation technologies hold promise for the next wave of sequencing evolution, with the co-existence of longer read lengths and high accuracy.
Affiliation(s)
- Taishan Hu
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States
| | - Nilesh Chitnis
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Surgery, Baylor College of Medicine, Houston, TX, United States
| | - Dimitri Monos
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Anh Dinh
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| |
|
441
|
Complex targeted sequencing in real time. Nat Rev Genet 2021; 22:67. [PMID: 33349697 DOI: 10.1038/s41576-020-00324-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
442
|
Abstract
Microbial ecology is the study of microorganisms present in nature. It particularly focuses on microbial interactions with any biota and with surrounding environments. Microbial ecology is entering its golden age with innovative multi-omics methods triggered by next-generation sequencing technologies. However, the extraction of ecologically relevant information from ever-increasing omics data remains one of the most challenging tasks in microbial ecology. This special issue includes 11 review articles that provide an overview of the state of the art of omics-based approaches in the field of microbial ecology, with particular emphasis on the interpretation of omics data, environmental pollution tracking, interactions in microbiomes, and viral ecology.
|
443
|
Prokaryotic DNA methylation and its functional roles. J Microbiol 2021; 59:242-248. [DOI: 10.1007/s12275-021-0674-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/21/2020] [Revised: 01/29/2021] [Accepted: 02/01/2021] [Indexed: 12/31/2022]
|
444
|
Huo W, Ling W, Wang Z, Li Y, Zhou M, Ren M, Li X, Li J, Xia Z, Liu X, Huang X. Miniaturized DNA Sequencers for Personal Use: Unreachable Dreams or Achievable Goals. Frontiers in Nanotechnology 2021. [DOI: 10.3389/fnano.2021.628861] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/21/2023] Open
Abstract
The appearance of next-generation sequencing technology, which features short read lengths with high measurement throughput and low cost, has revolutionized the fields of life science, medicine, and even computer science. The subsequent development of third-generation sequencing technologies, represented by nanopore and zero-mode waveguide techniques, offers even higher speed and long read lengths, with promising applications in portable and rapid genomic tests in the field. Especially under the current circumstances, issues such as public health emergencies and global pandemics impose soaring demand for quick identification of the origins and species of analytes through DNA sequences. In addition, future development of disease diagnosis, treatment, and tracking techniques may also require frequent DNA testing. As a result, miniaturized DNA sequencers with highly integrated components for personal and portable use, built to meet increasing needs for disease prevention, personal medicine, and biohazard protection, may become a future trend. As with many other biological and medical analytical systems that were originally bulky, collaborative work across engineering and science disciplines eventually leads to the miniaturization of these systems. DNA sequencers that involve nanoprobes, detectors, microfluidics, microelectronics, and circuits as well as complex functional materials and structures are extremely complicated but may be miniaturized with technical advancement. This paper reviews the state-of-the-art technology in developing essential components of DNA sequencers and analyzes the feasibility of achieving miniaturized DNA sequencers for personal use. Future perspectives on the opportunities and associated challenges for compact DNA sequencers are also identified.
|
445
|
The extrachromosomal elements of the Naegleria genus: How little we know. Plasmid 2021; 115:102567. [PMID: 33617907 DOI: 10.1016/j.plasmid.2021.102567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2020] [Revised: 02/05/2021] [Accepted: 02/10/2021] [Indexed: 11/20/2022]
Abstract
There are currently 47 characterized species in the Naegleria genus of free-living amoebae. Each amoeba has thousands of extrachromosomal elements that are closed circular structures comprised of a single ribosomal DNA (rDNA) copy and a large non-rDNA sequence. Despite the presence of putative open reading frames and introns, ribosomal RNA is the only established transcript. A single origin of DNA replication (ori) has been mapped within the non-rDNA sequence for one species (N. gruberi), a finding that strongly indicates that these episomes replicate independently of the cell's chromosomal DNA component. This article reviews that which has been published about these interesting DNA elements and by analyzing available sequence data, discusses the possibility that different phylogenetically related clusters of Naegleria species individually conserve ori structures and suggests where the rRNA promoter and termination sites may be located.
|
446
|
Amarasinghe SL, Ritchie ME, Gouil Q. long-read-tools.org: an interactive catalogue of analysis methods for long-read sequencing data. Gigascience 2021; 10:6137723. [PMID: 33590862 PMCID: PMC7931822 DOI: 10.1093/gigascience/giab003] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 09/04/2020] [Revised: 12/21/2020] [Accepted: 01/13/2021] [Indexed: 01/01/2023] Open
Abstract
Background: The data produced by long-read third-generation sequencers have unique characteristics compared to short-read sequencing data, often requiring tailored analysis tools for tasks ranging from quality control to downstream processing. The rapid growth in software that addresses these challenges for different genomics applications is difficult to keep track of, which makes it hard for users to choose the most appropriate tool for their analysis goal and for developers to identify areas of need and existing solutions to benchmark against. Findings: We describe the implementation of long-read-tools.org, an open-source database that organizes the rapidly expanding collection of long-read data analysis tools and allows its exploration through interactive browsing and filtering. The current database release contains 478 tools across 32 categories. Most tools are developed in Python, and the most frequent analysis tasks include base calling, de novo assembly, error correction, quality checking/filtering, and isoform detection, while long-read single-cell data analysis and transcriptomics are areas with the fewest tools available. Conclusion: Continued growth in the application of long-read sequencing in genomics research positions the long-read-tools.org database as an essential resource that allows researchers to keep abreast of both established and emerging software to help guide the selection of the most relevant tool for their analysis needs.
Affiliation(s)
- Shanika L Amarasinghe
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052, Australia; Department of Medical Biology, The University of Melbourne, 1G Royal Parade, Parkville, VIC 3052, Australia
| | - Matthew E Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052, Australia; Department of Medical Biology, The University of Melbourne, 1G Royal Parade, Parkville, VIC 3052, Australia; School of Mathematics and Statistics, The University of Melbourne, 813 Swanston Street, Parkville, VIC 3010, Australia
| | - Quentin Gouil
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052, Australia; Department of Medical Biology, The University of Melbourne, 1G Royal Parade, Parkville, VIC 3052, Australia
| |
|
447
|
Caspar SM, Schneider T, Stoll P, Meienberg J, Matyas G. Potential of whole-genome sequencing-based pharmacogenetic profiling. Pharmacogenomics 2021; 22:177-190. [PMID: 33517770 DOI: 10.2217/pgs-2020-0155] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 02/07/2023] Open
Abstract
Pharmacogenetics represents a major driver of precision medicine, promising individualized drug selection and dosing. Traditionally, pharmacogenetic profiling has been performed using targeted genotyping that focuses on common/known variants. Recently, whole-genome sequencing (WGS) is emerging as a more comprehensive short-read next-generation sequencing approach, enabling both gene diagnostics and pharmacogenetic profiling, including rare/novel variants, in a single assay. Using the example of the pharmacogene CYP2D6, we demonstrate the potential of WGS-based pharmacogenetic profiling as well as emphasize the limitations of short-read next-generation sequencing. In the near future, we envision a shift toward long-read sequencing as the predominant method for gene diagnostics and pharmacogenetic profiling, providing unprecedented data quality and improving patient care.
Affiliation(s)
- Sylvan Manuel Caspar
- Center for Cardiovascular Genetics & Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich 8952, Switzerland; Department of Health Sciences & Technology, Laboratory of Translational Nutrition Biology, ETH Zurich, Schwerzenbach 8603, Switzerland
| | - Timo Schneider
- Center for Cardiovascular Genetics & Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich 8952, Switzerland
| | - Patricia Stoll
- Center for Cardiovascular Genetics & Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich 8952, Switzerland
| | - Janine Meienberg
- Center for Cardiovascular Genetics & Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich 8952, Switzerland
| | - Gabor Matyas
- Center for Cardiovascular Genetics & Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich 8952, Switzerland; Zurich Center for Integrative Human Physiology, University of Zurich, Zurich 8057, Switzerland
| |
|
448
|
Holley G, Beyter D, Ingimundardottir H, Møller PL, Kristmundsdottir S, Eggertsson HP, Halldorsson BV. Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly. Genome Biol 2021; 22:28. [PMID: 33419473 PMCID: PMC7792008 DOI: 10.1186/s13059-020-02244-4] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 07/17/2020] [Accepted: 12/15/2020] [Indexed: 12/20/2022] Open
Abstract
A major challenge of long-read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate, and indel call accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than a PacBio HiFi read assembly.
Affiliation(s)
- Peter L Møller
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
| | - Snædis Kristmundsdottir
- deCODE genetics/Amgen Inc., Reykjavík, Iceland
- School of Technology, Reykjavik University, Reykjavík, Iceland
| | - Bjarni V Halldorsson
- deCODE genetics/Amgen Inc., Reykjavík, Iceland
- School of Technology, Reykjavik University, Reykjavík, Iceland
| |
|
449
|
The structure, function and evolution of a complete human chromosome 8. Nature 2021; 593:101-107. [PMID: 33828295 PMCID: PMC8099727 DOI: 10.1038/s41586-021-03420-7] [Citation(s) in RCA: 179] [Impact Index Per Article: 59.7] [Received: 09/04/2020] [Accepted: 03/04/2021] [Indexed: 02/07/2023]
Abstract
The complete assembly of each human chromosome is essential for understanding human biology and evolution. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.
|
450
|
Murigneux V, Rai SK, Furtado A, Bruxner TJC, Tian W, Harliwong I, Wei H, Yang B, Ye Q, Anderson E, Mao Q, Drmanac R, Wang O, Peters BA, Xu M, Wu P, Topp B, Coin LJM, Henry RJ. Comparison of long-read methods for sequencing and assembly of a plant genome. Gigascience 2020; 9:giaa146. [PMID: 33347571 PMCID: PMC7751402 DOI: 10.1093/gigascience/giaa146] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 03/13/2020] [Revised: 07/07/2020] [Accepted: 11/22/2020] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND: Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample. RESULTS: Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements. CONCLUSIONS: The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.
Affiliation(s)
- Valentine Murigneux
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Subash Kumar Rai
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Agnelo Furtado
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Timothy J C Bruxner
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Wei Tian
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Ivon Harliwong
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Hanmin Wei
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
| | - Bicheng Yang
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Qianyu Ye
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Ellis Anderson
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Qing Mao
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Radoje Drmanac
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Ou Wang
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
| | - Brock A Peters
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Mengyang Xu
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Qingdao, Building 2, No. 2 Hengyunshan Road, Qingdao 266555, China
| | - Pei Wu
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Tianjin, Airport Business Park, Building E3, Airport Economics Area, Tianjin 300308, China
| | - Bruce Topp
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Lachlan J M Coin
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Department of Microbiology and Immunology, University of Melbourne at The Peter Doherty Institute for Infection and Immunity, 792 Elizabeth Street, Melbourne, VIC 3004, Australia
| | - Robert J Henry
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| |
|