1
Ambrodji A, Sadlon A, Amstutz U, Hoch D, Berger MD, Bastian S, Offer SM, Largiadèr CR. Approach for Phased Sequence-Based Genotyping of the Critical Pharmacogene Dihydropyrimidine Dehydrogenase (DPYD). Int J Mol Sci 2024; 25:7599. [PMID: 39062841 PMCID: PMC11277299 DOI: 10.3390/ijms25147599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Revised: 07/04/2024] [Accepted: 07/08/2024] [Indexed: 07/28/2024]
Abstract
Pre-treatment genotyping of four well-characterized toxicity risk-variants in the dihydropyrimidine dehydrogenase gene (DPYD) has been widely implemented in Europe to prevent serious adverse effects in cancer patients treated with fluoropyrimidines. Current genotyping practices are largely limited to selected commonly studied variants and are unable to determine phasing when more than one variant allele is detected. Recent evidence indicates that common DPYD variants modulate the functional impact of deleterious variants in a phase-dependent manner, where a cis- or a trans-configuration translates into different toxicity risks and dosing recommendations. DPYD is a large gene with 23 exons spanning nearly a mega-base of DNA, making it a challenging candidate for full-gene sequencing in the diagnostic setting. Herein, we present a time- and cost-efficient long-read sequencing approach for capturing the complete coding region of DPYD. We demonstrate that this method can reliably produce phased genotypes, overcoming a major limitation with current methods. This method was validated using 21 subjects, including two cancer patients, each of whom carried multiple DPYD variants. Genotype assignments showed complete concordance with conventional approaches. Furthermore, we demonstrate that the method is robust to technical challenges inherent in long-range sequencing of PCR products, including reference alignment bias and PCR chimerism.
Affiliation(s)
- Alisa Ambrodji
  - Department of Clinical Chemistry, Inselspital, University Hospital of Bern, University of Bern, INO-F, 3010 Bern, Switzerland; (A.A.); (A.S.); (U.A.)
  - Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012 Bern, Switzerland
- Angélique Sadlon
  - Department of Clinical Chemistry, Inselspital, University Hospital of Bern, University of Bern, INO-F, 3010 Bern, Switzerland; (A.A.); (A.S.); (U.A.)
- Ursula Amstutz
  - Department of Clinical Chemistry, Inselspital, University Hospital of Bern, University of Bern, INO-F, 3010 Bern, Switzerland; (A.A.); (A.S.); (U.A.)
- Dennis Hoch
  - Department of Medical Oncology, Inselspital, University Hospital of Bern, 3010 Bern, Switzerland; (D.H.); (M.D.B.)
- Martin D. Berger
  - Department of Medical Oncology, Inselspital, University Hospital of Bern, 3010 Bern, Switzerland; (D.H.); (M.D.B.)
- Sara Bastian
  - Department of Medical Oncology, Cantonal Hospital Graubünden, 7000 Chur, Switzerland
- Steven M. Offer
  - Department of Pathology, Carver College of Medicine, University of Iowa, Iowa City, IA 52242, USA
- Carlo R. Largiadèr
  - Department of Clinical Chemistry, Inselspital, University Hospital of Bern, University of Bern, INO-F, 3010 Bern, Switzerland; (A.A.); (A.S.); (U.A.)
2
Lassen FH, Venkatesh SS, Baya N, Hill B, Zhou W, Bloemendal A, Neale BM, Kessler BM, Whiffin N, Lindgren CM, Palmer DS. Exome-wide evidence of compound heterozygous effects across common phenotypes in the UK Biobank. Cell Genomics 2024; 4:100602. [PMID: 38944039 DOI: 10.1016/j.xgen.2024.100602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 03/11/2024] [Accepted: 06/07/2024] [Indexed: 07/01/2024]
Abstract
The phenotypic impact of compound heterozygous (CH) variation has not been investigated at the population scale. We phased rare variants (MAF ∼0.001%) in the UK Biobank (UKBB) exome-sequencing data to characterize recessive effects in 175,587 individuals across 311 common diseases. A total of 6.5% of individuals carry putatively damaging CH variants, 90% of which are only identifiable upon phasing rare variants (MAF < 0.38%). We identify six recessive gene-trait associations (p < 1.68 × 10⁻⁷) after accounting for relatedness, polygenicity, nearby common variants, and rare variant burden. Of these, just one is discovered when considering homozygosity alone. Using longitudinal health records, we additionally identify and replicate a novel association between bi-allelic variation in ATP2C2 and an earlier age at onset of chronic obstructive pulmonary disease (COPD) (p < 3.58 × 10⁻⁸). Genetic phase contributes to disease risk for gene-trait pairs: ATP2C2-COPD (p = 0.000238), FLG-asthma (p = 0.00205), and USH2A-visual impairment (p = 0.0084). We demonstrate the power of phasing large-scale genetic cohorts to discover phenome-wide consequences of compound heterozygosity.
Affiliation(s)
- Frederik H Lassen
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Samvida S Venkatesh
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Nikolas Baya
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Barney Hill
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Wei Zhou
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Analytical and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Alex Bloemendal
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Novo Nordisk Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Benjamin M Neale
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Analytical and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Benedikt M Kessler
  - Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Nicola Whiffin
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cecilia M Lindgren
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Duncan S Palmer
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, UK
3
Lai J, Yang Y, Liu Y, Scharpf RB, Karchin R. Assessing the merits: an opinion on the effectiveness of simulation techniques in tumor subclonal reconstruction. Bioinformatics Advances 2024; 4:vbae094. [PMID: 38948008 PMCID: PMC11213631 DOI: 10.1093/bioadv/vbae094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 05/28/2024] [Accepted: 06/15/2024] [Indexed: 07/02/2024]
Abstract
Summary: Neoplastic tumors originate from a single cell, and their evolution can be traced through lineages characterized by mutations, copy number alterations, and structural variants. These lineages are reconstructed and mapped onto evolutionary trees with algorithmic approaches. However, without ground truth benchmark sets, the validity of an algorithm remains uncertain, limiting potential clinical applicability. With a growing number of algorithms available, there is urgent need for standardized benchmark sets to evaluate their merits. Benchmark sets rely on in silico simulations of tumor sequence, but there are no accepted standards for simulation tools, presenting a major obstacle to progress in this field. Availability and implementation: All analysis done in the paper was based on publicly available data from the publication of each accessed tool.
Affiliation(s)
- Jiaying Lai
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, United States
- Yi Yang
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, United States
- Yunzhou Liu
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, United States
- Robert B Scharpf
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21231, United States
  - Department of Oncology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, United States
- Rachel Karchin
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, United States
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21231, United States
  - Department of Oncology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, United States
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, United States
4
Mlakar V, Dupanloup I, Gloor Y, Ansari M. Haplotype Inference Using Long-Read Nanopore Sequencing: Application to GSTA1 Promoter. Mol Biotechnol 2024:10.1007/s12033-024-01213-7. [PMID: 38886308 DOI: 10.1007/s12033-024-01213-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 06/03/2024] [Indexed: 06/20/2024]
Abstract
Recovering true haplotypes can have important clinical consequences. The laboratory process is difficult and is, therefore, most often done through inference. In this paper, we show that when using the Oxford Nanopore sequencing technology, we could recover the true haplotypes of the GSTA1 promoter region. Eight LCL cell lines with potentially ambiguous haplotypes were used to characterize the efficacy of Oxford Nanopore sequencing to phase the correct GSTA1 promoter haplotypes. The results were compared to Sanger sequencing and inferred haplotypes in the 1000 Genomes Project. The average read length was 813 bp out of a total PCR length of 1336 bp. The best coverage of sequencing was in the middle of the PCR product and decreased to 50% at the PCR ends. SNPs separated by less than 200 bp showed > 90% of correct haplotypes, while at the distance of 1089 bp, this proportion still exceeded 58%. The number of cycles, but not the extension or annealing time, influences the generation of hybrid haplotypes. The results demonstrate that this long-read sequencing methodology can accurately determine the haplotypes without the need for inference. The technology proved to be robust, but the success of phasing nonetheless depends on the distances and frequencies of SNPs.
Affiliation(s)
- Vid Mlakar
  - CANSEARCH Research Laboratory, Geneva University Medical School, Rue Michel Servet 1, 1211, Geneva, Switzerland
- Isabelle Dupanloup
  - CANSEARCH Research Laboratory, Geneva University Medical School, Rue Michel Servet 1, 1211, Geneva, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Yvonne Gloor
  - CANSEARCH Research Laboratory, Geneva University Medical School, Rue Michel Servet 1, 1211, Geneva, Switzerland
- Marc Ansari
  - CANSEARCH Research Laboratory, Geneva University Medical School, Rue Michel Servet 1, 1211, Geneva, Switzerland
  - Onco-Hematology Unit, Pediatric Department, Geneva University Hospital, Rue Willy-Donzé 6, 1205, Geneva, Switzerland
5
Shelton WJ, Zandpazandi S, Nix JS, Gokden M, Bauer M, Ryan KR, Wardell CP, Vaske OM, Rodriguez A. Long-read sequencing for brain tumors. Front Oncol 2024; 14:1395985. [PMID: 38915364 PMCID: PMC11194609 DOI: 10.3389/fonc.2024.1395985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 05/27/2024] [Indexed: 06/26/2024]
Abstract
Brain tumors and genomics have a long-standing history given that glioblastoma was the first cancer studied by The Cancer Genome Atlas. The numerous and continuous advances through the decades in sequencing technologies have aided in the advanced molecular characterization of brain tumors for diagnosis, prognosis, and treatment. Since the implementation of molecular biomarkers by the WHO CNS in 2016, the genomics of brain tumors has been integrated into diagnostic criteria. Long-read sequencing, also known as third-generation sequencing, is an emerging technique that allows for the sequencing of longer DNA segments, leading to improved detection of structural variants and epigenetics. These capabilities are opening a way for better characterization of brain tumors. Here, we present a comprehensive summary of the state of the art of third-generation sequencing in the application for brain tumor diagnosis, prognosis, and treatment. We discuss the advantages and potential new implementations of long-read sequencing into clinical paradigms for neuro-oncology patients.
Affiliation(s)
- William J. Shelton
  - Department of Neurosurgery, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Sara Zandpazandi
  - Department of Neurosurgery, Medical University of South Carolina, Charleston, SC, United States
- J Stephen Nix
  - Department of Pathology, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Murat Gokden
  - Department of Pathology, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Michael Bauer
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Katie Rose Ryan
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Christopher P. Wardell
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Olena Morozova Vaske
  - Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, CA, United States
- Analiz Rodriguez
  - Department of Neurosurgery, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
6
Cho WK. Commentary on "Long-read next-generation sequencing for molecular diagnosis of pediatric endocrine disorders". Ann Pediatr Endocrinol Metab 2024; 29:141. [PMID: 38956750 PMCID: PMC11220394 DOI: 10.6065/apem.24224014edi03] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Affiliation(s)
- Won Kyoung Cho
  - Department of Pediatrics, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
7
Matoute A, Maestri S, Saout M, Laghoe L, Simon S, Blanquart H, Hernandez Martinez MA, Pierre Demar M. Meat-Borne-Parasite: A Nanopore-Based Meta-Barcoding Work-Flow for Parasitic Microbiodiversity Assessment in the Wild Fauna of French Guiana. Curr Issues Mol Biol 2024; 46:3810-3821. [PMID: 38785505 PMCID: PMC11119736 DOI: 10.3390/cimb46050237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 03/06/2024] [Accepted: 04/03/2024] [Indexed: 05/25/2024]
Abstract
French Guiana, located in the Guiana Shield, is a natural reservoir for many zoonotic pathogens that are of considerable medical or veterinary importance. Until now, there has been limited data available on the description of parasites circulating in this area, especially on protozoa belonging to the phylum Apicomplexa; conversely, the neighbouring countries describe a high parasitic prevalence in animals and humans. Epidemiological surveillance is necessary, as new potentially virulent strains may emerge from these forest ecosystems, such as Amazonian toxoplasmosis. However, there is no standard tool for detecting protozoa in wildlife. In this study, we developed Meat-Borne-Parasite, a high-throughput meta-barcoding workflow for detecting Apicomplexa based on the Oxford Nanopore Technologies sequencing platform using the 18S gene of 14 Apicomplexa positive samples collected in French Guiana. Sequencing reads were then analysed with the MetONTIIME pipeline. Thanks to a scoring rule, we were able to classify 10 samples out of 14 as Apicomplexa positive and reveal the presence of co-carriages. The same samples were also sequenced with the Illumina platform for validation purposes. For samples identified as Apicomplexa positive by both platforms, a strong positive correlation at up to the genus level was reported. Overall, the presented workflow represents a reliable method for Apicomplexa detection, which may pave the way for more comprehensive biomonitoring of zoonotic pathogens.
Affiliation(s)
- Adria Matoute
  - Tropical Biome and Immunopathophysiology (TBIP), Université de Guyane, 97300 Cayenne, France; (A.M.); (S.M.); (M.S.); (L.L.); (S.S.)
  - U1019-UMR 9017-CIIL-Center for Infection and Immunity of Lille, Institut Pasteur de Lille, CHU Lille, INSERM, CNRS, Université Lille, 59000 Lille, France
- Simone Maestri
  - Tropical Biome and Immunopathophysiology (TBIP), Université de Guyane, 97300 Cayenne, France; (A.M.); (S.M.); (M.S.); (L.L.); (S.S.)
- Mona Saout
  - Tropical Biome and Immunopathophysiology (TBIP), Université de Guyane, 97300 Cayenne, France; (A.M.); (S.M.); (M.S.); (L.L.); (S.S.)
  - U1019-UMR 9017-CIIL-Center for Infection and Immunity of Lille, Institut Pasteur de Lille, CHU Lille, INSERM, CNRS, Université Lille, 59000 Lille, France
- Laure Laghoe
  - Tropical Biome and Immunopathophysiology (TBIP), Université de Guyane, 97300 Cayenne, France; (A.M.); (S.M.); (M.S.); (L.L.); (S.S.)
  - U1019-UMR 9017-CIIL-Center for Infection and Immunity of Lille, Institut Pasteur de Lille, CHU Lille, INSERM, CNRS, Université Lille, 59000 Lille, France
- Stéphane Simon
  - Tropical Biome and Immunopathophysiology (TBIP), Université de Guyane, 97300 Cayenne, France; (A.M.); (S.M.); (M.S.); (L.L.); (S.S.)
  - U1019-UMR 9017-CIIL-Center for Infection and Immunity of Lille, Institut Pasteur de Lille, CHU Lille, INSERM, CNRS, Université Lille, 59000 Lille, France
- Miguel Angel Hernandez Martinez
  - Laboratoire Associé du CNR Leishmaniose, Laboratoire Hospitalo-Universitaire de Parasitologie et Mycologie, Centre Hospitalier Andrée Rosemon, 97300 Cayenne, France
- Magalie Pierre Demar
  - Tropical Biome and Immunopathophysiology (TBIP), Université de Guyane, 97300 Cayenne, France; (A.M.); (S.M.); (M.S.); (L.L.); (S.S.)
  - U1019-UMR 9017-CIIL-Center for Infection and Immunity of Lille, Institut Pasteur de Lille, CHU Lille, INSERM, CNRS, Université Lille, 59000 Lille, France
  - Laboratoire Associé du CNR Leishmaniose, Laboratoire Hospitalo-Universitaire de Parasitologie et Mycologie, Centre Hospitalier Andrée Rosemon, 97300 Cayenne, France
8
Heath H, Peng S, Szmatola T, Ryan S, Bellone R, Kalbfleisch T, Petersen J, Finno C. A Comprehensive Allele Specific Expression Resource for the Equine Transcriptome. Research Square 2024:rs.3.rs-4182812. [PMID: 38645140 PMCID: PMC11030527 DOI: 10.21203/rs.3.rs-4182812/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Background: Allele-specific expression (ASE) analysis provides a nuanced view of cis-regulatory mechanisms affecting gene expression. Results: An equine ASE analysis was performed, using integrated Iso-seq and short-read RNA sequencing data from four healthy Thoroughbreds (2 mares and 2 stallions) across 9 tissues from the Functional Annotation of Animal Genomes (FAANG) project. Allele expression was quantified by haplotypes from long-read data, with 42,900 allele expression events compared. Within these events, 635 (1.48%) demonstrated ASE, with liver tissue containing the highest proportion. Genetic variants within ASE events were in histone modified regions 64.2% of the time. Validation of allele-specific variants, using a set of 66 equine liver samples from multiple breeds, confirmed that 97% of variants demonstrated ASE. Conclusions: This valuable publicly accessible resource is poised to facilitate investigations into regulatory variation in equine tissues. Our results highlight the tissue-specific nature of allelic imbalance in the equine genome.
9
Lai J, Liu Y, Scharpf RB, Karchin R. Evaluation of simulation methods for tumor subclonal reconstruction. arXiv 2024:arXiv:2402.09599v1. [PMID: 38410652 PMCID: PMC10896360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
Most neoplastic tumors originate from a single cell, and their evolution can be genetically traced through lineages characterized by common alterations such as small somatic mutations (SSMs), copy number alterations (CNAs), structural variants (SVs), and aneuploidies. Due to the complexity of these alterations in most tumors and the errors introduced by sequencing protocols and calling algorithms, tumor subclonal reconstruction algorithms are necessary to recapitulate the DNA sequence composition and tumor evolution in silico. With a growing number of these algorithms available, there is a pressing need for consistent and comprehensive benchmarking, which relies on realistic tumor sequencing generated by simulation tools. Here, we examine the current simulation methods, identifying their strengths and weaknesses, and provide recommendations for their improvement. Our review also explores potential new directions for research in this area. This work aims to serve as a resource for understanding and enhancing tumor genomic simulations, contributing to the advancement of the field.
Affiliation(s)
- Jiaying Lai
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
- Yunzhou Liu
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
- Robert B. Scharpf
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD
  - Department of Oncology, Johns Hopkins Medical Institutions, Baltimore, MD
- Rachel Karchin
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD
  - Department of Oncology, Johns Hopkins Medical Institutions, Baltimore, MD
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD
10
Heath HD, Peng S, Szmatola T, Bellone RR, Kalbfleisch T, Petersen JL, Finno CJ. A Comprehensive Allele Specific Expression Resource for the Equine Transcriptome. bioRxiv 2024:2023.12.31.573798. [PMID: 38260378 PMCID: PMC10802363 DOI: 10.1101/2023.12.31.573798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Background: Allele-specific expression (ASE) analysis provides a nuanced view of cis-regulatory mechanisms affecting gene expression. Results: In this work, we introduce and highlight the significance of an equine ASE analysis, containing integrated long- and short-read RNA sequencing data, along with insight from histone modification data, from four healthy Thoroughbreds (2 mares and 2 stallions) across 9 tissues. Conclusions: This valuable publicly accessible resource is poised to facilitate investigations into regulatory variation in equine tissues and foster a deeper understanding of the impact of allelic imbalance in equine health and disease at the molecular level.
11
Tarquini G, Maestri S, Ermacora P, Martini M. The Oxford Nanopore MinION as a Versatile Technology for the Diagnosis and Characterization of Emerging Plant Viruses. Methods Mol Biol 2024; 2732:235-249. [PMID: 38060129 DOI: 10.1007/978-1-0716-3515-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
The emergence of novel viral epidemics that could affect major crops represents a serious threat to global food security. The early and accurate identification of the causative viral agent is the most important step for a rapid and effective response to disease outbreaks. Over the last years, the Oxford Nanopore Technologies (ONT) MinION sequencer has been proposed as an effective diagnostic tool for the early detection and identification of emerging viruses in plants, providing many advantages compared with different high-throughput sequencing (HTS) technologies. Here, we provide a step-by-step protocol that we optimized to obtain the virome of "Lamon bean" plants (Phaseolus vulgaris L.), an agricultural product with Protected Geographical Indication (PGI) in the North-East of Italy, which is frequently subjected to multiple infections caused by different RNA viruses. The conversion of viral RNA into ds-cDNA enabled the use of the Genomic DNA Ligation Sequencing Kit and the Native Barcoding DNA Kit, which were originally developed for DNA sequencing. This allowed the simultaneous diagnosis of both DNA- and RNA-based pathogens, providing a more versatile alternative to the use of direct RNA and/or direct cDNA sequencing kits.
Affiliation(s)
- Giulia Tarquini
  - Department of Agriculture, Food, Environmental and Animal Sciences, University of Udine, Udine, Italy
- Simone Maestri
  - Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia, Milano, Italy
- Paolo Ermacora
  - Department of Agriculture, Food, Environmental and Animal Sciences, University of Udine, Udine, Italy
- Marta Martini
  - Department of Agriculture, Food, Environmental and Animal Sciences, University of Udine, Udine, Italy
12
Guo MH, Francioli LC, Stenton SL, Goodrich JK, Watts NA, Singer-Berk M, Groopman E, Darnowsky PW, Solomonson M, Baxter S, Tiao G, Neale BM, Hirschhorn JN, Rehm HL, Daly MJ, O'Donnell-Luria A, Karczewski KJ, MacArthur DG, Samocha KE. Inferring compound heterozygosity from large-scale exome sequencing data. Nat Genet 2024; 56:152-161. [PMID: 38057443 PMCID: PMC10872287 DOI: 10.1038/s41588-023-01608-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/24/2023] [Accepted: 11/08/2023] [Indexed: 12/08/2023]
Abstract
Recessive diseases arise when both copies of a gene are impacted by a damaging genetic variant. When a patient carries two potentially causal variants in a gene, accurate diagnosis requires determining that these variants occur on different copies of the chromosome (that is, are in trans) rather than on the same copy (that is, in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. Here we developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in the Genome Aggregation Database (v2, n = 125,748 exomes). Our approach estimates phase with 96% accuracy, both in trio data and in patients with Mendelian conditions and presumed causal compound heterozygous variants. We provide a public resource of phasing estimates for coding variants and counts per gene of rare variants in trans that can aid interpretation of rare co-occurring variants in the context of recessive disease.
Affiliation(s)
- Michael H Guo
  - Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Laurent C Francioli
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Sarah L Stenton
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Julia K Goodrich
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Nicholas A Watts
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Moriel Singer-Berk
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Emily Groopman
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Philip W Darnowsky
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Matthew Solomonson
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Samantha Baxter
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Grace Tiao
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Benjamin M Neale
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Joel N Hirschhorn
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Departments of Genetics and Pediatrics, Harvard Medical School, Boston, MA, USA
  - Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA
  - Center for Basic and Translational Obesity Research, Boston Children's Hospital, Boston, MA, USA
- Heidi L Rehm
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Mark J Daly
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Institute for Molecular Medicine Finland (FIMM), Helsinki, Finland
- Anne O'Donnell-Luria
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Konrad J Karczewski
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Daniel G MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
| | - Kaitlin E Samocha
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
|
13
|
Dunn T, Narayanasamy S. vcfdist: accurately benchmarking phased small variant calls in human genomes. Nat Commun 2023; 14:8149. [PMID: 38071244 PMCID: PMC10710436 DOI: 10.1038/s41467-023-43876-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Accepted: 11/22/2023] [Indexed: 12/18/2023] Open
Abstract
Accurately benchmarking small variant calling accuracy is critical for the continued improvement of human whole genome sequencing. In this work, we show that current variant calling evaluations are biased towards certain variant representations and may misrepresent the relative performance of different variant calling pipelines. We propose solutions, first exploring the affine gap parameter design space for complex variant representation and suggesting a standard. Next, we present our tool vcfdist and demonstrate the importance of enforcing local phasing for evaluation accuracy. We then introduce the notion of partial credit for mostly-correct calls and present an algorithm for clustering dependent variants. Lastly, we motivate using alignment distance metrics to supplement precision-recall curves for understanding variant calling performance. We evaluate the performance of 64 phased Truth Challenge V2 submissions and show that vcfdist improves measured insertion and deletion performance consistency across variant representations from R² = 0.97243 for baseline vcfeval to 0.99996 for vcfdist.
Affiliation(s)
- Tim Dunn
- Computer Science and Engineering, University of Michigan, 2260 Hayward Street, Ann Arbor, MI, 48109, USA
| | - Satish Narayanasamy
- Computer Science and Engineering, University of Michigan, 2260 Hayward Street, Ann Arbor, MI, 48109, USA
|
14
|
Guo MH, Francioli LC, Stenton SL, Goodrich JK, Watts NA, Singer-Berk M, Groopman E, Darnowsky PW, Solomonson M, Baxter S, Tiao G, Neale BM, Hirschhorn JN, Rehm HL, Daly MJ, O’Donnell-Luria A, Karczewski KJ, MacArthur DG, Samocha KE. Inferring compound heterozygosity from large-scale exome sequencing data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.19.533370. [PMID: 36993580 PMCID: PMC10055215 DOI: 10.1101/2023.03.19.533370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2023]
Abstract
Recessive diseases arise when both the maternal and the paternal copies of a gene are impacted by a damaging genetic variant in the affected individual. When a patient carries two different potentially causal variants in a gene for a given disorder, accurate diagnosis requires determining that these two variants occur on different copies of the chromosome (i.e., are in trans) rather than on the same copy (i.e., in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. We developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in exome sequencing data from the Genome Aggregation Database (gnomAD v2, n = 125,748). When applied to trio data where phase can be determined by transmission, our approach estimates phase with 95.7% accuracy and remains accurate even for very rare variants (allele frequency < 1 × 10⁻⁴). We also correctly phase 95.9% of variant pairs in a set of 293 patients with Mendelian conditions carrying presumed causal compound heterozygous variants. We provide a public resource of phasing estimates from gnomAD, including phasing estimates for coding variants across the genome and counts per gene of rare variants in trans, that can aid interpretation of rare co-occurring variants in the context of recessive disease.
Affiliation(s)
- Michael H. Guo
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Laurent C. Francioli
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Sarah L. Stenton
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
| | - Julia K. Goodrich
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Nicholas A. Watts
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Moriel Singer-Berk
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Emily Groopman
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
| | - Philip W. Darnowsky
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Matthew Solomonson
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Samantha Baxter
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Grace Tiao
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Benjamin M. Neale
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Joel N. Hirschhorn
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Division of Endocrinology, Boston Children’s Hospital, Boston, MA, USA
- Center for Basic and Translational Obesity Research, Boston Children’s Hospital, Boston, MA, USA
| | - Heidi L. Rehm
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Mark J. Daly
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Institute for Molecular Medicine Finland (FIMM), Helsinki, Finland
| | - Anne O’Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Konrad J. Karczewski
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Daniel G. MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Centre for Population Genomics, Garvan Institute of Medical Research, UNSW Sydney, Sydney, Australia
- Centre for Population Genomics, Murdoch Children’s Research Institute, Melbourne, Australia
| | - Kaitlin E. Samocha
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
|
15
|
Calvo-Roitberg E, Daniels RF, Pai AA. Challenges in identifying mRNA transcript starts and ends from long-read sequencing data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.26.550536. [PMID: 37546743 PMCID: PMC10402045 DOI: 10.1101/2023.07.26.550536] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 08/08/2023]
Abstract
Long-read sequencing (LRS) technologies have the potential to revolutionize scientific discoveries in RNA biology, especially by enabling the comprehensive identification and quantification of full-length mRNA isoforms. However, inherently high error rates make the analysis of long-read sequencing data challenging. While these error rates have been characterized for sequence and splice site identification, it is still unclear how accurately LRS reads represent transcript start and end sites. Here, we systematically assess the variability and accuracy of mRNA terminal ends identified by LRS reads across multiple sequencing platforms. We find substantial inconsistencies in both the start and end coordinates of LRS reads spanning a gene, such that LRS reads often fail to accurately recapitulate annotated or empirically derived terminal ends of mRNA molecules. To address this challenge, we introduce an approach to condition reads based on empirically derived terminal ends and identify a subset of reads that are more likely to represent full-length transcripts. Our approach can improve transcriptome analyses by enhancing the fidelity of transcript terminal end identification, but may result in lower power to quantify genes or discover novel isoforms. Thus, it is necessary to be cautious when selecting sequencing approaches and/or interpreting data from long-read RNA sequencing.
Affiliation(s)
| | - Rachel F Daniels
- RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA
| | - Athma A Pai
- RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA
|
16
|
Lassen FH, Venkatesh SS, Baya N, Zhou W, Bloemendal A, Neale BM, Kessler BM, Whiffin N, Lindgren CM, Palmer DS. Exome-wide evidence of compound heterozygous effects across common phenotypes in the UK Biobank. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.06.29.23291992. [PMID: 37461573 PMCID: PMC10350147 DOI: 10.1101/2023.06.29.23291992] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/27/2023]
Abstract
Exome-sequencing association studies have successfully linked rare protein-coding variation to risk of thousands of diseases. However, the relationship between rare deleterious compound heterozygous (CH) variation and its phenotypic impact has not been fully investigated. Here, we leverage advances in statistical phasing to accurately phase rare variants (MAF ~ 0.001%) in exome sequencing data from 175,587 UK Biobank (UKBB) participants, which we then systematically annotate to identify putatively deleterious CH coding variation. We show that 6.5% of individuals carry such damaging variants in the CH state, with 90% of variants occurring at MAF < 0.34%. Using a logistic mixed model framework, systematically accounting for relatedness, polygenic risk, nearby common variants, and rare variant burden, we investigate recessive effects in common complex diseases. We find six exome-wide significant (P < 1.68 × 10⁻⁷) and 17 nominally significant (P < 5.25 × 10⁻⁵) gene-trait associations. Among these, only four would have been identified without accounting for CH variation in the gene. We further incorporate age-at-diagnosis information from primary care electronic health records to show that genetic phase influences lifetime risk of disease across 20 gene-trait combinations (FDR < 5%). Using a permutation approach, we find evidence for genetic phase contributing to disease susceptibility for a collection of gene-trait pairs, including FLG-asthma (P = 0.00205) and USH2A-visual impairment (P = 0.0084). Taken together, we demonstrate the utility of phasing large-scale genetic sequencing cohorts for robust identification of the phenome-wide consequences of compound heterozygosity.
Affiliation(s)
- Frederik H. Lassen
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
| | - Samvida S. Venkatesh
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
| | - Nikolas Baya
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
| | - Wei Zhou
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytical and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Alex Bloemendal
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Novo Nordisk Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Benjamin M. Neale
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytical and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Benedikt M. Kessler
- Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | - Nicola Whiffin
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Cecilia M. Lindgren
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
| | - Duncan S. Palmer
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
|
17
|
CRISPR/Cas9-Mediated Enrichment Coupled to Nanopore Sequencing Provides a Valuable Tool for the Precise Reconstruction of Large Genomic Target Regions. Int J Mol Sci 2023; 24:ijms24021076. [PMID: 36674592 PMCID: PMC9863143 DOI: 10.3390/ijms24021076] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/30/2022] [Revised: 12/23/2022] [Accepted: 12/24/2022] [Indexed: 01/09/2023] Open
Abstract
Complete and accurate identification of genetic variants associated with specific phenotypes can be challenging when there is a high level of genomic divergence between individuals in a study and the corresponding reference genome. We applied Cas9-mediated enrichment coupled to nanopore sequencing to perform a targeted de novo assembly and accurately reconstruct a genomic region of interest. This approach was used to reconstruct a 250-kbp target region on chromosome 5 of the common bean genome (Phaseolus vulgaris) associated with the shattering phenotype. Comparing a non-shattering cultivar (Midas) with the reference genome revealed many single-nucleotide variants and structural variants in this region. We cut five 50-kbp tiled sub-regions of Midas genomic DNA using Cas9, followed by sequencing on a MinION device and de novo assembly, generating a single contig spanning the whole 250-kbp region. This assembly increased the number of Illumina reads mapping to genes in the region, improving their genotypability for downstream analysis. The Cas9 tiling approach for target enrichment and sequencing is a valuable alternative to whole-genome sequencing for the assembly of ultra-long regions of interest, improving the accuracy of downstream genotype-phenotype association analysis.
|
18
|
Conlin LK, Aref-Eshghi E, McEldrew DA, Luo M, Rajagopalan R. Long-read sequencing for molecular diagnostics in constitutional genetic disorders. Hum Mutat 2022; 43:1531-1544. [PMID: 36086952 PMCID: PMC9561063 DOI: 10.1002/humu.24465] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/23/2022] [Revised: 09/03/2022] [Accepted: 09/06/2022] [Indexed: 11/11/2022]
Abstract
Long-read sequencing (LRS) has been around for more than a decade, but widespread adoption of the technology has been slow due to the perceived high error rates and high sequencing cost. This is changing owing to recent advances that produce highly accurate sequences and to falling costs. LRS promises significant improvement over short-read sequencing in four major areas: (1) better detection of structural variation, (2) better resolution of highly repetitive or nonunique regions, (3) accurate long-range haplotype phasing, and (4) the detection of base modifications natively from the sequencing data. Several successful applications of LRS have demonstrated its ability to resolve molecular diagnoses where short-read sequencing fails to identify a cause. However, the argument for increased diagnostic yield from LRS remains to be validated. Larger cohort studies may be required to establish the realistic boundaries of LRS's clinical utility and analytical validity, as well as the development of standards for clinical applications. We discuss the limitations of the current standard of care, and contrast with the applications and advantages of two major LRS platforms, PacBio and Oxford Nanopore, for molecular diagnostics of constitutional disorders, and present a critical argument about the potential of LRS in diagnostic settings.
Affiliation(s)
- Laura K. Conlin
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
| | - Erfan Aref-Eshghi
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
| | - Deborah A. McEldrew
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
| | - Minjie Luo
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
| | - Ramakrishnan Rajagopalan
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
|
19
|
Nanopore Sequencing for De Novo Bacterial Genome Assembly and Search for Single-Nucleotide Polymorphism. Int J Mol Sci 2022; 23:ijms23158569. [PMID: 35955702 PMCID: PMC9369328 DOI: 10.3390/ijms23158569] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/12/2022] [Revised: 07/28/2022] [Accepted: 07/30/2022] [Indexed: 11/17/2022] Open
Abstract
Nanopore sequencing (ONT) is a new and rapidly developing method for determining nucleotide sequences in DNA and RNA. Compared to next-generation sequencing, it can produce long reads of thousands of nucleotides without assembly or amplification during sequencing. Nanopore sequencing can help determine genetic changes leading to antibiotic resistance. This study presents the application of ONT technology in the assembly of an E. coli genome characterized by a deletion of the tolC gene and known single-nucleotide variations leading to antibiotic resistance, in the absence of a reference genome. We performed benchmark studies to determine the minimum coverage depth needed to obtain a complete genome, depending on the quality of the ONT data. A comparison of existing programs was carried out. It was shown that the Flye program demonstrates plausible assembly results relative to others (Shasta, Canu, and Necat). The required coverage depth for successful assembly strongly depends on read length. When using high-quality samples with an average read length of 8 Kbp or more, a coverage depth of 30× is sufficient to assemble the complete genome de novo and reliably determine single-nucleotide variations in it. For samples with shorter reads, with mean lengths of 2 Kbp, a higher coverage depth of 50× is required. Avoiding mechanical mixing is obligatory during sample preparation. Nanopore sequencing can be used alone to determine the genetic features of antibiotic resistance in bacterial strains.
|
20
|
Grosz BR, Stevanovski I, Negri S, Ellis M, Barnes S, Reddel S, Vucic S, Nicholson GA, Cortese A, Kumar KR, Deveson IW, Kennerson ML. Long read sequencing overcomes challenges in the diagnosis of SORD neuropathy. J Peripher Nerv Syst 2022; 27:120-126. [DOI: 10.1111/jns.12485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Revised: 02/13/2022] [Accepted: 02/22/2022] [Indexed: 10/19/2022]
Affiliation(s)
- Bianca R Grosz
- Northcott Neuroscience Laboratory ANZAC Research Institute Concord NSW Australia
| | - Igor Stevanovski
- Kinghorn Centre for Clinical Genomics Garvan Institute of Medical Research Sydney NSW Australia
| | - Sara Negri
- Istituti Clinici Scientifici Maugeri IRCCS Environmental Research Center Pavia Italy
| | - Melina Ellis
- Northcott Neuroscience Laboratory ANZAC Research Institute Concord NSW Australia
- Sydney Medical School University of Sydney Camperdown NSW Australia
| | - Stephanie Barnes
- Sydney Medical School University of Sydney Camperdown NSW Australia
- Department of Neurology Concord Repatriation General Hospital Concord NSW Australia
- Faculty of Medicine University of Notre Dame Sydney Australia
- Department of Neurology Hornsby Ku‐ring‐Gai Hospital Sydney Australia
| | - Stephen Reddel
- Sydney Medical School University of Sydney Camperdown NSW Australia
- Department of Neurology Concord Repatriation General Hospital Concord NSW Australia
| | - Steve Vucic
- Sydney Medical School University of Sydney Camperdown NSW Australia
- Department of Neurology Concord Repatriation General Hospital Concord NSW Australia
| | - Garth A Nicholson
- Sydney Medical School University of Sydney Camperdown NSW Australia
- Department of Neurology Concord Repatriation General Hospital Concord NSW Australia
- Molecular Medicine Laboratory Concord Repatriation General Hospital Concord NSW Australia
| | - Andrea Cortese
- MRC Centre for Neuromuscular Diseases, Department of Neuromuscular Diseases UCL Queen Square Institute of Neurology London UK
- Department of Brain and Behavioral Sciences University of Pavia Pavia Italy
| | - Kishore R Kumar
- Kinghorn Centre for Clinical Genomics Garvan Institute of Medical Research Sydney NSW Australia
- Sydney Medical School University of Sydney Camperdown NSW Australia
- Department of Neurology Concord Repatriation General Hospital Concord NSW Australia
- Molecular Medicine Laboratory Concord Repatriation General Hospital Concord NSW Australia
| | - Ira W Deveson
- Kinghorn Centre for Clinical Genomics Garvan Institute of Medical Research Sydney NSW Australia
- St Vincent’s Clinical School University of New South Wales Sydney NSW Australia
| | - Marina L Kennerson
- Northcott Neuroscience Laboratory ANZAC Research Institute Concord NSW Australia
- Sydney Medical School University of Sydney Camperdown NSW Australia
- Molecular Medicine Laboratory Concord Repatriation General Hospital Concord NSW Australia
|
21
|
Maestri S, Grosso V, Alfano M, Lavezzari D, Piubelli C, Bisoffi Z, Rossato M, Delledonne M. STArS (STrain-Amplicon-Seq), a targeted nanopore sequencing workflow for SARS-CoV-2 diagnostics and genotyping. Biol Methods Protoc 2022; 7:bpac020. [PMID: 36046362 PMCID: PMC9422081 DOI: 10.1093/biomethods/bpac020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 08/12/2022] [Indexed: 11/13/2022] Open
Abstract
Diagnostic tests based on reverse transcription–quantitative polymerase chain reaction (RT–qPCR) are the gold standard approach to detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection from clinical specimens. However, unless specifically optimized, this method is usually unable to recognize the specific viral strain responsible for coronavirus disease 2019, crucial information that is proving increasingly important in relation to virus spread and treatment effectiveness. Even if some RT–qPCR commercial assays are currently being developed for the detection of viral strains, they focus only on single/few genetic variants that may not be sufficient to uniquely identify a specific strain. Therefore, genome sequencing approaches remain the most comprehensive solution for virus genotyping and for recognizing viral strains, but their application is much less widespread due to higher costs. Starting from the well-established ARTIC protocol coupled to nanopore sequencing, in this work, we developed STArS (STrain-Amplicon-Seq), a cost/time-effective sequencing-based workflow for both SARS-CoV-2 diagnostics and genotyping. A set of 10 amplicons was initially selected from the ARTIC tiling panel, to cover: (i) all the main biologically relevant genetic variants located on the Spike gene; (ii) a minimal set of variants to uniquely identify the currently circulating strains; (iii) genomic sites usually amplified by the RT–qPCR method to identify SARS-CoV-2 presence. PCR-amplified clinical samples (both positive and negative for SARS-CoV-2 presence) were pooled together with a serially diluted exogenous amplicon at known concentration and sequenced on a MinION device. Thanks to a scoring rule, STArS was able to accurately classify positive samples in agreement with RT–qPCR results, both at the qualitative and quantitative level. Moreover, the method allowed effective genotyping of strain-specific variants and thus also returned the phylogenetic classification of SARS-CoV-2-positive samples. Thanks to the reduced turnaround time and costs, the proposed approach represents a step towards simplifying the clinical application of sequencing for viral genotyping, hopefully aiding in combatting the global pandemic.
Affiliation(s)
- Simone Maestri
- Department of Biotechnology, University of Verona, 37134 Verona, Italy
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia, 20139 Milano, Italy
| | - Valentina Grosso
- Department of Biotechnology, University of Verona, 37134 Verona, Italy
| | | | - Denise Lavezzari
- Department of Biotechnology, University of Verona, 37134 Verona, Italy
| | - Chiara Piubelli
- Department of Infectious, Tropical Diseases and Microbiology, IRCCS Sacro Cuore Don Calabria Hospital, 37024 Verona, Italy
| | - Zeno Bisoffi
- Department of Infectious, Tropical Diseases and Microbiology, IRCCS Sacro Cuore Don Calabria Hospital, 37024 Verona, Italy
- Department of Diagnostics and Public Health, University of Verona, 37134 Verona, Italy
| | - Marzia Rossato
- Department of Biotechnology, University of Verona, 37134 Verona, Italy
- Genartis srl, 37126 Verona, Italy
| | - Massimo Delledonne
- Department of Biotechnology, University of Verona, 37134 Verona, Italy
- Genartis srl, 37126 Verona, Italy
| |
22
Braicu C. Functional Genomics in Health and Disease. Int J Mol Sci 2021; 22:12944. [PMID: 34884749 PMCID: PMC8657478 DOI: 10.3390/ijms222312944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2021] [Accepted: 11/17/2021] [Indexed: 11/25/2022]
Affiliation(s)
- Cornelia Braicu
  - Research Center for Functional Genomics, Biomedicine and Translational Medicine, Iuliu Hatieganu University of Medicine and Pharmacy, 23 Marinescu Street, 400337 Cluj-Napoca, Romania
23
Bhat JA, Yu D, Bohra A, Ganie SA, Varshney RK. Features and applications of haplotypes in crop breeding. Commun Biol 2021; 4:1266. [PMID: 34737387 PMCID: PMC8568931 DOI: 10.1038/s42003-021-02782-y] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 05/21/2021] [Accepted: 10/09/2021] [Indexed: 12/17/2022]
Abstract
Climate change, with altered pest-disease dynamics and rising abiotic stresses, threatens resource-constrained agricultural production systems worldwide. Genomics-assisted breeding (GAB) approaches have greatly contributed to enhancing crop breeding efficiency and delivering better varieties. The fast-growing capacity and affordability of DNA sequencing have motivated large-scale germplasm sequencing projects, thus opening exciting avenues for mining haplotypes for breeding applications. This review article highlights ways to mine haplotypes and apply them for complex trait dissection and in GAB approaches, including haplotype-GWAS, haplotype-based breeding, and haplotype-assisted genomic selection. Improvement strategies that efficiently deploy superior haplotypes to hasten breeding progress will be key to safeguarding global food security.
Affiliation(s)
- Javaid Akhter Bhat
  - National Center for Soybean Improvement, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Deyue Yu
  - National Center for Soybean Improvement, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Abhishek Bohra
  - Crop Improvement Division, ICAR-Indian Institute of Pulses Research (ICAR-IIPR), Kanpur, India
- Showkat Ahmad Ganie
  - Department of Biotechnology, Visva-Bharati, Santiniketan, 731235, WB, India
- Rajeev K Varshney
  - Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, 502324, India
  - State Agricultural Biotechnology Centre, Centre for Crop & Food Research Innovation, Food Futures Institute, Murdoch University, Murdoch, WA, Australia
24
Grosso V, Marcolungo L, Maestri S, Alfano M, Lavezzari D, Iadarola B, Salviati A, Mariotti B, Botta A, D’Apice MR, Novelli G, Delledonne M, Rossato M. Characterization of FMR1 Repeat Expansion and Intragenic Variants by Indirect Sequence Capture. Front Genet 2021; 12:743230. [PMID: 34646309 PMCID: PMC8504923 DOI: 10.3389/fgene.2021.743230] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/17/2021] [Accepted: 08/26/2021] [Indexed: 11/30/2022]
Abstract
Traditional methods for the analysis of repeat expansions, which underlie genetic disorders such as fragile X syndrome (FXS), lack single-nucleotide resolution in repeat analysis and the ability to characterize causative variants outside the repeat array. These drawbacks can be overcome by long-read and short-read sequencing, respectively. However, the routine application of next-generation sequencing in the clinic requires target enrichment, and none of the available methods allows parallel analysis of long DNA fragments using both sequencing technologies. In this study, we investigated the use of indirect sequence capture (Xdrop technology) coupled to Nanopore and Illumina sequencing to characterize FMR1, the gene responsible for FXS. We achieved efficient enrichment (> 200×) of large target DNA fragments (~60-80 kbp) encompassing the entire FMR1 gene. The analysis of Xdrop-enriched samples by Nanopore long-read sequencing allowed the complete characterization of repeat lengths in samples with normal, pre-mutation, and full mutation status (> 1 kbp), and correctly identified repeat interruptions relevant for disease prognosis and transmission. Single-nucleotide variants (SNVs) and small insertions/deletions (indels) could be detected in the same samples by Illumina short-read sequencing, completing the mutational testing through the identification of pathogenic variants within the FMR1 gene when no typical CGG repeat expansion is detected. The study successfully demonstrated the parallel analysis of repeat expansions and SNVs/indels in the FMR1 gene at single-nucleotide resolution by combining Xdrop enrichment with two next-generation sequencing approaches. With the appropriate optimization necessary for clinical settings, the system could both facilitate the study of genotype-phenotype correlation in FXS and enable a more efficient diagnosis and genetic counseling for patients and their relatives.
Affiliation(s)
- Valentina Grosso
  - Department of Biotechnology, University of Verona, Verona, Italy
- Luca Marcolungo
  - Department of Biotechnology, University of Verona, Verona, Italy
- Simone Maestri
  - Department of Biotechnology, University of Verona, Verona, Italy
- Denise Lavezzari
  - Department of Biotechnology, University of Verona, Verona, Italy
- Barbara Iadarola
  - Department of Biotechnology, University of Verona, Verona, Italy
- Alessandro Salviati
  - Department of Biotechnology, University of Verona, Verona, Italy
  - GENARTIS srl, Verona, Italy
- Barbara Mariotti
  - Department of Medicine, Section of General Pathology, University of Verona, Verona, Italy
- Annalisa Botta
  - Department of Biomedicine and Prevention, Medical Genetics Section, University of Rome "Tor Vergata", Rome, Italy
- Giuseppe Novelli
  - Department of Biomedicine and Prevention, Medical Genetics Section, University of Rome "Tor Vergata", Rome, Italy
  - IRCCS Neuromed Mediterranean Neurological Institute, Pozzilli, Italy
  - Department of Pharmacology, School of Medicine, University of Nevada, Reno, NV, United States
- Massimo Delledonne
  - Department of Biotechnology, University of Verona, Verona, Italy
  - GENARTIS srl, Verona, Italy
- Marzia Rossato
  - Department of Biotechnology, University of Verona, Verona, Italy
  - GENARTIS srl, Verona, Italy