1
Sund KL, Liu J, Lee J, Garbe J, Abdelhamed Z, Maag C, Hallinan B, Wu SW, Sperry E, Deshpande A, Stottmann R, Smolarek TA, Dyer LM, Hestand MS. Long-read sequencing and optical genome mapping identify causative gene disruptions in noncoding sequence in two patients with neurologic disease and known chromosome abnormalities. Am J Med Genet A 2024:e63818. [PMID: 39041659 DOI: 10.1002/ajmg.a.63818] [Received: 03/07/2024] [Revised: 06/12/2024] [Accepted: 07/07/2024] [Indexed: 07/24/2024]
Abstract
Despite advances in next generation sequencing (NGS), genetic diagnoses remain elusive for many patients with neurologic syndromes. Long-read sequencing (LRS) and optical genome mapping (OGM) technologies improve upon existing capabilities in the detection and interpretation of structural variation in repetitive DNA, on a single haplotype, while also providing enhanced breakpoint resolution. We performed LRS and OGM on two patients with known chromosomal rearrangements and inconclusive Sanger or NGS. The first patient, who had epilepsy and developmental delay, had a complex translocation between two chromosomes that included insertion and inversion events. The second patient, who had a movement disorder, had an inversion on a single chromosome disrupted by multiple smaller inversions and insertions. Sequence level resolution of the rearrangements identified pathogenic breaks in noncoding sequence in or near known disease-causing genes with relevant neurologic phenotypes (MBD5, NKX2-1). These specific variants have not been reported previously, but expected molecular consequences are consistent with previously reported cases. As the use of LRS and OGM technologies for clinical testing increases and data analyses become more standardized, these methods along with multiomic data to validate noncoding variation effects will improve diagnostic yield and increase the proportion of probands with detectable pathogenic variants for known genes implicated in neurogenetic disease.
Affiliation(s)
- Kristen L Sund
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Jie Liu
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
- Joyce Lee
  - Bionano Genomics, San Diego, California, USA
- John Garbe
  - University of Minnesota Genomics Center, University of Minnesota, Minneapolis, Minnesota, USA
- Zakia Abdelhamed
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Chelsey Maag
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Barbara Hallinan
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
  - Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Steven W Wu
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
  - Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Ethan Sperry
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
- Archana Deshpande
  - University of Minnesota Genomics Center, University of Minnesota, Minneapolis, Minnesota, USA
- Rolf Stottmann
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
- Teresa A Smolarek
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
- Lisa M Dyer
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
- Matthew S Hestand
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio, USA
2
Liu Z, Xie Z, Li M. Comprehensive and deep evaluation of structural variation detection pipelines with third-generation sequencing data. Genome Biol 2024; 25:188. [PMID: 39010145 PMCID: PMC11247875 DOI: 10.1186/s13059-024-03324-5] [Received: 06/27/2023] [Accepted: 06/26/2024] [Indexed: 07/17/2024] [Open Access]
Abstract
BACKGROUND Structural variation (SV) detection methods using third-generation sequencing data are widely employed, yet accurately detecting SVs remains challenging. Different methods often yield inconsistent results for certain SV types, complicating tool selection and revealing biases in detection. RESULTS This study comprehensively evaluates 53 SV detection pipelines using simulated and real data from PacBio (CLR: Continuous Long Read, CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms. We assess their performance in detecting various sizes and types of SVs, breakpoint biases, and genotyping accuracy with various sequencing depths. Notably, pipelines such as Minimap2-cuteSV2, NGMLR-SVIM, PBMM2-pbsv, Winnowmap-Sniffles2, and Winnowmap-SVision exhibit comparatively higher recall and precision. Our findings also show that combining multiple pipelines with the same aligner, like pbmm2 or winnowmap, can significantly enhance performance. The individual pipelines' detailed ranking and performance metrics can be viewed in a dynamic table: http://pmglab.top/SVPipelinesRanking . CONCLUSIONS This study comprehensively characterizes the strengths and weaknesses of numerous pipelines, providing valuable insights that can improve SV detection in third-generation sequencing data and inform SV annotation and function prediction.
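The recall and precision figures used to rank pipelines come from matching each pipeline's calls against a truth set. The sketch below illustrates one simple way to do that matching; it is not the evaluation code used in the study. The `SV` record, the 500 bp breakpoint tolerance, and the 0.7 size-ratio threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a real library type)."""
    chrom: str
    pos: int
    svtype: str   # e.g. "DEL", "INS", "INV"
    length: int


def matches(call, truth, max_dist=500, min_size_ratio=0.7):
    """A call matches a truth SV if chromosome and type agree, the
    breakpoints are within max_dist bp, and the sizes are similar.
    Both thresholds are illustrative assumptions."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    small, large = sorted((abs(call.length), abs(truth.length)))
    return large == 0 or small / large >= min_size_ratio


def precision_recall(calls, truths, **thresholds):
    """Greedy one-to-one matching; returns (precision, recall, F1)."""
    unmatched = list(truths)
    tp = 0
    for call in calls:
        for truth in unmatched:
            if matches(call, truth, **thresholds):
                unmatched.remove(truth)  # each truth SV is used once
                tp += 1
                break
    fp = len(calls) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data, invented for this example only.
truths = [SV("chr1", 1000, "DEL", 300),
          SV("chr1", 5000, "INS", 120),
          SV("chr2", 800, "INV", 2000)]
calls = [SV("chr1", 1040, "DEL", 280),
         SV("chr1", 9000, "INS", 100),
         SV("chr2", 790, "INV", 2100)]

print(precision_recall(calls, truths))
```

Loosening or tightening `max_dist` changes which calls count as true positives, which is one reason breakpoint bias matters when comparing pipelines.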
Affiliation(s)
- Zhi Liu
  - Program in Bioinformatics, Zhongshan School of Medicine, The Fifth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
  - Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education, Guangzhou, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Miaoxin Li
  - Program in Bioinformatics, Zhongshan School of Medicine, The Fifth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
  - Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education, Guangzhou, China
  - Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
  - Department of Psychiatry, The University of Hong Kong, Hong Kong, SAR, China
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-Sen University, Zhuhai, China
3
Zheng J, Li T, Ye H, Jiang Z, Jiang W, Yang H, Wu Z, Xie Z. Comprehensive identification of pathogenic variants in retinoblastoma by long- and short-read sequencing. Cancer Lett 2024; 598:217121. [PMID: 39009069 DOI: 10.1016/j.canlet.2024.217121] [Received: 10/18/2023] [Revised: 06/16/2024] [Accepted: 07/11/2024] [Indexed: 07/17/2024]
Abstract
Retinoblastoma (RB) is the most common intraocular malignancy in childhood. The causal variants in RB have mostly been characterized by short-read sequencing (SRS) analysis, which has technical limitations in identifying structural variants (SVs) and phasing information. Long-read sequencing (LRS) technology has advantages over SRS in detecting SVs, phased genetic variants, and methylation. In this study, we comprehensively characterized the genetic landscape of RB using combinatorial LRS and SRS of 16 RB tumors and 16 matched blood samples. We detected a total of 232 somatic SVs, with an average of 14.5 SVs per sample across the whole genome in our cohort. We identified 20 distinct pathogenic variants disrupting the RB1 gene, including three novel small variants and five somatic SVs. We found that more somatic SVs were detected by LRS than by SRS (140 vs. 122) in RB samples with WGS data, particularly insertions (18 vs. 1). Furthermore, our analysis shows that, with the exception of one sample that lacked methylation data, all samples presented biallelic inactivation of RB1 in various forms, including two cases with a biallelic hypermethylated promoter and four cases with compound heterozygous mutations that were missed in SRS analysis. By inferring the relative timing of somatic events, we reveal a genetic progression in which RB1 disruption occurs early and is followed by copy number changes, including amplifications of Chr2p and deletions of Chr16q, during RB tumorigenesis. Altogether, we characterize the comprehensive genetic landscape of RB, providing novel insights into the genetic alterations and mechanisms contributing to RB initiation and development. Our work also establishes a framework to analyze the genomic landscape of cancers based on LRS data.
Affiliation(s)
- Jingjing Zheng
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Tong Li
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Huijing Ye
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zehang Jiang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Wenbing Jiang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Huasheng Yang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhikun Wu
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
4
Shelton WJ, Zandpazandi S, Nix JS, Gokden M, Bauer M, Ryan KR, Wardell CP, Vaske OM, Rodriguez A. Long-read sequencing for brain tumors. Front Oncol 2024; 14:1395985. [PMID: 38915364 PMCID: PMC11194609 DOI: 10.3389/fonc.2024.1395985] [Received: 03/05/2024] [Accepted: 05/27/2024] [Indexed: 06/26/2024] [Open Access]
Abstract
Brain tumors and genomics have a long-standing history, given that glioblastoma was the first cancer studied by The Cancer Genome Atlas. The numerous and continuous advances in sequencing technologies over the decades have aided the molecular characterization of brain tumors for diagnosis, prognosis, and treatment. Since the implementation of molecular biomarkers in the 2016 WHO classification of CNS tumors, the genomics of brain tumors has been integrated into diagnostic criteria. Long-read sequencing, also known as third-generation sequencing, is an emerging technique that allows for the sequencing of longer DNA segments, leading to improved detection of structural variants and epigenetic modifications. These capabilities are opening a way for better characterization of brain tumors. Here, we present a comprehensive summary of the state of the art of third-generation sequencing as applied to brain tumor diagnosis, prognosis, and treatment. We discuss the advantages and potential new implementations of long-read sequencing in clinical paradigms for neuro-oncology patients.
Affiliation(s)
- William J. Shelton
  - Department of Neurosurgery, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Sara Zandpazandi
  - Department of Neurosurgery, Medical University of South Carolina, Charleston, SC, United States
- J Stephen Nix
  - Department of Pathology, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Murat Gokden
  - Department of Pathology, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Michael Bauer
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Katie Rose Ryan
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Christopher P. Wardell
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Olena Morozova Vaske
  - Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, CA, United States
- Analiz Rodriguez
  - Department of Neurosurgery, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
5
Corradi Z, Dhaenens CM, Grunewald O, Kocabaş IS, Meunier I, Banfi S, Karali M, Cremers FPM, Hitti-Malin RJ. Novel and Recurrent Copy Number Variants in ABCA4-Associated Retinopathy. Int J Mol Sci 2024; 25:5940. [PMID: 38892127 PMCID: PMC11173210 DOI: 10.3390/ijms25115940] [Received: 04/22/2024] [Revised: 05/22/2024] [Accepted: 05/24/2024] [Indexed: 06/21/2024] [Open Access]
Abstract
ABCA4 is the most frequently mutated gene leading to inherited retinal disease (IRD) with over 2200 pathogenic variants reported to date. Of these, ~1% are copy number variants (CNVs) involving the deletion or duplication of genomic regions, typically >50 nucleotides in length. An in-depth assessment of the current literature based on the public database LOVD, regarding the presence of known CNVs and structural variants in ABCA4, and additional sequencing analysis of ABCA4 using single-molecule Molecular Inversion Probes (smMIPs) for 148 probands highlighted recurrent and novel CNVs associated with ABCA4-associated retinopathies. An analysis of the coverage depth in the sequencing data led to the identification of eleven deletions (six novel and five recurrent), three duplications (one novel and two recurrent) and one complex CNV. Of particular interest was the identification of a complex defect, i.e., a 15.3 kb duplicated segment encompassing exon 31 through intron 41 that was inserted at the junction of a downstream 2.7 kb deletion encompassing intron 44 through intron 47. In addition, we identified a 7.0 kb tandem duplication of intron 1 in three cases. The identification of CNVs in ABCA4 can provide patients and their families with a genetic diagnosis whilst expanding our understanding of the complexity of diseases caused by ABCA4 variants.
Affiliation(s)
- Zelia Corradi
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Claire-Marie Dhaenens
  - Université de Lille, Inserm, CHU Lille, U1172-LilNCog-Lille Neuroscience & Cognition, F-59000 Lille, France
- Olivier Grunewald
  - Université de Lille, Inserm, CHU Lille, U1172-LilNCog-Lille Neuroscience & Cognition, F-59000 Lille, France
- Ipek Selen Kocabaş
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Isabelle Meunier
  - Institute des Neurosciences de Montpellier, INSERM, Université de Montpellier, F-34295 Montpellier, France
- Sandro Banfi
  - Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, 81031 Naples, Italy
  - Telethon Institute of Genetics and Medicine (TIGEM), 80078 Pozzuoli, Italy
- Marianthi Karali
  - Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, 81031 Naples, Italy
  - Eye Clinic, Multidisciplinary Department of Medical, Surgical and Dental Sciences, University of Campania “Luigi Vanvitelli”, 81031 Naples, Italy
- Frans P. M. Cremers
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Rebekkah J. Hitti-Malin
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
6
Liu-Wei W, van der Toorn W, Bohn P, Hölzer M, Smyth RP, von Kleist M. Sequencing accuracy and systematic errors of nanopore direct RNA sequencing. BMC Genomics 2024; 25:528. [PMID: 38807060 PMCID: PMC11134706 DOI: 10.1186/s12864-024-10440-w] [Received: 03/14/2024] [Accepted: 05/21/2024] [Indexed: 05/30/2024] [Open Access]
Abstract
BACKGROUND Direct RNA sequencing (dRNA-seq) on the Oxford Nanopore Technologies (ONT) platforms can produce reads covering up to full-length gene transcripts, while containing decipherable information about RNA base modifications and poly-A tail lengths. Although many published studies have been expanding the potential of dRNA-seq, its sequencing accuracy and error patterns remain understudied. RESULTS We present the first comprehensive evaluation of sequencing accuracy and characterisation of systematic errors in dRNA-seq data from diverse organisms and synthetic in vitro transcribed RNAs. We found that for sequencing kits SQK-RNA001 and SQK-RNA002, the median read accuracy ranged from 87% to 92% across species, and deletions significantly outnumbered mismatches and insertions. Due to their high abundance in the transcriptome, heteropolymers and short homopolymers were the major contributors to the overall sequencing errors. We also observed systematic biases across all species at the levels of single nucleotides and motifs. In general, cytosine/uracil-rich regions were more likely to be erroneous than guanines and adenines. By examining raw signal data, we identified the underlying signal-level features potentially associated with the error patterns and their dependency on sequence contexts. While read quality scores can be used to approximate error rates at base and read levels, failure to detect DNA adapters may be a source of errors and data loss. By comparing distinct basecallers, we reason that some sequencing errors are attributable to signal insufficiency rather than algorithmic (basecalling) artefacts. Lastly, we generated dRNA-seq data using the latest SQK-RNA004 sequencing kit released at the end of 2023 and found that although the overall read accuracy increased, the systematic errors remain largely identical compared to the previous kits. 
CONCLUSIONS As the first systematic investigation of dRNA-seq errors, this study offers a comprehensive overview of reproducible error patterns across diverse datasets, identifies potential signal-level insufficiency, and lays the foundation for error correction methods.
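Read accuracy and the balance of deletions, mismatches, and insertions can be tallied from an alignment's CIGAR string. The sketch below uses one common definition, accuracy = matches / (matches + mismatches + insertions + deletions); whether the study used exactly this definition is an assumption. It requires extended CIGAR strings (with `=` and `X` operations), and the example string is invented.

```python
import re

# SAM CIGAR operations: a length followed by one of MIDNSHP=X.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def error_profile(cigar):
    """Summarise per-read accuracy and error fractions from an
    extended CIGAR string.

    Soft/hard clips (S/H), skipped regions (N), and padding (P) are
    ignored. Plain 'M' is rejected because it does not distinguish
    matches from mismatches.
    """
    counts = {"=": 0, "X": 0, "I": 0, "D": 0}
    for length, op in _CIGAR_OP.findall(cigar):
        if op == "M":
            raise ValueError("extended CIGAR (=/X) required")
        if op in counts:
            counts[op] += int(length)
    aligned = sum(counts.values())
    if aligned == 0:
        raise ValueError("no aligned bases in CIGAR")
    return {
        "accuracy": counts["="] / aligned,
        "mismatch": counts["X"] / aligned,
        "insertion": counts["I"] / aligned,
        "deletion": counts["D"] / aligned,
    }


# Invented example: 5 soft-clipped bases, 90 matches, 3 mismatches,
# 2 inserted bases, 5 deleted bases.
profile = error_profile("5S90=3X2I5D")
print(profile)
```

In this toy read, deletions are the largest error class, which mirrors the pattern the abstract describes for the SQK-RNA001 and SQK-RNA002 kits.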
Affiliation(s)
- Wang Liu-Wei
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - International Max-Planck Research School 'Biology and Computation', Max-Planck Institute for Molecular Genetics, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Wiep van der Toorn
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Patrick Bohn
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Martin Hölzer
  - Genome Competence Center (MF1), Robert Koch Institute, Berlin, Germany
- Redmond P Smyth
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
  - Faculty of Medicine, University of Würzburg, Würzburg, Germany
- Max von Kleist
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
7
Ji CM, Feng XY, Huang YW, Chen RA. The Applications of Nanopore Sequencing Technology in Animal and Human Virus Research. Viruses 2024; 16:798. [PMID: 38793679 PMCID: PMC11125791 DOI: 10.3390/v16050798] [Received: 03/20/2024] [Revised: 05/07/2024] [Accepted: 05/13/2024] [Indexed: 05/26/2024] [Open Access]
Abstract
In recent years, an increasing number of viruses have triggered outbreaks that pose a severe threat to both human and animal life and have caused substantial economic losses. It is crucial to understand the genomic structure and epidemiology of these viruses to guide effective clinical prevention and treatment strategies. Nanopore sequencing, a third-generation sequencing technology, has been widely used in genomic research since 2014. This technology offers several advantages over traditional methods and next-generation sequencing (NGS), such as the ability to generate ultra-long reads, high efficiency, real-time monitoring and analysis, portability, and the ability to directly sequence RNA or DNA molecules. As a result, it exhibits excellent applicability and flexibility in virus research, including viral detection and surveillance, genome assembly, the discovery of new variants and novel viruses, and the identification of chemical modifications. In this paper, we provide a comprehensive review of the development, principles, advantages, and applications of nanopore sequencing technology in animal and human virus research, aiming to offer fresh perspectives for future studies in this field.
Affiliation(s)
- Chun-Miao Ji
  - Zhaoqing Branch Center of Guangdong Laboratory for Lingnan Modern Agricultural Science and Technology, Zhaoqing 526238, China
- Xiao-Yin Feng
  - Zhaoqing Branch Center of Guangdong Laboratory for Lingnan Modern Agricultural Science and Technology, Zhaoqing 526238, China
- Yao-Wei Huang
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
  - Department of Veterinary Medicine, Zhejiang University, Hangzhou 310058, China
- Rui-Ai Chen
  - Zhaoqing Branch Center of Guangdong Laboratory for Lingnan Modern Agricultural Science and Technology, Zhaoqing 526238, China
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
8
Scarano C, Veneruso I, De Simone RR, Di Bonito G, Secondino A, D’Argenio V. The Third-Generation Sequencing Challenge: Novel Insights for the Omic Sciences. Biomolecules 2024; 14:568. [PMID: 38785975 PMCID: PMC11117673 DOI: 10.3390/biom14050568] [Received: 04/08/2024] [Revised: 05/05/2024] [Accepted: 05/08/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
The understanding of the human genome has been greatly improved by the advent of next-generation sequencing (NGS) technologies. Despite the undeniable advantages responsible for their widespread diffusion, these methods have some constraints, mainly related to short read length and the need for PCR amplification. As a consequence, long-read sequencers, referred to as third-generation sequencing (TGS), have been developed, promising to overcome the limitations of NGS. Starting from the first prototype, TGS chemistries have progressively improved in both read length and base-calling accuracy while reducing the cost per base. Based on these premises, TGS is showing its potential in many fields, including the analysis of difficult-to-sequence genomic regions, structural variation detection, RNA expression profiling, DNA methylation studies, and metagenomic analyses. Protocol standardization and the development of easy-to-use pipelines for data analysis will enhance TGS use, also opening the way for its routine application in diagnostic contexts.
Affiliation(s)
- Carmela Scarano
  - Department of Molecular Medicine and Medical Biotechnologies, Federico II University, Via Sergio Pansini 5, 80131 Napoli, Italy
  - CEINGE-Biotecnologie Avanzate Franco Salvatore, Via G. Salvatore 486, 80145 Napoli, Italy
- Iolanda Veneruso
  - Department of Molecular Medicine and Medical Biotechnologies, Federico II University, Via Sergio Pansini 5, 80131 Napoli, Italy
  - CEINGE-Biotecnologie Avanzate Franco Salvatore, Via G. Salvatore 486, 80145 Napoli, Italy
- Rosa Redenta De Simone
  - Department of Molecular Medicine and Medical Biotechnologies, Federico II University, Via Sergio Pansini 5, 80131 Napoli, Italy
  - CEINGE-Biotecnologie Avanzate Franco Salvatore, Via G. Salvatore 486, 80145 Napoli, Italy
- Gennaro Di Bonito
  - Department of Molecular Medicine and Medical Biotechnologies, Federico II University, Via Sergio Pansini 5, 80131 Napoli, Italy
  - CEINGE-Biotecnologie Avanzate Franco Salvatore, Via G. Salvatore 486, 80145 Napoli, Italy
- Angela Secondino
  - Department of Molecular Medicine and Medical Biotechnologies, Federico II University, Via Sergio Pansini 5, 80131 Napoli, Italy
  - CEINGE-Biotecnologie Avanzate Franco Salvatore, Via G. Salvatore 486, 80145 Napoli, Italy
- Valeria D’Argenio
  - CEINGE-Biotecnologie Avanzate Franco Salvatore, Via G. Salvatore 486, 80145 Napoli, Italy
  - Department of Human Sciences and Quality of Life Promotion, San Raffaele Open University, Via di Val Cannuta 247, 00166 Roma, Italy
9
Chen Q, Wu B, Li C, Ding L, Huang S, Wang J, Zhao J. Deciphering male influence in gynogenetic Pengze crucian carp (Carassius auratus var. pengsenensis): insights from Nanopore sequencing of structural variations. Front Genet 2024; 15:1392110. [PMID: 38784042 PMCID: PMC11111978 DOI: 10.3389/fgene.2024.1392110] [Received: 02/27/2024] [Accepted: 04/11/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
In this study, we investigate gynogenetic reproduction in Pengze Crucian Carp (Carassius auratus var. pengsenensis) using third-generation Nanopore sequencing to uncover structural variations (SVs) in offspring. Our objective was to understand the role of male genetic material in gynogenesis by examining the genomes of both parents and their offspring. We discovered a notable number of male-specific structural variations (MSSVs): 1,195 to 1,709 MSSVs in homologous offspring, accounting for approximately 0.52%-0.60% of their detected SVs, and 236 to 350 MSSVs in heterologous offspring, making up about 0.10%-0.13%. These results highlight the significant influence of male genetic material on the genetic composition of offspring, particularly in homologous pairs, challenging the traditional view of asexual reproduction. The gene annotation of MSSVs revealed their presence in critical gene regions, indicating potential functional impacts. Specifically, we found 5 MSSVs in the exonic regions of protein-coding genes in homologous offspring, suggesting possible direct effects on protein structure and function. Validation of an MSSV in the exonic region of the polyunsaturated fatty acid 5-lipoxygenase gene confirmed male genetic material transmission in some offspring. This study underscores the importance of further research on the genetic diversity and gynogenesis mechanisms, providing valuable insights for reproductive biology, aquaculture, and fostering innovation in biological research and aquaculture practices.
Affiliation(s)
- Qianhui Chen
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Biyu Wu
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Chao Li
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Liyun Ding
  - Jiangxi Fisheries Research Institute, Nanchang, China
- Shiting Huang
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Junjie Wang
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Jun Zhao
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
10
Oberle A, Hanzer F, Kokocinski F, Ennemoser A, Carli L, Vaccari E, Hengstschläger M, Feichtinger M. Evaluation of Nanopore Sequencing on Polar Bodies for Routine Pre-Implantation Genetic Testing for Aneuploidy. Clin Chem 2024; 70:747-758. [PMID: 38451051 DOI: 10.1093/clinchem/hvae024] [Received: 08/28/2023] [Accepted: 01/16/2024] [Indexed: 03/08/2024]
Abstract
BACKGROUND Preimplantation genetic testing for aneuploidy (PGT-A) using polar body (PB) biopsy offers a clinical benefit by reducing the number of embryo transfers and miscarriage rates but is currently not cost-efficient. Nanopore sequencing technology opens possibilities by providing cost-efficient and fast sequencing results with uncomplicated sample preparation workflows. METHODS In this comparative experimental study, 102 pooled PB samples (99 passing QC) from 20 patients were analyzed for aneuploidy using nanopore sequencing technology and compared with array comparative genomic hybridization (aCGH) results generated as part of the clinical routine. Samples were sequenced on a Nanopore MinION machine. Whole-chromosome copy numbers were called by custom bioinformatic analysis software. Automatically called results were compared to aCGH results. RESULTS Overall, 96/99 samples were consistently detected as euploid or aneuploid in both methods (concordance = 97.0%, sensitivity = 0.957, specificity = 1.0, positive predictive value = 1.0, negative predictive value = 0.906). On the chromosomal level, concordance reached 98.7%. Chromosomal aneuploidies analyzed in this trial covered all 23 chromosomes, with 98 trisomies and 97 monosomies in 70 aCGH samples. The whole nanopore workflow is feasible in under 5 h (for one sample) with a maximum time of 16 h (for 12 samples), enabling fresh PB-euploid embryo transfer. A material cost of US$ 165 (EUR 150)/sample possibly enables cost-efficient aneuploidy screening. CONCLUSIONS This is the first study systematically comparing nanopore sequencing with standard methods for the detection of PB aneuploidy. High concordance rates confirmed the feasibility of nanopore technology for this application. Additionally, the fast and cost-efficient workflow reveals the clinical utility of this technology, making it clinically attractive for PB PGT-A.
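The sample-level metrics in the RESULTS follow from a 2×2 comparison of nanopore calls against aCGH as the reference. The abstract does not list the four cell counts; the values used below (67 true positives, 0 false positives, 29 true negatives, 3 false negatives) are an inference from the reported figures (99 samples, 96 concordant, 70 aneuploid by aCGH, specificity and PPV of 1.0), not numbers taken from the paper. The function name is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic metrics against a reference method.

    Assumes every denominator is non-zero, i.e. the reference set
    contains both positive and negative samples and the test calls
    at least one of each.
    """
    total = tp + fp + tn + fn
    return {
        "concordance": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Cell counts inferred from the abstract's summary statistics
# (an assumption, not stated in the source).
metrics = diagnostic_metrics(tp=67, fp=0, tn=29, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, all three discordant samples are aneuploid by aCGH but called euploid by nanopore sequencing, which is why sensitivity and NPV fall below 1 while specificity and PPV do not.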
Affiliation(s)
- Anna Oberle
- Wunschbaby Institut Feichtinger, Lainzer Straße 6, 1130 Vienna, Austria
- Franziska Hanzer
- Wunschbaby Institut Feichtinger, Lainzer Straße 6, 1130 Vienna, Austria
- Felix Kokocinski
- Gene-Test Bioinformatics Solutions GmbH, Jakob-Müller-Str. 16, 68623 Lampertheim, Germany
- Anna Ennemoser
- Wunschbaby Institut Feichtinger, Lainzer Straße 6, 1130 Vienna, Austria
- Luca Carli
- Wunschbaby Institut Feichtinger, Lainzer Straße 6, 1130 Vienna, Austria
- Enrico Vaccari
- Wunschbaby Institut Feichtinger, Lainzer Straße 6, 1130 Vienna, Austria
|
11
|
Liu YH, Luo C, Golding SG, Ioffe JB, Zhou XM. Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data. Nat Commun 2024; 15:2447. [PMID: 38503752 PMCID: PMC10951360 DOI: 10.1038/s41467-024-46614-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Accepted: 03/04/2024] [Indexed: 03/21/2024] Open
Abstract
Long-read sequencing offers long contiguous DNA fragments, facilitating diploid genome assembly and structural variant (SV) detection. Efficient and robust algorithms for SV identification are crucial with increasing data availability. Alignment-based methods, favored for their computational efficiency and lower coverage requirements, are prominent. Alternative approaches, relying solely on available reads for de novo genome assembly and employing assembly-based tools for SV detection via comparison to a reference genome, demand significantly more computational resources. However, the lack of comprehensive benchmarking constrains our comprehension and hampers further algorithm development. Here we systematically compare 14 read alignment-based SV calling methods (including 4 deep learning-based methods and 1 hybrid method), and 4 assembly-based SV calling methods, alongside 4 upstream aligners and 7 assemblers. Assembly-based tools excel in detecting large SVs, especially insertions, and exhibit robustness to evaluation parameter changes and coverage fluctuations. Conversely, alignment-based tools demonstrate superior genotyping accuracy at low sequencing coverage (5-10×) and excel in detecting complex SVs, like translocations, inversions, and duplications. Our evaluation provides performance insights, highlighting the absence of a universally superior tool. We furnish guidelines across 31 criteria combinations, aiding users in selecting the most suitable tools for diverse scenarios and offering directions for further method development.
Affiliation(s)
- Yichen Henry Liu
- Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA
- Can Luo
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA
- Staunton G Golding
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA
- Jacob B Ioffe
- Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA
- Xin Maizie Zhou
- Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA.
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA.
- Data Science Institute, Vanderbilt University, 37235, Nashville, TN, USA.
|
12
|
Helal AA, Saad BT, Saad MT, Mosaad GS, Aboshanab KM. Benchmarking long-read aligners and SV callers for structural variation detection in Oxford nanopore sequencing data. Sci Rep 2024; 14:6160. [PMID: 38486064 PMCID: PMC10940726 DOI: 10.1038/s41598-024-56604-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 03/08/2024] [Indexed: 03/18/2024] Open
Abstract
Structural variants (SVs) are one of the significant types of DNA mutations and are typically defined as larger-than-50-bp genomic alterations that include insertions, deletions, duplications, inversions, and translocations. These modifications can profoundly impact the phenotypic characteristics and contribute to disorders like cancer, response to treatment, and infections. Four long-read aligners and five SV callers have been evaluated using three Oxford Nanopore NGS human genome datasets in terms of precision, recall, and F1-score statistical metrics, depth of coverage, and speed of analysis. The best SV caller regarding recall, precision, and F1-score when matched with different aligners at different coverage levels tends to vary depending on the dataset and the specific SV types being analyzed. However, based on our findings, Sniffles and CuteSV tend to perform well across different aligners and coverage levels, followed by SVIM, PBSV, and SVDSS in the last place. The CuteSV caller has the highest average F1-score (82.51%) and recall (78.50%), and Sniffles has the highest average precision value (94.33%). Minimap2 as an aligner and Sniffles as an SV caller act as a strong base for the pipeline of SV calling because of their high speed and reasonable accomplishment. PBSV has a lower average F1-score, precision, and recall and may generate more false positives and overlook some actual SVs. Our results are valuable in the comprehensive evaluation of popular SV callers and aligners as they provide insight into the performance of several long-read aligners and SV callers and serve as a reference for researchers in selecting the most suitable tools for SV detection.
Affiliation(s)
- Asmaa A Helal
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt
- Bishoy T Saad
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt.
- Mina T Saad
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt
- Gamal S Mosaad
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt
- Khaled M Aboshanab
- Department of Microbiology and Immunology, Faculty of Pharmacy, Ain Shams University, Organization of African Unity St., Abassi, Cairo, 11566, Egypt.
|
13
|
Connor R, Shakya M, Yarmosh DA, Maier W, Martin R, Bradford R, Brister JR, Chain PSG, Copeland CA, di Iulio J, Hu B, Ebert P, Gunti J, Jin Y, Katz KS, Kochergin A, LaRosa T, Li J, Li PE, Lo CC, Rashid S, Maiorova ES, Xiao C, Zalunin V, Purcell L, Pruitt KD. Recommendations for Uniform Variant Calling of SARS-CoV-2 Genome Sequence across Bioinformatic Workflows. Viruses 2024; 16:430. [PMID: 38543795 PMCID: PMC10975397 DOI: 10.3390/v16030430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 02/12/2024] [Accepted: 02/16/2024] [Indexed: 04/01/2024] Open
Abstract
Genomic sequencing of clinical samples to identify emerging variants of SARS-CoV-2 has been a key public health tool for curbing the spread of the virus. As a result, an unprecedented number of SARS-CoV-2 genomes were sequenced during the COVID-19 pandemic, which allowed for rapid identification of genetic variants, enabling the timely design and testing of therapies and deployment of new vaccine formulations to combat the new variants. However, despite the technological advances of deep sequencing, the analysis of the raw sequence data generated globally is neither standardized nor consistent, leading to vastly disparate sequences that may impact identification of variants. Here, we show that for both Illumina and Oxford Nanopore sequencing platforms, downstream bioinformatic protocols used by industry, government, and academic groups resulted in different virus sequences from the same sample. These bioinformatic workflows produced consensus genomes with differences in single nucleotide polymorphisms and in the inclusion or exclusion of insertions and/or deletions, despite using the same raw sequence as input datasets. Here, we compared and characterized such discrepancies and propose a specific suite of parameters and protocols that should be adopted across the field. Consistent results from bioinformatic workflows are fundamental to SARS-CoV-2 and future pathogen surveillance efforts, including pandemic preparation, to allow for a data-driven and timely public health response.
Affiliation(s)
- Ryan Connor
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Migun Shakya
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (M.S.); (P.S.G.C.); (B.H.); (P.-E.L.); (C.-C.L.)
- David A. Yarmosh
- American Type Culture Collection, Manassas, VA 20110, USA; (D.A.Y.); (R.B.); (S.R.)
- BEI Resources, Manassas, VA 20110, USA
- Wolfgang Maier
- Galaxy Europe Team, University of Freiburg, 79085 Freiburg, Germany;
- Ross Martin
- Clinical Virology Department, Gilead Sciences, Foster City, CA 94404, USA; (R.M.); (J.L.); (E.S.M.)
- Rebecca Bradford
- American Type Culture Collection, Manassas, VA 20110, USA; (D.A.Y.); (R.B.); (S.R.)
- BEI Resources, Manassas, VA 20110, USA
- J. Rodney Brister
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Patrick S. G. Chain
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (M.S.); (P.S.G.C.); (B.H.); (P.-E.L.); (C.-C.L.)
- Julia di Iulio
- Vir Biotechnology Inc., San Francisco, CA 94158, USA; (J.d.I.); (L.P.)
- Bin Hu
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (M.S.); (P.S.G.C.); (B.H.); (P.-E.L.); (C.-C.L.)
- Philip Ebert
- Eli Lilly and Company, Indianapolis, IN 46225, USA;
- Jonathan Gunti
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Yumi Jin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Kenneth S. Katz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Andrey Kochergin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Tré LaRosa
- Deloitte Consulting LLP, Rosslyn, VA 22209, USA; (C.A.C.); (T.L.)
- Jiani Li
- Clinical Virology Department, Gilead Sciences, Foster City, CA 94404, USA; (R.M.); (J.L.); (E.S.M.)
- Po-E Li
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (M.S.); (P.S.G.C.); (B.H.); (P.-E.L.); (C.-C.L.)
- Chien-Chi Lo
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (M.S.); (P.S.G.C.); (B.H.); (P.-E.L.); (C.-C.L.)
- Sujatha Rashid
- American Type Culture Collection, Manassas, VA 20110, USA; (D.A.Y.); (R.B.); (S.R.)
- Evguenia S. Maiorova
- Clinical Virology Department, Gilead Sciences, Foster City, CA 94404, USA; (R.M.); (J.L.); (E.S.M.)
- Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Vadim Zalunin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
- Lisa Purcell
- Vir Biotechnology Inc., San Francisco, CA 94158, USA; (J.d.I.); (L.P.)
- Kim D. Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; (R.C.); (J.R.B.); (J.G.); (Y.J.); (K.S.K.); (A.K.); (C.X.); (V.Z.)
|
14
|
Liu X, Zheng J, Ding J, Wu J, Zuo F, Zhang G. When Livestock Genomes Meet Third-Generation Sequencing Technology: From Opportunities to Applications. Genes (Basel) 2024; 15:245. [PMID: 38397234 PMCID: PMC10888458 DOI: 10.3390/genes15020245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/30/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024] Open
Abstract
Third-generation sequencing technology has found widespread application in the genomic, transcriptomic, and epigenetic research of both human and livestock genetics. This technology offers significant advantages in the sequencing of complex genomic regions, the identification of intricate structural variations, and the production of high-quality genomes. Its attributes, including long sequencing reads, obviation of PCR amplification, and direct determination of DNA/RNA, contribute to its efficacy. This review presents a comprehensive overview of third-generation sequencing technologies, exemplified by single-molecule real-time sequencing (SMRT) and Oxford Nanopore Technology (ONT). Emphasizing the research advancements in livestock genomics, the review delves into genome assembly, structural variation detection, transcriptome sequencing, and epigenetic investigations enabled by third-generation sequencing. A comprehensive analysis is conducted on the application and potential challenges of third-generation sequencing technology for genome detection in livestock. Beyond providing valuable insights into genome structure analysis and the identification of rare genes in livestock, the review ventures into an exploration of the genetic mechanisms underpinning exemplary traits. This review not only contributes to our understanding of the genomic landscape in livestock but also provides fresh perspectives for the advancement of research in this domain.
Affiliation(s)
- Xinyue Liu
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Junyuan Zheng
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Jialan Ding
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Jiaxin Wu
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Fuyuan Zuo
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Beef Cattle Engineering and Technology Research Center of Chongqing, Southwest University, Rongchang, Chongqing 402460, China
- Gongwei Zhang
- College of Animal Science and Technology, Southwest University, Rongchang, Chongqing 402460, China; (X.L.); (J.Z.); (J.D.); (J.W.); (F.Z.)
- Beef Cattle Engineering and Technology Research Center of Chongqing, Southwest University, Rongchang, Chongqing 402460, China
|
15
|
Glass DS, Bren A, Vaisbourd E, Mayo A, Alon U. A synthetic differentiation circuit in Escherichia coli for suppressing mutant takeover. Cell 2024; 187:931-944.e12. [PMID: 38320549 PMCID: PMC10882425 DOI: 10.1016/j.cell.2024.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 11/27/2023] [Accepted: 01/16/2024] [Indexed: 02/08/2024]
Abstract
Differentiation is crucial for multicellularity. However, it is inherently susceptible to mutant cells that fail to differentiate. These mutants outcompete normal cells by excessive self-renewal. It remains unclear what mechanisms can resist such mutant expansion. Here, we demonstrate a solution by engineering a synthetic differentiation circuit in Escherichia coli that selects against these mutants via a biphasic fitness strategy. The circuit provides tunable production of synthetic analogs of stem, progenitor, and differentiated cells. It resists mutations by coupling differentiation to the production of an essential enzyme, thereby disadvantaging non-differentiating mutants. The circuit selected for and maintained a positive differentiation rate in long-term evolution. Surprisingly, this rate remained constant across vast changes in growth conditions. We found that transit-amplifying cells (fast-growing progenitors) underlie this environmental robustness. Our results provide insight into the stability of differentiation and demonstrate a powerful method for engineering evolutionarily stable multicellular consortia.
Affiliation(s)
- David S Glass
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel.
- Anat Bren
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Elizabeth Vaisbourd
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Avi Mayo
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Uri Alon
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel.
|
16
|
Charron P, Kang M. VariantDetective: an accurate all-in-one pipeline for detecting consensus bacterial SNPs and SVs. Bioinformatics 2024; 40:btae066. [PMID: 38366603 PMCID: PMC10898327 DOI: 10.1093/bioinformatics/btae066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 01/16/2024] [Accepted: 02/14/2024] [Indexed: 02/18/2024] Open
Abstract
MOTIVATION: Genomic variations comprise a spectrum of alterations, ranging from single nucleotide polymorphisms (SNPs) to large-scale structural variants (SVs), which play crucial roles in bacterial evolution and species diversification. Accurately identifying SNPs and SVs is beneficial for subsequent evolutionary and epidemiological studies. This study presents VariantDetective (VD), a novel, user-friendly, and all-in-one pipeline combining SNP and SV calling to generate consensus genomic variants using multiple tools.
RESULTS: The VD pipeline accepts various file types as input to initiate SNP and/or SV calling, and benchmarking results demonstrate VD's robustness and high accuracy across multiple tested datasets when compared to existing variant calling approaches.
AVAILABILITY AND IMPLEMENTATION: The source code, test data, and relevant information for VD are freely accessible at https://github.com/OLF-Bioinformatics/VariantDetective under the MIT License.
Affiliation(s)
- Philippe Charron
- Ottawa Laboratory-Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, Ontario K2J 4S1, Canada
- Mingsong Kang
- Ottawa Laboratory-Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, Ontario K2J 4S1, Canada
|
17
|
Zhang Z, Jiang T, Li G, Cao S, Liu Y, Liu B, Wang Y. Kled: an ultra-fast and sensitive structural variant detection tool for long-read sequencing data. Brief Bioinform 2024; 25:bbae049. [PMID: 38385878 PMCID: PMC10883419 DOI: 10.1093/bib/bbae049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 01/12/2024] [Accepted: 01/26/2024] [Indexed: 02/23/2024] Open
Abstract
Structural Variants (SVs) are a crucial type of genetic variant that can significantly impact phenotypes. Therefore, the identification of SVs is an essential part of modern genomic analysis. In this article, we present kled, an ultra-fast and sensitive SV caller for long-read sequencing data given the specially designed approach with a novel signature-merging algorithm, custom refinement strategies and a high-performance program structure. The evaluation results demonstrate that kled can achieve optimal SV calling compared to several state-of-the-art methods on simulated and real long-read data for different platforms and sequencing depths. Furthermore, kled excels at rapid SV calling and can efficiently utilize multiple Central Processing Unit (CPU) cores while maintaining low memory usage. The source code for kled can be obtained from https://github.com/CoREse/kled.
Affiliation(s)
- Zhendong Zhang
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Tao Jiang
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Gaoyang Li
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Shuqi Cao
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yadong Liu
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Bo Liu
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yadong Wang
- Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, 450000, China
- Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
|
18
|
Zheng Y, Shang X. SVvalidation: A long-read-based validation method for genomic structural variation. PLoS One 2024; 19:e0291741. [PMID: 38181020 PMCID: PMC10769053 DOI: 10.1371/journal.pone.0291741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 09/05/2023] [Indexed: 01/07/2024] Open
Abstract
Although various methods have been developed to detect structural variations (SVs) in genomic sequences, few are used to validate these results. Several commonly used SV callers produce many false positive SVs, and existing validation methods are not accurate enough. Therefore, a highly efficient and accurate validation method is essential. In response, we propose SVvalidation, a new method that uses long-read sequencing data for validating SVs with higher accuracy and efficiency. Compared to existing methods, SVvalidation performs better in validating SVs in repeat regions and can determine the homozygosity or heterozygosity of an SV. Additionally, SVvalidation offers the highest recall, precision, and F1-score (improving by 7-16%) across all datasets. Moreover, SVvalidation is suitable for different types of SVs. The program is available at https://github.com/nwpuzhengyan/SVvalidation.
Affiliation(s)
- Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
|
19
|
Zhuang J, Jiang Y, Chen Y, Mao A, Chen J, Chen C. Third-generation sequencing identified two rare α-chain variants leading to hemoglobin variants in Chinese population. Mol Genet Genomic Med 2024; 12:e2365. [PMID: 38284449 PMCID: PMC10801340 DOI: 10.1002/mgg3.2365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Revised: 12/16/2023] [Accepted: 01/10/2024] [Indexed: 01/30/2024] Open
Abstract
BACKGROUND: Rare and novel variants of HBA1/2 and HBB genes resulting in thalassemia and hemoglobin (Hb) variants have been increasingly identified. Our goal was to identify two rare Hb variants in the Chinese population using third-generation sequencing (TGS) technology.
METHODS: Enrolled in this study were two Chinese families from Fujian Province. Hematological screening was conducted using routine blood analysis and Hb capillary electrophoresis analysis. Routine thalassemia gene testing was carried out to detect the common mutations of α- and β-thalassemia in the Chinese population. Rare or novel α- and β-globin gene variants were further investigated by TGS.
RESULTS: The proband of family 1 was a female aged 32, with decreased levels of mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), Hb A2, and abnormal Hb bands in zone 5 and zone 12. No common thalassemia mutations were detected by routine thalassemia analysis, while a rare α-globin gene variant Hb Jilin [α139(HC1)Lys>Gln (AAA>CAA); HBA2:c.418A>C] was identified by TGS. Subsequent pedigree analysis showed that the proband's son also harbored the Hb Jilin variant with slightly low levels of MCH, Hb A2, and abnormal Hb bands. The proband of family 2 was a male at 41 years of age, exhibiting normal MCV and MCH, but a low level of Hb A2 and an abnormal Hb band in zone 12 without any common α- and β-thalassemia mutations. The subsequent TGS detection demonstrated a rare Hb Beijing [α16(A14)Lys>Asn (AAG>AAT); HBA2:c.51G>T] variant in HBA2 gene.
CONCLUSION: In this study, for the first time, we present two rare Hb variants of Hb Jilin and Hb Beijing in Fujian Province, Southeast China, using TGS technology.
Affiliation(s)
- Jianlong Zhuang
- Prenatal Diagnosis Center, Quanzhou Women's and Children's Hospital, Quanzhou, Fujian, China
- Yuying Jiang
- Prenatal Diagnosis Center, Quanzhou Women's and Children's Hospital, Quanzhou, Fujian, China
- Yu'e Chen
- Department of Ultrasound, Quanzhou Women's and Children's Hospital, Quanzhou, Fujian, China
- Aiping Mao
- Department of TGS Research and Development, Berry Genomics Corporation, Beijing, China
- Junwei Chen
- Department of Children Health Care, Quanzhou Women's and Children's Hospital, Quanzhou, China
- Chunnuan Chen
- Department of Neurology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian, China
|
20
|
Chavarro-Carrero EA, Snelders NC, Torres DE, Kraege A, López-Moral A, Petti GC, Punt W, Wieneke J, García-Velasco R, López-Herrera CJ, Seidl MF, Thomma BPHJ. The soil-borne white root rot pathogen Rosellinia necatrix expresses antimicrobial proteins during host colonization. PLoS Pathog 2024; 20:e1011866. [PMID: 38236788 PMCID: PMC10796067 DOI: 10.1371/journal.ppat.1011866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 11/27/2023] [Indexed: 01/22/2024] Open
Abstract
Rosellinia necatrix is a prevalent soil-borne plant-pathogenic fungus that is the causal agent of white root rot disease in a broad range of host plants. The limited availability of genomic resources for R. necatrix has complicated a thorough understanding of its infection biology. Here, we sequenced nine R. necatrix strains with Oxford Nanopore sequencing technology, and with DNA proximity ligation we generated a gapless assembly of one of the genomes into ten chromosomes. Whereas many filamentous pathogens display a so-called two-speed genome with more dynamic and more conserved compartments, the R. necatrix genome does not display such genome compartmentalization. It has recently been proposed that fungal plant pathogens may employ effectors with antimicrobial activity to manipulate the host microbiota to promote infection. In the predicted secretome of R. necatrix, 26 putative antimicrobial effector proteins were identified, nine of which are expressed during plant colonization. Two of the candidates were tested, both of which were found to possess selective antimicrobial activity. Intriguingly, some of the inhibited bacteria are antagonists of R. necatrix growth in vitro and can alleviate R. necatrix infection on cotton plants. Collectively, our data show that R. necatrix encodes antimicrobials that are expressed during host colonization and that may contribute to modulation of host-associated microbiota to stimulate disease development.
Affiliation(s)
- Edgar A. Chavarro-Carrero
- Laboratory of Phytopathology, Wageningen University & Research, Wageningen, The Netherlands
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Nick C. Snelders
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Theoretical Biology & Bioinformatics Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
- David E. Torres
- Laboratory of Phytopathology, Wageningen University & Research, Wageningen, The Netherlands
- Theoretical Biology & Bioinformatics Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
- Anton Kraege
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Ana López-Moral
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Gabriella C. Petti
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Wilko Punt
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Jan Wieneke
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Rómulo García-Velasco
- Laboratory of Phytopathology, Tenancingo University Center, Autonomous University of the State of Mexico, Tenancingo, State of Mexico, Mexico
- Carlos J. López-Herrera
- CSIC, Instituto de Agricultura Sostenible, Dept. Protección de Cultivos, C/Alameda del Obispo s/n, Córdoba, Spain
| | - Michael F. Seidl
- Theoretical Biology & Bioinformatics Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
| | - Bart P. H. J. Thomma
- Laboratory of Phytopathology, Wageningen University & Research, Wageningen, The Netherlands
- Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
21
|
Liu S, Ebel ER, Luniewski A, Zulawinska J, Simpson ML, Kim J, Ene N, Braukmann TWA, Congdon M, Santos W, Yeh E, Guler JL. Direct long read visualization reveals metabolic interplay between two antimalarial drug targets. bioRxiv 2023:2023.02.13.528367. [PMID: 36824743 PMCID: PMC9948948 DOI: 10.1101/2023.02.13.528367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Increases in the copy number of large genomic regions, termed genome amplification, are an important adaptive strategy for malaria parasites. Numerous amplifications across the Plasmodium falciparum genome contribute directly to drug resistance or impact the fitness of this protozoan parasite. During the characterization of parasite lines with amplifications of the dihydroorotate dehydrogenase (DHODH) gene, we detected increased copies of an additional genomic region that encompassed 3 genes (~5 kb) including GTP cyclohydrolase I (GCH1 amplicon). While this gene is reported to increase the fitness of antifolate-resistant parasites, GCH1 amplicons had not previously been implicated in any other antimalarial resistance context. Here, we further explored the association between GCH1 and DHODH copy number. Using long read sequencing and single read visualization, we directly observed a higher number of tandem GCH1 amplicons in parasites with increased DHODH copies (up to 9 amplicons) compared to parental parasites (3 amplicons). While all GCH1 amplicons shared a consistent structure, expansions arose in 2-unit steps (from 3 to 5 to 7 copies, and so on). Adaptive evolution of DHODH and GCH1 loci was further bolstered when we evaluated prior selection experiments; DHODH amplification was only successful in parasite lines with pre-existing GCH1 amplicons. These observations, combined with the direct connection between metabolic pathways that contain these enzymes, lead us to propose that the GCH1 locus is beneficial for the fitness of parasites exposed to DHODH inhibitors. This finding highlights the importance of studying variation within individual parasite genomes as well as biochemical connections of drug targets as novel antimalarials move towards clinical approval.
Affiliation(s)
- Shiwei Liu
  - University of Virginia, Department of Biology, Charlottesville, VA, USA
  - Current affiliation: Indiana University School of Medicine, Indianapolis, IN, USA
- Emily R. Ebel
  - Stanford, Departments of Pediatrics and Microbiology & Immunology, Stanford, CA, USA
- Julia Zulawinska
  - University of Virginia, Department of Biology, Charlottesville, VA, USA
- Jane Kim
  - University of Virginia, Department of Biology, Charlottesville, VA, USA
- Nnenna Ene
  - University of Virginia, Department of Biology, Charlottesville, VA, USA
- Molly Congdon
  - Virginia Tech, Department of Chemistry, Blacksburg, VA, USA
- Webster Santos
  - Virginia Tech, Department of Chemistry, Blacksburg, VA, USA
- Ellen Yeh
  - Stanford University, Departments of Pathology and Microbiology & Immunology, Stanford, CA, USA
- Jennifer L. Guler
  - University of Virginia, Department of Biology, Charlottesville, VA, USA
22
|
LoTempio J, Delot E, Vilain E. Benchmarking long-read genome sequence alignment tools for human genomics applications. PeerJ 2023; 11:e16515. [PMID: 38130927 PMCID: PMC10734412 DOI: 10.7717/peerj.16515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 11/02/2023] [Indexed: 12/23/2023] Open
Abstract
Background: The utility of long-read genome sequencing platforms has been shown in many fields including whole genome assembly, metagenomics, and amplicon sequencing. Less clear is the applicability of long reads to reference-guided human genomics, which is the foundation of genomic medicine. Here, we benchmark available platform-agnostic alignment tools on datasets from nanopore and single-molecule real-time platforms to understand their suitability in producing a genome representation. Results: For this study, we leveraged publicly available data from sample NA12878 generated on Oxford Nanopore and sample NA24385 on Pacific Biosciences platforms. We employed state-of-the-art sequence alignment tools including GraphMap2, long-read aligner (LRA), Minimap2, CoNvex Gap-cost alignMents for Long Reads (NGMLR), and Winnowmap2. Minimap2 and Winnowmap2 were computationally lightweight enough for use at scale, while GraphMap2 was not. NGMLR took a long time and required many resources, but produced alignments each time. LRA was fast, but only worked on Pacific Biosciences data. The tools disagreed widely on which reads to leave unaligned, affecting the final genome coverage and the number of discoverable breakpoints. No alignment tool independently resolved all large structural variants (1,001-100,000 base pairs) present in the Database of Genome Variants (DGV) for sample NA12878 or the truthset for NA24385. Conclusions: These results suggest a combined approach is needed for LRS alignments for human genomics. Specifically, leveraging alignments from three tools will be more effective in generating a complete picture of genomic variability. It should be best practice to use an analysis pipeline that generates alignments with both Minimap2 and Winnowmap2 as they are lightweight and yield different views of the genome. Depending on the question at hand, the data available, and the time constraints, NGMLR and LRA are good options for a third tool.
If computational resources and time are not a factor for a given case or experiment, NGMLR will provide another view, and another chance to resolve a case. LRA, while fast, did not work on the nanopore data for our cluster, but PacBio results were promising in that those computations completed faster than Minimap2. Due to its significant burden on computational resources and slow run time, GraphMap2 is not an ideal tool for exploration of a whole human genome generated on a long-read sequencing platform.
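As an illustration of the abstract's point that aligners disagree on which reads to leave unaligned, the following minimal Python sketch compares the unaligned read sets of two tools and counts how many reads are aligned by at least one of them. It is not the authors' pipeline: the function names, the read IDs, and the per-tool aligned sets are hypothetical toy data, and in practice the aligned read IDs would have to be extracted from each tool's alignment output.

```python
from itertools import combinations


def unaligned_sets(all_reads, aligned_by_tool):
    """Map each tool name to the set of read IDs it left unaligned."""
    reads = set(all_reads)
    return {tool: reads - set(aligned) for tool, aligned in aligned_by_tool.items()}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(unaligned):
    """Jaccard similarity of the unaligned sets for every pair of tools."""
    return {
        (t1, t2): jaccard(unaligned[t1], unaligned[t2])
        for t1, t2 in combinations(sorted(unaligned), 2)
    }


def aligned_by_any(all_reads, aligned_by_tool):
    """Reads aligned by at least one tool (the combined view)."""
    combined = set().union(*(set(a) for a in aligned_by_tool.values()))
    return combined & set(all_reads)


# Hypothetical toy data, not results from the study.
reads = ["r1", "r2", "r3", "r4", "r5", "r6"]
aligned = {
    "minimap2": {"r1", "r2", "r3", "r4"},
    "winnowmap2": {"r1", "r2", "r5"},
}

unaligned = unaligned_sets(reads, aligned)
agreement = pairwise_agreement(unaligned)
print(agreement[("minimap2", "winnowmap2")])  # 0.25
print(len(aligned_by_any(reads, aligned)))    # 5
```

In this toy case each tool alone aligns at most four of six reads, while the combined view covers five, which is the kind of gain a multi-aligner pipeline is meant to capture.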
Affiliation(s)
- Jonathan LoTempio
  - Institute for Clinical and Translational Science, University of California, Irvine, CA, United States of America
  - International Research Laboratory (IRL2006) “Epigenetics, Data, Politics (EpiDaPo)”, Centre National de la Recherche Scientifique, Washington, DC, United States of America
- Emmanuele Delot
  - Center for Genetic Medicine Research, Children’s National Hospital, Washington, DC, United States of America
  - Department of Genomics and Precision Medicine, George Washington University, Washington, DC, United States of America
- Eric Vilain
  - Institute for Clinical and Translational Science, University of California, Irvine, CA, United States of America
  - International Research Laboratory (IRL2006) “Epigenetics, Data, Politics (EpiDaPo)”, Centre National de la Recherche Scientifique, Washington, DC, United States of America
23
|
Yu SY, Xi YL, Xu FQ, Zhang J, Liu YS. Application of long read sequencing in rare diseases: The longer, the better? Eur J Med Genet 2023; 66:104871. [PMID: 38832911 DOI: 10.1016/j.ejmg.2023.104871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/11/2023] [Accepted: 10/22/2023] [Indexed: 06/06/2024]
Abstract
Rare diseases encompass a diverse group of genetic disorders that affect a small proportion of the population. Identifying the underlying genetic causes of these conditions presents significant challenges due to their genetic heterogeneity and complexity. Conventional short-read sequencing (SRS) techniques have been widely used in diagnosing and investigating rare diseases, but they are limited by their short read lengths. In recent years, long-read sequencing (LRS) technologies have emerged as a valuable tool for overcoming these limitations. This minireview provides a concise overview of the applications of LRS in rare disease research and diagnosis, including the identification of disease-causing tandem repeat expansions, structural variations, and comprehensive analysis of pathogenic variants with LRS.
Affiliation(s)
- Si-Yan Yu
  - Department of Pediatric Laboratory, Affiliated Children's Hospital of Jiangnan University (Wuxi Children's Hospital), Wuxi, Jiangsu, China
  - The First School of Clinical Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Yu-Lin Xi
  - Wuxi School of Medicine, Jiangnan University, Wuxi, Jiangsu, China
- Fu-Qiang Xu
  - Department of Gynecology, Beijing Youan Hospital, Capital Medical University, Beijing, China
- Jian Zhang
  - Department of Medical Laboratory, Affiliated Children's Hospital of Jiangnan University (Wuxi Children's Hospital), Wuxi, Jiangsu, China
- Yan-Shan Liu
  - Department of Pediatric Laboratory, Affiliated Children's Hospital of Jiangnan University (Wuxi Children's Hospital), Wuxi, Jiangsu, China
  - Wuxi School of Medicine, Jiangnan University, Wuxi, Jiangsu, China
24
|
Johannesen KM, Tümer Z, Weckhuysen S, Barakat TS, Bayat A. Solving the unsolved genetic epilepsies: Current and future perspectives. Epilepsia 2023; 64:3143-3154. [PMID: 37750451 DOI: 10.1111/epi.17780] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/30/2023] [Revised: 09/21/2023] [Accepted: 09/22/2023] [Indexed: 09/27/2023]
Abstract
Many patients with epilepsy undergo exome or genome sequencing as part of a diagnostic workup; however, many remain genetically unsolved. There are various factors that account for negative results in exome/genome sequencing for patients with epilepsy: (1) the underlying cause is not genetic; (2) there is a complex polygenic explanation; (3) the illness is monogenic but the causative gene remains to be linked to a human disorder; (4) family segregation with reduced penetrance; (5) somatic mosaicism or the complexity of, for example, a structural rearrangement; or (6) limited knowledge or diagnostic tools that hinder the proper classification of a variant, resulting in its designation as a variant of unknown significance. The objective of this review is to outline some of the diagnostic options that lie beyond the exome/genome, and that might become clinically relevant within the foreseeable future. These options include: (1) re-analysis of older exome/genome data as knowledge increases or symptoms change; (2) looking for somatic mosaicism or long-read sequencing to detect low-complexity repeat variants or specific structural variants missed by traditional exome/genome sequencing; (3) exploration of the non-coding genome including disruption of topologically associated domains, long range non-coding RNA, or other regulatory elements; and finally (4) transcriptomics, DNA methylation signatures, and metabolomics as complementary diagnostic methods that may be used in the assessment of variants of unknown significance. Some of these tools are currently not integrated into standard diagnostic workup. However, it is reasonable to expect that they will become increasingly available and improve current diagnostic capabilities, thereby enabling precision diagnosis in patients who are currently undiagnosed.
Affiliation(s)
- Katrine M Johannesen
  - Department of Genetics, Copenhagen University Hospital, Copenhagen, Denmark
  - Department of Epilepsy Genetics and Personalized Medicine, The Danish Epilepsy Center, Dianalund, Denmark
- Zeynep Tümer
  - Department of Genetics, Copenhagen University Hospital, Copenhagen, Denmark
  - Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Sarah Weckhuysen
  - Applied and Translational Neurogenomics Group, VIB Centre for Molecular Neurology, Antwerp, Belgium
  - Translational Neurosciences, Faculty of Medicine and Health Science, University of Antwerp, Antwerp, Belgium
  - Department of Neurology, University Hospital Antwerp, Antwerp, Belgium
  - μNEURO Research Centre of Excellence, University of Antwerp, Antwerp, Belgium
- Tahsin Stefan Barakat
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
  - Discovery Unit, Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
  - ENCORE Expertise Center for Neurodevelopmental Disorders, Erasmus Medical Center, Rotterdam, The Netherlands
- Allan Bayat
  - Department of Epilepsy Genetics and Personalized Medicine, The Danish Epilepsy Center, Dianalund, Denmark
  - Department of Regional Health Research, University of Southern Denmark, Odense, Denmark
  - Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
25
|
Girgis ST, Adika E, Nenyewodey FE, Senoo Jnr DK, Ngoi JM, Bandoh K, Lorenz O, van de Steeg G, Harrott AJR, Nsoh S, Judge K, Pearson RD, Almagro-Garcia J, Saiid S, Atampah S, Amoako EK, Morang'a CM, Asoala V, Adjei ES, Burden W, Roberts-Sengier W, Drury E, Pierce ML, Gonçalves S, Awandare GA, Kwiatkowski DP, Amenga-Etego LN, Hamilton WL. Drug resistance and vaccine target surveillance of Plasmodium falciparum using nanopore sequencing in Ghana. Nat Microbiol 2023; 8:2365-2377. [PMID: 37996707 PMCID: PMC10686832 DOI: 10.1038/s41564-023-01516-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 10/06/2023] [Indexed: 11/25/2023]
Abstract
Malaria results in over 600,000 deaths annually, with the highest burden of deaths in young children living in sub-Saharan Africa. Molecular surveillance can provide important information for malaria control policies, including detection of antimalarial drug resistance. However, genome sequencing capacity in malaria-endemic countries is limited. We designed and implemented an end-to-end workflow to detect Plasmodium falciparum antimalarial resistance markers and diversity in the vaccine target circumsporozoite protein (csp) using nanopore sequencing in Ghana. We analysed 196 clinical samples and showed that our method is rapid, robust, accurate and straightforward to implement. Importantly, our method could be applied to dried blood spot samples, which are readily collected in endemic settings. We report that P. falciparum parasites in Ghana are mostly susceptible to chloroquine, with persistent sulfadoxine-pyrimethamine resistance and no evidence of artemisinin resistance. Multiple single nucleotide polymorphisms were identified in csp, but their significance is uncertain. Our study demonstrates the feasibility of nanopore sequencing for malaria genomic surveillance in endemic countries.
Affiliation(s)
- Sophia T Girgis
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Edem Adika
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Felix E Nenyewodey
  - Navrongo Health Research Centre (NHRC), Ghana Health Service, Navrongo, Upper East Region, Ghana
- Dodzi K Senoo Jnr
  - Navrongo Health Research Centre (NHRC), Ghana Health Service, Navrongo, Upper East Region, Ghana
- Joyce M Ngoi
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Kukua Bandoh
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Oliver Lorenz
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Guus van de Steeg
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Sebastian Nsoh
  - Navrongo Health Research Centre (NHRC), Ghana Health Service, Navrongo, Upper East Region, Ghana
- Kim Judge
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Richard D Pearson
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Samirah Saiid
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Solomon Atampah
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Enock K Amoako
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Collins M Morang'a
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Victor Asoala
  - Navrongo Health Research Centre (NHRC), Ghana Health Service, Navrongo, Upper East Region, Ghana
- Elrmion S Adjei
  - Ledzokuku Krowor Municipal Assembly (LEKMA) Hospital, Accra, Ghana
- William Burden
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Eleanor Drury
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Megan L Pierce
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Sónia Gonçalves
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Gordon A Awandare
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- Lucas N Amenga-Etego
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), College of Basic and Applied Sciences, University of Ghana, Legon, Ghana
- William L Hamilton
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
  - Department of Medicine, University of Cambridge, Cambridge, UK
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
26
|
Ren L, Duan X, Dong L, Zhang R, Yang J, Gao Y, Peng R, Hou W, Liu Y, Li J, Yu Y, Zhang N, Shang J, Liang F, Wang D, Chen H, Sun L, Hao L, Scherer A, Nordlund J, Xiao W, Xu J, Tong W, Hu X, Jia P, Ye K, Li J, Jin L, Hong H, Wang J, Fan S, Fang X, Zheng Y, Shi L. Quartet DNA reference materials and datasets for comprehensively evaluating germline variant calling performance. Genome Biol 2023; 24:270. [PMID: 38012772 PMCID: PMC10680274 DOI: 10.1186/s13059-023-03109-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/08/2022] [Accepted: 11/13/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND: Genomic DNA reference materials are widely recognized as essential for ensuring data quality in omics research. However, relying solely on reference datasets to evaluate the accuracy of variant calling results is incomplete, as they are limited to benchmark regions. Therefore, it is important to develop DNA reference materials that enable the assessment of variant detection performance across the entire genome. RESULTS: We established a DNA reference material suite from four immortalized cell lines derived from a family of parents and monozygotic twins. Comprehensive reference datasets of 4.2 million small variants and 15,000 structural variants were integrated and certified for evaluating the reliability of germline variant calls inside the benchmark regions. Importantly, the genetic built-in-truth of the Quartet family design enables estimation of the precision of variant calls outside the benchmark regions. Using the Quartet reference materials along with study samples, batch effects are objectively monitored and alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Moreover, the matched RNA and protein reference materials and datasets from the Quartet project enable cross-omics validation of variant calls from multiomics data. CONCLUSIONS: The Quartet DNA reference materials and reference datasets provide a unique resource for objectively assessing the quality of germline variant calls throughout the whole-genome regions and improving the reliability of large-scale genomic profiling.
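The "built-in truth" of a parents-plus-monozygotic-twins design can be illustrated with a small Python sketch: a call is treated as pedigree-consistent when both twins carry the same genotype and that genotype can be formed from one allele of each parent. This is only a sketch of the general idea under stated assumptions, not the Quartet project's actual procedure; the function names, the allele-tuple genotype encoding, and the example sites are hypothetical.

```python
from itertools import product


def mendelian_consistent(father, mother, child):
    """True if the child genotype can be built from one allele of each parent."""
    target = sorted(child)
    return any(sorted((f, m)) == target for f, m in product(father, mother))


def quartet_consistent(father, mother, twin1, twin2):
    """Twins must share a genotype, and that genotype must obey Mendelian inheritance."""
    return sorted(twin1) == sorted(twin2) and mendelian_consistent(father, mother, twin1)


def consistency_rate(sites):
    """Fraction of sites whose four genotypes are pedigree-consistent."""
    if not sites:
        return 0.0
    return sum(quartet_consistent(*site) for site in sites) / len(sites)


# Hypothetical genotypes as (allele, allele) tuples: father, mother, twin 1, twin 2.
sites = [
    ((0, 1), (0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 0), (0, 0), (0, 1), (0, 1)),  # twins agree, but the allele is absent from both parents
    ((0, 1), (0, 1), (1, 1), (0, 1)),  # twins disagree
    ((1, 1), (0, 1), (1, 1), (1, 1)),  # consistent
]
print(consistency_rate(sites))  # 0.5
```

Inconsistent sites are candidates for calling errors, although a real analysis would also need to account for rare de novo variants, which the sketch flags as inconsistent.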
Affiliation(s)
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Xiaoke Duan
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Rui Zhang
  - National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
  - Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Yuechen Gao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Rongxue Peng
  - National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Yaqing Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Jingjing Li
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
  - Nextomics Biosciences Institute, Wuhan, Hubei, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Naixin Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Jun Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Fan Liang
  - Nextomics Biosciences Institute, Wuhan, Hubei, China
- Depeng Wang
  - Nextomics Biosciences Institute, Wuhan, Hubei, China
- Hui Chen
  - OrigiMed Co., Ltd, Shanghai, China
- Lele Sun
  - Sequanta Technologies Co., Ltd, Shanghai, China
- Andreas Scherer
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
  - EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
- Jessica Nordlund
  - EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Department of Medical Sciences, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Wenming Xiao
  - Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Xin Hu
  - Shanghai Cancer Center, Fudan University, Shanghai, China
- Peng Jia
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Kai Ye
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Jinming Li
  - National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Li Jin
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Jing Wang
  - National Institute of Metrology, Beijing, China
- Shaohua Fan
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Xiang Fang
  - National Institute of Metrology, Beijing, China
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
  - Shanghai Cancer Center, Fudan University, Shanghai, China
  - International Human Phenome Institutes, Shanghai, China
27
|
Zhang S, Cui Q, Yang S, Zhang F, Li C, Wang X, Lei B, Sheng X. Exome and genome sequencing to unravel the precise breakpoints of partial trisomy 6q and partial Monosomy 2q. BMC Pediatr 2023; 23:586. [PMID: 37993819 PMCID: PMC10664609 DOI: 10.1186/s12887-023-04368-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 10/15/2023] [Indexed: 11/24/2023] Open
Abstract
BACKGROUND: Patients with complex phenotypes and a chromosomal translocation are particularly challenging, since several potentially pathogenic mechanisms need to be investigated. CASE PRESENTATION: Here, we combined exome and genome sequencing techniques to identify the precise breakpoints of heterozygous microduplications in the 6q25.3-q27 region and microdeletions in the 2q37.1-q37.3 region in a proband. The 5-year-old girl exhibited a severe form of congenital cranial dysinnervation disorder (CCDD) in addition to skeletal dysmorphic anomalies and severe intellectual disability. This is the second case affecting chromosomes 2q and 6q. The individual's karyotype showed an unbalanced translocation 46,XX,del(2)t(2;6)(q37.1;q25.3), which was inherited from her unaffected father [46,XY,t(2;6)(q37.1;q25.3)]. We also obtained the precise breakpoints of a de novo heterozygous copy number deletion [del(2)(q37.1q37.3)chr2:g.232963568_24305260del] and a copy number duplication [dup(6)(q25.3q27)chr6:g.158730978_170930050dup]. The parental origin of the observed balanced translocation was not clear because the parents declined genetic testing. CONCLUSION: Patients with a 2q37 deletion and 6q25.3 duplication may exhibit severe neurological and skeletal dysmorphisms, and the utilization of exome and genome sequencing techniques has the potential to unveil the entire translocation of the CNV and the precise breakpoint.
Affiliation(s)
- Shuang Zhang
  - People's Hospital of Ningxia Hui Autonomous Region (Ningxia Medical University), Ningxia Eye Hospital, Yinchuan, 750001, China
- Qianwei Cui
  - People's Hospital of Ningxia Hui Autonomous Region (Ningxia Medical University), Ningxia Eye Hospital, Yinchuan, 750001, China
- Shangying Yang
  - People's Hospital of Ningxia Hui Autonomous Region (Ningxia Medical University), Ningxia Eye Hospital, Yinchuan, 750001, China
- Fangxia Zhang
  - People's Hospital of Ningxia Hui Autonomous Region (Ningxia Medical University), Ningxia Eye Hospital, Yinchuan, 750001, China
- Chunxia Li
  - People's Hospital of Ningxia Hui Autonomous Region (Ningxia Medical University), Ningxia Eye Hospital, Yinchuan, 750001, China
- Xiaoguang Wang
  - People's Hospital of Ningxia Hui Autonomous Region (Ningxia Medical University), Ningxia Eye Hospital, Yinchuan, 750001, China
- Bo Lei
  - Henan Eye Institute, Henan Eye Hospital, People's Hospital of Zhengzhou University, Henan Provincial People's Hospital, Zhengzhou, Henan, 450003, China
- Xunlun Sheng
  - Gansu Aier Ophthalmology & Optometry Hospital, Lanzhou, 730030, China
28
|
Lin W, Chu L, Su Y, Xie R, Yao X, Zan X, Xu P, Liu W. Limit and screen sequences with high degree of secondary structures in DNA storage by deep learning method. Comput Biol Med 2023; 166:107548. [PMID: 37801922 DOI: 10.1016/j.compbiomed.2023.107548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Revised: 08/24/2023] [Accepted: 09/28/2023] [Indexed: 10/08/2023]
Abstract
BACKGROUND: In single-stranded DNAs/RNAs, secondary structures are very common, especially in long sequences. It has been recognized that a high degree of secondary structure in DNA sequences could interfere with the correct writing and reading of information in DNA storage. However, how to circumvent this side effect is seldom studied. METHOD: As the degree of secondary structure of DNA sequences is closely related to the magnitude of the free energy released in the complicated folding process, we first investigate the free-energy distribution at different encoding lengths based on randomly generated DNA sequences. Then, we construct a bidirectional long short-term memory (BiLSTM)-attention deep learning model to predict the free energy of sequences. RESULTS: Our simulation results indicate that the free energy of DNA sequences at a specific length follows a right-skewed distribution and the mean increases as the length increases. Given a tolerable free energy threshold of 20 kcal/mol, we could keep the proportion of encoding sequences with serious secondary structures within the 1% significance level by selecting a feasible encoding length of 100 nt. Compared with traditional deep learning models, the proposed model achieved better prediction performance in both the mean relative error (MRE) and the coefficient of determination (R²), with MRE = 0.109 and R² = 0.918 in the simulation experiment. The combination of the BiLSTM and attention module can handle long-term dependencies and capture the features of base pairing. Further, the prediction has linear time complexity, which makes it suitable for detecting sequences with severe secondary structures in future large-scale applications. Finally, 70 of 94 sequences can be screened out by their predicted free energy on a real dataset. This demonstrates that the proposed model could screen out some highly suspicious sequences which are prone to produce more errors and low sequencing copies.
Affiliation(s)
- Wanmin Lin
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
- Ling Chu
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
- Yanqing Su
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
- Ranze Xie
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
- Xiangyu Yao
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
- Xiangzhen Zan
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
- Peng Xu
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
  - School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, Guizhou, China
  - Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, Guangdong, China
- Wenbin Liu
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China
  - Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, Guangdong, China
29
Allou L, Mundlos S. Disruption of regulatory domains and novel transcripts as disease-causing mechanisms. Bioessays 2023; 45:e2300010. [PMID: 37381881 DOI: 10.1002/bies.202300010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 05/24/2023] [Accepted: 06/06/2023] [Indexed: 06/30/2023]
Abstract
Deletions, duplications, insertions, inversions, and translocations, collectively called structural variations (SVs), affect more base pairs of the genome than any other sequence variant. The recent technological advancements in genome sequencing have enabled the discovery of tens of thousands of SVs per human genome. These SVs primarily affect non-coding DNA sequences, but the difficulties in interpreting their impact limit our understanding of human disease etiology. The functional annotation of non-coding DNA sequences and methodologies to characterize their three-dimensional (3D) organization in the nucleus have greatly expanded our understanding of the basic mechanisms underlying gene regulation, thereby improving the interpretation of SVs for their pathogenic impact. Here, we discuss the various mechanisms by which SVs can result in altered gene regulation and how these mechanisms can result in rare genetic disorders. Beyond changing gene expression, SVs can produce novel gene-intergenic fusion transcripts at the SV breakpoints.
Affiliation(s)
- Lila Allou
- RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Stefan Mundlos
- RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Berlin-Brandenburg Center for Regenerative Therapies, Charité-Universitätsmedizin Berlin, Berlin, Germany
30
Lin Z, Lu Y, Yu G, Teng H, Wang B, Yang Y, Li Q, Sun Z, Xu S, Wang W, Tian P. Genome-wide DNA methylation landscape of four Chinese populations and epigenetic variation linked to Tibetan high-altitude adaptation. Sci China Life Sci 2023; 66:2354-2369. [PMID: 37115492 DOI: 10.1007/s11427-022-2284-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 01/18/2023] [Indexed: 04/29/2023]
Abstract
DNA methylation (DNAm) is one of the major epigenetic mechanisms in humans and is important in diverse cellular processes. The variation of DNAm in the human population is related to both genetic and environmental factors. However, DNAm profiles have not been investigated in the Chinese population of diverse ethnicities. Here, we performed double-strand bisulfite sequencing (DSBS) for 32 Chinese individuals representing four major ethnic groups: Han Chinese, Tibetan, Zhuang, and Mongolian. We identified a total of 604,649 SNPs and quantified DNAm at more than 14 million CpGs in the population. We found that the global DNAm-based epigenetic structure differs from the genetic structure of the population, and that ethnic difference only partially explains the variation of DNAm. Surprisingly, non-ethnic-specific DNAm variations showed a stronger correlation with global genetic divergence than ethnic-specific DNAm variations did. Differentially methylated regions (DMRs) among these ethnic groups were found around genes involved in diverse biological processes. In particular, the DMR-genes between Tibetans and non-Tibetans were enriched around high-altitude genes including EPAS1 and EGLN1, suggesting that DNAm alteration plays an important role in high-altitude adaptation. Our results provide the first batch of epigenetic maps for Chinese populations and the first evidence of the association of epigenetic changes with Tibetans' high-altitude adaptation.
Affiliation(s)
- Zeshan Lin
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, 710072, China
- Yan Lu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
- Guoliang Yu
- GrandOmics Biosciences, Beijing, 102200, China
- Huajing Teng
- Department of Radiation Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, 100142, China
- Bao Wang
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, 710072, China
- Yajun Yang
- Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China
- Qinglan Li
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China
- Zhongsheng Sun
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China
- Shuhua Xu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China.
- Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China.
- School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, China.
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China.
- Wen Wang
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, 710072, China.
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China.
- Peng Tian
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Agronomy, Northwest A&F University, Yangling, 712100, China.
31
Xia X, Zhang F, Li S, Luo X, Peng L, Dong Z, Pausch H, Leonard AS, Crysnanto D, Wang S, Tong B, Lenstra JA, Han J, Li F, Xu T, Gu L, Jin L, Dang R, Huang Y, Lan X, Ren G, Wang Y, Gao Y, Ma Z, Cheng H, Ma Y, Chen H, Pang W, Lei C, Chen N. Structural variation and introgression from wild populations in East Asian cattle genomes confer adaptation to local environment. Genome Biol 2023; 24:211. [PMID: 37723525 PMCID: PMC10507960 DOI: 10.1186/s13059-023-03052-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/21/2022] [Accepted: 09/07/2023] [Indexed: 09/20/2023] Open
Abstract
BACKGROUND Structural variations (SVs) in individual genomes are major determinants of complex traits, including adaptability to environmental variables. The Mongolian and Hainan cattle breeds in East Asia are of taurine and indicine origins that have evolved to adapt to cold and hot environments, respectively. However, few studies have investigated SVs in East Asian cattle genomes and their roles in environmental adaptation, and little is known about adaptively introgressed SVs in East Asian cattle. RESULTS In this study, we examine the roles of SVs in the climate adaptation of these two cattle lineages by generating highly contiguous chromosome-scale genome assemblies. Comparison of the two assemblies along with 18 Mongolian and Hainan cattle genomes obtained from long-read sequencing data provides a catalog of 123,898 nonredundant SVs. Several SVs detected from long reads are in exons of genes associated with epidermal differentiation, skin barrier, and bovine tuberculosis resistance. Functional investigations show that a 108-bp exonic insertion in SPN may affect the uptake of Mycobacterium tuberculosis by macrophages, which might contribute to the low susceptibility of Hainan cattle to bovine tuberculosis. Genotyping of 373 whole genomes from 39 breeds identifies 2610 SVs that are differentiated along a "north-south" gradient in China and overlap with 862 related genes that are enriched in pathways related to environmental adaptation. We identify 1457 Chinese indicine-stratified SVs that possibly originate from banteng and are frequent in Chinese indicine cattle. CONCLUSIONS Our findings highlight the unique contribution of SVs in East Asian cattle to environmental adaptation and disease resistance.
Affiliation(s)
- Xiaoting Xia
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Fengwei Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Shuang Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Xiaoyu Luo
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Lixin Peng
- National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, China
- Zheng Dong
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
- Alexander S Leonard
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
- Danang Crysnanto
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8006, Zurich, Switzerland
- Shikang Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Bin Tong
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, School of Life Sciences, Inner Mongolia University, Hohhot, China
- Johannes A Lenstra
- Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
- Jianlin Han
- Livestock Genetics Program, International Livestock Research Institute (ILRI), Nairobi, Kenya
- CAAS-ILRI Joint Laboratory On Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agriculture Sciences (CAAS), Beijing, China
- Fuyong Li
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Tieshan Xu
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
- Lihong Gu
- Institute of Animal Science & Veterinary Medicine, Hainan Academy of Agricultural Sciences, Haikou, China
- Liangliang Jin
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Ruihua Dang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Yongzhen Huang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Xianyong Lan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Gang Ren
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Yu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Yuanpeng Gao
- College of Veterinary Medicine, Northwest A&F University, Xianyang, Yangling, China
- Zhijie Ma
- Qinghai Academy of Animal Science and Veterinary Medicine, Qinghai University, Xining, China
- Haijian Cheng
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Shandong Key Lab of Animal Disease Control and Breeding, Jinan, China
- Yun Ma
- Key Laboratory of Ruminant Molecular and Cellular Breeding of Ningxia Hui Autonomous Region, School of Agriculture, Ningxia University, Yinchuan, China
- Hong Chen
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China
- Weijun Pang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China.
- Chuzhao Lei
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China.
- Ningbo Chen
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Xianyang, China.
32
Romagnoli S, Bartalucci N, Vannucchi AM. Resolving complex structural variants via nanopore sequencing. Front Genet 2023; 14:1213917. [PMID: 37674481 PMCID: PMC10479017 DOI: 10.3389/fgene.2023.1213917] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/28/2023] [Accepted: 07/26/2023] [Indexed: 09/08/2023] Open
Abstract
The recent development of high-throughput sequencing platforms has provided impressive insights into the field of human genetics and contributed to considering structural variants (SVs) as the hallmark of genome instability, leading to the establishment of several pathologic conditions, including neoplasia and neurodegenerative and cognitive disorders. While SV detection is addressed by next-generation sequencing (NGS) technologies, more recent long-read sequencing technologies have already proven invaluable in overcoming the inaccuracy and limitations that NGS technologies show, owing to their short read length (100-500 bp), when applied to resolving wide and structurally complex SVs. Among the long-read sequencing technologies, Oxford Nanopore Technologies developed a sequencing platform based on a protein nanopore that allows the sequencing of "native" long DNA molecules of virtually unlimited length (typical range 1-100 kb). In this review, we focus on the bioinformatics methods that improve the identification and genotyping of known and novel SVs to investigate human pathological conditions, discussing the possibility of introducing nanopore sequencing technology into routine diagnostics.
Affiliation(s)
- Alessandro Maria Vannucchi
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, DENOTHE Excellence Center, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
33
Shiraishi Y, Koya J, Chiba K, Okada A, Arai Y, Saito Y, Shibata T, Kataoka K. Precise characterization of somatic complex structural variations from tumor/control paired long-read sequencing data with nanomonsv. Nucleic Acids Res 2023; 51:e74. [PMID: 37336583 PMCID: PMC10415145 DOI: 10.1093/nar/gkad526] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/14/2023] [Revised: 05/23/2023] [Accepted: 06/07/2023] [Indexed: 06/21/2023] Open
Abstract
We present our novel software, nanomonsv, for detecting somatic structural variations (SVs) at single-base resolution using tumor and matched control long-read sequencing data. The current version of nanomonsv includes two detection modules: the Canonical SV module and the Single breakend SV module. Using tumor/control paired long-read sequencing data from three cancer and their matched lymphoblastoid lines, we demonstrate that the Canonical SV module can identify somatic SVs that can be captured by short-read technologies with higher precision and recall than existing methods. In addition, we have developed a workflow to classify mobile element insertions while elucidating their in-depth properties, such as 5' truncations, internal inversions, and source sites for 3' transductions. Furthermore, the Single breakend SV module enables the detection of complex SVs that can only be identified by long reads, such as SVs involving highly repetitive centromeric sequences, and LINE1- and virus-mediated rearrangements. In summary, our approaches applied to cancer long-read sequencing data can reveal various features of somatic SVs and will lead to a better understanding of mutational processes and functional consequences of somatic SVs.
Affiliation(s)
- Yuichi Shiraishi
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Junji Koya
- Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
- Kenichi Chiba
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Ai Okada
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Yasuhito Arai
- Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
- Yuki Saito
- Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
- Department of Gastroenterology, Keio University School of Medicine, Tokyo, Japan
- Tatsuhiro Shibata
- Division of Cancer Genomics, National Cancer Center Research Institute, Tokyo, Japan
- Laboratory of Molecular Medicine, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Keisuke Kataoka
- Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
- Department of Hematology, Keio University School of Medicine, Tokyo, Japan
34
Wojcik MH, Reuter CM, Marwaha S, Mahmoud M, Duyzend MH, Barseghyan H, Yuan B, Boone PM, Groopman EE, Délot EC, Jain D, Sanchis-Juan A, Starita LM, Talkowski M, Montgomery SB, Bamshad MJ, Chong JX, Wheeler MT, Berger SI, O'Donnell-Luria A, Sedlazeck FJ, Miller DE. Beyond the exome: What's next in diagnostic testing for Mendelian conditions. Am J Hum Genet 2023; 110:1229-1248. [PMID: 37541186 PMCID: PMC10432150 DOI: 10.1016/j.ajhg.2023.06.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 04/02/2023] [Revised: 06/13/2023] [Accepted: 06/14/2023] [Indexed: 08/06/2023] Open
Abstract
Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order, and emerging technologies, such as optical genome mapping and long-read DNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to research consortia focused on elucidating the underlying cause of rare unsolved genetic disorders.
Affiliation(s)
- Monica H Wojcik
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA; Division of Newborn Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Chloe M Reuter
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Shruti Marwaha
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Michael H Duyzend
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Hayk Barseghyan
- Center for Genetics Medicine Research, Children's National Research Institute, Children's National Hospital, Washington, DC 20010, USA; Department of Genomics and Precision Medicine, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Bo Yuan
- Department of Molecular and Human Genetics and Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Philip M Boone
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Emily E Groopman
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Emmanuèle C Délot
- Department of Genomics and Precision Medicine, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA; Center for Genetics Medicine Research, Children's National Research and Innovation Campus, Washington, DC, USA; Department of Pediatrics, George Washington University, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Deepti Jain
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA 98195, USA
- Alba Sanchis-Juan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Lea M Starita
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Michael Talkowski
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stephen B Montgomery
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Michael J Bamshad
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Department of Pediatrics, Division of Genetic Medicine, University of Washington, Seattle, WA 98195, USA
- Jessica X Chong
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA; Department of Pediatrics, Division of Genetic Medicine, University of Washington, Seattle, WA 98195, USA
- Matthew T Wheeler
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Seth I Berger
- Center for Genetics Medicine Research and Rare Disease Institute, Children's National Hospital, Washington, DC 20010, USA
- Anne O'Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA; Department of Computer Science, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Danny E Miller
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA; Department of Pediatrics, Division of Genetic Medicine, University of Washington, Seattle, WA 98195, USA; Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA.
35
Ahsan MU, Liu Q, Perdomo JE, Fang L, Wang K. A survey of algorithms for the detection of genomic structural variants from long-read sequencing data. Nat Methods 2023; 20:1143-1158. [PMID: 37386186 PMCID: PMC11208083 DOI: 10.1038/s41592-023-01932-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/16/2021] [Accepted: 05/31/2023] [Indexed: 07/01/2023]
Abstract
As long-read sequencing technologies are becoming increasingly popular, a number of methods have been developed for the discovery and analysis of structural variants (SVs) from long reads. Long reads enable detection of SVs that could not be previously detected from short-read sequencing, but computational methods must adapt to the unique challenges and opportunities presented by long-read sequencing. Here, we summarize over 50 long-read-based methods for SV detection, genotyping and visualization, and discuss how new telomere-to-telomere genome assemblies and pangenome efforts can improve the accuracy and drive the development of SV callers in the future.
Affiliation(s)
- Mian Umair Ahsan
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Qian Liu
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Jonathan Elliot Perdomo
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- School of Biomedical Engineering, Drexel University, Philadelphia, PA, USA
- Li Fang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
36
Park K, Noh J, Kim K, Kim J, Cho HK, Kim SG, Yang E, Kim WK, Song JW. A Development of Rapid Whole-Genome Sequencing of Seoul orthohantavirus Using a Portable One-Step Amplicon-Based High Accuracy Nanopore System. Viruses 2023; 15:1542. [PMID: 37515228 PMCID: PMC10386077 DOI: 10.3390/v15071542] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/21/2023] [Revised: 07/10/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] Open
Abstract
Whole-genome sequencing provides a robust platform for investigating the epidemiology and transmission of emerging viruses. Oxford Nanopore Technologies allows for real-time viral sequencing on a local laptop system for point-of-care testing. Seoul orthohantavirus (Seoul virus, SEOV), harbored by Rattus norvegicus and R. rattus, causes mild hemorrhagic fever with renal syndrome and poses an important threat to public health worldwide. We evaluated the deployable MinION system for obtaining high-fidelity full-length SEOV sequences to accurately identify infectious sources and their genetic diversity at the genome level. One-step amplicon-based nanopore sequencing was performed on SEOV 80-39 specimens with different viral copy numbers and on SEOV-positive wild rats. The KU-ONT-SEOV-consensus module was developed to analyze SEOV genomic sequences generated from the nanopore system. Using amplicon-based nanopore sequencing and the KU-ONT-consensus pipeline, we demonstrated novel molecular diagnostics for acquiring full-length SEOV genome sequences with sufficient read depth in less than 6 h. The consensus sequences of the SEOV small, medium, and large genome segments showed 99.75-100% (for the SEOV 80-39 isolate) and 99.62-99.89% (for SEOV-positive rats) identity. This study provides useful insights into on-site diagnostics based on nanopore technology and the genome epidemiology of orthohantaviruses for a quicker response to hantaviral outbreaks.
Affiliation(s)
- Kyungmin Park
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Juyoung Noh
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Kijin Kim
  - Centre for Infectious Disease Genomics and One Health, Faculty of Health Sciences, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Jongwoo Kim
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Hee-Kyung Cho
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Seong-Gyu Kim
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Eunyoung Yang
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
- Won-Keun Kim
  - Department of Microbiology, College of Medicine, Hallym University, Chuncheon 24252, Republic of Korea
  - Institute of Medical Research, College of Medicine, Hallym University, Chuncheon 24252, Republic of Korea
- Jin-Won Song
  - Department of Microbiology, College of Medicine, Korea University, Seoul 02841, Republic of Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul 02841, Republic of Korea
37
Pyrak E, Kowalczyk A, Weyher JL, Nowicka AM, Kudelski A. Influence of sandwich-type DNA construction strategy and plasmonic metal on signal generated by SERS DNA sensors. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2023; 295:122606. [PMID: 36934597 DOI: 10.1016/j.saa.2023.122606] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/19/2023] [Revised: 02/26/2023] [Accepted: 03/06/2023] [Indexed: 06/18/2023]
Abstract
The DNA biosensors are powerful tools in the gene mutation or pathogens detection. That is why there are a lot of DNA detection strategies and methods. Here we present the insight on a slightly overlooked DNA detection technique, surface-enhanced Raman scattering (SERS). The present work is a summary of the influence of the plasmonic metal of the SERS substrate and strategy of the sandwich-type biosensor construction, simply the placement of the Raman reporter and mismatches, on the SERS signal enhancement. We found that, although in general there is an increase in the intensity of the SERS signal when the distance between the Raman scatterer and the SERS-active surface decreases, for this type of DNA SERS sensor a greater intensity of the measured Raman signal is usually observed when the Raman reporter is farther away from the plasmonic substrate. This is probably caused by a significant change in the hybridisation efficiency for the different structures of the sensor analysed due to some steric hindrances.
Affiliation(s)
- Edyta Pyrak
  - Faculty of Chemistry, University of Warsaw, Pasteura 1 Str., PL 02-093 Warsaw, Poland; Nencki Institute of Experimental Biology of Polish Academy of Sciences, Pasteura 3 St., 02-093 Warsaw, Poland
- Agata Kowalczyk
  - Faculty of Chemistry, University of Warsaw, Pasteura 1 Str., PL 02-093 Warsaw, Poland
- Jan L Weyher
  - Institute of High Pressure Physics of the Polish Academy of Science, Sokolowska 29/37 Str., PL 01-142 Warsaw, Poland
- Anna M Nowicka
  - Faculty of Chemistry, University of Warsaw, Pasteura 1 Str., PL 02-093 Warsaw, Poland
- Andrzej Kudelski
  - Faculty of Chemistry, University of Warsaw, Pasteura 1 Str., PL 02-093 Warsaw, Poland.
38
Turner AJ, Derezinski AD, Gaedigk A, Berres ME, Gregornik DB, Brown K, Broeckel U, Scharer G. Characterization of complex structural variation in the CYP2D6-CYP2D7-CYP2D8 gene loci using single-molecule long-read sequencing. Front Pharmacol 2023; 14:1195778. [PMID: 37426826 PMCID: PMC10324673 DOI: 10.3389/fphar.2023.1195778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 05/30/2023] [Indexed: 07/11/2023]
Abstract
Complex regions in the human genome such as repeat motifs, pseudogenes and structural (SVs) and copy number variations (CNVs) present ongoing challenges to accurate genetic analysis, particularly for short-read Next-Generation-Sequencing (NGS) technologies. One such region is the highly polymorphic CYP2D loci, containing CYP2D6, a clinically relevant pharmacogene contributing to the metabolism of >20% of common drugs, and two highly similar pseudogenes, CYP2D7 and CYP2D8. Multiple complex SVs, including CYP2D6/CYP2D7-derived hybrid genes are known to occur in different configurations and frequencies across populations and are difficult to detect and characterize accurately. This can lead to incorrect enzyme activity assignment and impact drug dosing recommendations, often disproportionally affecting underrepresented populations. To improve CYP2D6 genotyping accuracy, we developed a PCR-free CRISPR-Cas9 based enrichment method for targeted long-read sequencing that fully characterizes the entire CYP2D6-CYP2D7-CYP2D8 loci. Clinically relevant sample types, including blood, saliva, and liver tissue were sequenced, generating high coverage sets of continuous single molecule reads spanning the entire targeted region of up to 52 kb, regardless of SV present (n = 9). This allowed for fully phased dissection of the entire loci structure, including breakpoints, to accurately resolve complex CYP2D6 diplotypes with a single assay. Additionally, we identified three novel CYP2D6 suballeles, and fully characterized 17 CYP2D7 and 18 CYP2D8 unique haplotypes. This method for CYP2D6 genotyping has the potential to significantly improve accurate clinical phenotyping to inform drug therapy and can be adapted to overcome testing limitations of other clinically challenging genomic regions.
Affiliation(s)
- Andrea Gaedigk
  - Children’s Mercy Research Institute, Kansas City, MO, United States
- Mark E. Berres
  - Biotechnology Center, University of Wisconsin Madison, Madison, WI, United States
- Keith Brown
  - Jumpcode Genomics, San Diego, CA, United States
39
Boßelmann CM, Leu C, Lal D. Technological and computational approaches to detect somatic mosaicism in epilepsy. Neurobiol Dis 2023:106208. [PMID: 37343892 DOI: 10.1016/j.nbd.2023.106208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2023] [Revised: 06/03/2023] [Accepted: 06/16/2023] [Indexed: 06/23/2023]
Abstract
Lesional epilepsy is a common and severe disease commonly associated with malformations of cortical development, including focal cortical dysplasia and hemimegalencephaly. Recent advances in sequencing and variant calling technologies have identified several genetic causes, including both short/single nucleotide and structural somatic variation. In this review, we aim to provide a comprehensive overview of the methodological advancements in this field while highlighting the unresolved technological and computational challenges that persist, including ultra-low variant allele fractions in bulk tissue, low availability of paired control samples, spatial variability of mutational burden within the lesion, and the issue of false-positive calls and validation procedures. Information from genetic testing in focal epilepsy may be integrated into clinical care to inform histopathological diagnosis, postoperative prognosis, and candidate precision therapies.
Affiliation(s)
- Christian M Boßelmann
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA
- Costin Leu
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Department of Clinical and Experimental Epilepsy, Institute of Neurology, University College London, London, UK.
- Dennis Lal
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and M.I.T., Cambridge, MA, USA; Cologne Center for Genomics (CCG), University of Cologne, Cologne, DE, USA
40
Zheng Y, Shang X. SVcnn: an accurate deep learning-based method for detecting structural variation based on long-read data. BMC Bioinformatics 2023; 24:213. [PMID: 37221476 DOI: 10.1186/s12859-023-05324-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/22/2023] [Accepted: 05/06/2023] [Indexed: 05/25/2023]
Abstract
BACKGROUND Structural variations (SVs) refer to variations in an organism's chromosome structure that exceed a length of 50 base pairs. They play a significant role in genetic diseases and evolutionary mechanisms. While long-read sequencing technology has led to the development of numerous SV caller methods, their performance results have been suboptimal. Researchers have observed that current SV callers often miss true SVs and generate many false SVs, especially in repetitive regions and areas with multi-allelic SVs. These errors are due to the messy alignments of long-read data, which are affected by their high error rate. Therefore, there is a need for a more accurate SV caller method. RESULT We propose a new method-SVcnn, a more accurate deep learning-based method for detecting SVs by using long-read sequencing data. We run SVcnn and other SV callers in three real datasets and find that SVcnn improves the F1-score by 2-8% compared with the second-best method when the read depth is greater than 5×. More importantly, SVcnn has better performance for detecting multi-allelic SVs. CONCLUSIONS SVcnn is an accurate deep learning-based method to detect SVs. The program is available at https://github.com/nwpuzhengyan/SVcnn .
Affiliation(s)
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
41
Lin J, Jia P, Wang S, Kosters W, Ye K. Comparison and benchmark of structural variants detected from long read and long-read assembly. Brief Bioinform 2023:7169138. [PMID: 37200087 DOI: 10.1093/bib/bbad188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/20/2023]
Abstract
Structural variant (SV) detection is essential for genomic studies, and long-read sequencing technologies have advanced our capacity to detect SVs directly from read or de novo assembly, also known as read-based and assembly-based strategy. However, to date, no independent studies have compared and benchmarked the two strategies. Here, on the basis of SVs detected by 20 read-based and eight assembly-based detection pipelines from six datasets of HG002 genome, we investigated the factors that influence the two strategies and assessed their performance with well-curated SVs. We found that up to 80% of the SVs could be detected by both strategies among different long-read datasets, whereas variant type, size, and breakpoint detected by read-based strategy were greatly affected by aligners. For the high-confident insertions and deletions at non-tandem repeat regions, a remarkable subset of them (82% in assembly-based calls and 93% in read-based calls), accounting for around 4000 SVs, could be captured by both reads and assemblies. However, discordance between two strategies was largely caused by complex SVs and inversions, which resulted from inconsistent alignment of reads and assemblies at these loci. Finally, benchmarking with SVs at medically relevant genes, the recall of read-based strategy reached 77% on 5X coverage data, whereas assembly-based strategy required 20X coverage data to achieve similar performance. Therefore, integrating SVs from read and assembly is suggested for general-purpose detection because of inconsistently detected complex SVs and inversions, whereas assembly-based strategy is optional for applications with limited resources.
Affiliation(s)
- Jiadong Lin
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an 710061, China
  - Leiden Institute of Advanced Computer Science, Faculty of Science, Leiden University, Leiden 2311 EZ, The Netherlands
- Peng Jia
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Songbo Wang
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Walter Kosters
  - Leiden Institute of Advanced Computer Science, Faculty of Science, Leiden University, Leiden 2311 EZ, The Netherlands
- Kai Ye
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an 710061, China
  - The School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  - Faculty of Science, Leiden University, Leiden 2311, The Netherlands
42
Rozowsky J, Gao J, Borsari B, Yang YT, Galeev T, Gürsoy G, Epstein CB, Xiong K, Xu J, Li T, Liu J, Yu K, Berthel A, Chen Z, Navarro F, Sun MS, Wright J, Chang J, Cameron CJF, Shoresh N, Gaskell E, Drenkow J, Adrian J, Aganezov S, Aguet F, Balderrama-Gutierrez G, Banskota S, Corona GB, Chee S, Chhetri SB, Cortez Martins GC, Danyko C, Davis CA, Farid D, Farrell NP, Gabdank I, Gofin Y, Gorkin DU, Gu M, Hecht V, Hitz BC, Issner R, Jiang Y, Kirsche M, Kong X, Lam BR, Li S, Li B, Li X, Lin KZ, Luo R, Mackiewicz M, Meng R, Moore JE, Mudge J, Nelson N, Nusbaum C, Popov I, Pratt HE, Qiu Y, Ramakrishnan S, Raymond J, Salichos L, Scavelli A, Schreiber JM, Sedlazeck FJ, See LH, Sherman RM, Shi X, Shi M, Sloan CA, Strattan JS, Tan Z, Tanaka FY, Vlasova A, Wang J, Werner J, Williams B, Xu M, Yan C, Yu L, Zaleski C, Zhang J, Ardlie K, Cherry JM, Mendenhall EM, Noble WS, Weng Z, Levine ME, Dobin A, Wold B, Mortazavi A, Ren B, Gillis J, Myers RM, Snyder MP, Choudhary J, Milosavljevic A, Schatz MC, Bernstein BE, Guigó R, Gingeras TR, Gerstein M. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell 2023; 186:1493-1511.e40. [PMID: 37001506 PMCID: PMC10074325 DOI: 10.1016/j.cell.2023.02.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 04/03/2022] [Revised: 10/16/2022] [Accepted: 02/10/2023] [Indexed: 04/03/2023]
Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
Affiliation(s)
- Joel Rozowsky
  - Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jiahao Gao
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Beatrice Borsari
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Yucheng T Yang
  - Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Timur Galeev
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Gamze Gürsoy
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Kun Xiong
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jinrui Xu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Tianxiao Li
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jason Liu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Keyang Yu
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Ana Berthel
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Zhanlin Chen
  - Department of Statistics and Data Science, Yale University, New Haven, CT, USA
- Fabio Navarro
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Maxwell S Sun
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Justin Chang
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Christopher J F Cameron
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Noam Shoresh
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jorg Drenkow
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jessika Adrian
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Sergey Aganezov
  - Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
- Sora Chee
  - Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Surya B Chhetri
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Gabriel Conte Cortez Martins
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Cassidy Danyko
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Carrie A Davis
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Daniel Farid
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Idan Gabdank
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Yoel Gofin
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- David U Gorkin
  - Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Mengting Gu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Vivian Hecht
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Benjamin C Hitz
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Robbyn Issner
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yunzhe Jiang
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Melanie Kirsche
  - Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
- Xiangmeng Kong
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Bonita R Lam
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Shantao Li
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Bian Li
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Xiqi Li
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Khine Zin Lin
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Ruibang Luo
  - Department of Computer Science, The University of Hong Kong, Hong Kong, CHN
- Mark Mackiewicz
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Ran Meng
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jill E Moore
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
- Jonathan Mudge
  - European Bioinformatics Institute, Cambridge, Cambridgeshire, GB
- Chad Nusbaum
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ioann Popov
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Henry E Pratt
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
- Yunjiang Qiu
  - Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Srividya Ramakrishnan
  - Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
- Joe Raymond
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Leonidas Salichos
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Biological and Chemical Sciences, New York Institute of Technology, Old Westbury, NY, USA
- Alexandra Scavelli
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jacob M Schreiber
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Fritz J Sedlazeck
  - Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Lei Hoon See
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Rachel M Sherman
  - Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
- Xu Shi
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Minyi Shi
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Cricket Alicia Sloan
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- J Seth Strattan
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Zhen Tan
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Forrest Y Tanaka
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Anna Vlasova
  - Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Comparative Genomics Group, Life Science Programme, Barcelona Supercomputing Centre, Barcelona, Spain; Institute of Research in Biomedicine, Barcelona, Spain
- Jun Wang
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jonathan Werner
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Brian Williams
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Min Xu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Chengfei Yan
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Lu Yu
  - Institute of Cancer Research, London, UK
- Christopher Zaleski
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jing Zhang
  - Department of Computer Science, University of California, Irvine, Irvine, CA, USA
- J Michael Cherry
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
- Morgan E Levine
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Alexander Dobin
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Barbara Wold
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Ali Mortazavi
  - Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
- Bing Ren
  - Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Jesse Gillis
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Department of Physiology, University of Toronto, Toronto, ON, Canada
- Richard M Myers
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Michael P Snyder
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Michael C Schatz
  - Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
- Bradley E Bernstein
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
- Roderic Guigó
  - Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
- Thomas R Gingeras
  - Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
- Mark Gerstein
  - Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Statistics and Data Science, Yale University, New Haven, CT, USA; Department of Computer Science, Yale University, New Haven, CT, USA.
|
43
|
Payne ZL, Penny GM, Turner TN, Dutcher SK. A gap-free genome assembly of Chlamydomonas reinhardtii and detection of translocations induced by CRISPR-mediated mutagenesis. Plant Communications 2023; 4:100493. [PMID: 36397679 PMCID: PMC10030371 DOI: 10.1016/j.xplc.2022.100493] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/27/2022] [Revised: 10/26/2022] [Accepted: 11/15/2022] [Indexed: 05/04/2023]
Abstract
Genomic assemblies of the unicellular green alga Chlamydomonas reinhardtii have provided important resources for researchers. However, assembly errors, large gaps, and unplaced scaffolds as well as strain-specific variants currently impede many types of analysis. By combining PacBio HiFi and Oxford Nanopore long-read technologies, we generated a de novo genome assembly for strain CC-5816, derived from crosses of strains CC-125 and CC-124. Multiple methods of evaluating genome completeness and base-pair error rate suggest that the final telomere-to-telomere assembly is highly accurate. The CC-5816 assembly enabled previously difficult analyses that include characterization of the 17 centromeres, rDNA arrays on three chromosomes, and 56 insertions of organellar DNA into the nuclear genome. Using Nanopore sequencing, we identified sites of cytosine (CpG) methylation, which are enriched at centromeres. We analyzed CRISPR-Cas9 insertional mutants in the PF23 gene. Two of the three alleles produced progeny that displayed patterns of meiotic inviability that suggested the presence of a chromosomal aberration. Mapping Nanopore reads from pf23-2 and pf23-3 onto the CC-5816 genome showed that these two strains each carry a translocation that was initiated at the PF23 gene locus on chromosome 11 and joined with chromosomes 5 or 3, respectively. The translocations were verified by demonstrating linkage between loci on the two translocated chromosomes in meiotic progeny. The three pf23 alleles display the expected short-cilia phenotype, and immunoblotting showed that pf23-2 lacks the PF23 protein. Our CC-5816 genome assembly will undoubtedly provide an important tool for the Chlamydomonas research community.
Affiliation(s)
- Zachary L Payne
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Gervette M Penny
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Tychele N Turner
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Susan K Dutcher
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA.
|
44
|
Gao R, Luo J, Ding H, Zhai H. INSnet: a method for detecting insertions based on deep learning network. BMC Bioinformatics 2023; 24:80. [PMID: 36879189 PMCID: PMC9990265 DOI: 10.1186/s12859-023-05216-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Accepted: 03/01/2023] [Indexed: 03/08/2023] Open
Abstract
BACKGROUND: Many studies have shown that structural variations (SVs) strongly impact human disease. As a common type of SV, insertions are usually associated with genetic diseases. Therefore, accurately detecting insertions is of great significance. Although many methods for detecting insertions have been proposed, these methods often generate some errors and miss some variants. Hence, accurately detecting insertions remains a challenging task. RESULTS: In this paper, we propose a method named INSnet to detect insertions using a deep learning network. First, INSnet divides the reference genome into continuous sub-regions and takes five features for each locus through alignments between long reads and the reference genome. Next, INSnet uses a depthwise separable convolutional network. The convolution operation extracts informative features through spatial information and channel information. INSnet uses two attention mechanisms, the convolutional block attention module (CBAM) and efficient channel attention (ECA), to extract key alignment features in each sub-region. In order to capture the relationship between adjacent sub-regions, INSnet uses a gated recurrent unit (GRU) network to further extract more important SV signatures. After predicting whether a sub-region contains an insertion through the previous steps, INSnet determines the precise site and length of the insertion. The source code is available from GitHub at https://github.com/eioyuou/INSnet. CONCLUSION: Experimental results show that INSnet can achieve better performance than other methods in terms of F1 score on real datasets.
Affiliation(s)
- Runtian Gao
- School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Junwei Luo
- School of Software, Henan Polytechnic University, Jiaozuo, 454003, China.
- Hongyu Ding
- School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
- Haixia Zhai
- School of Software, Henan Polytechnic University, Jiaozuo, 454003, China
|
45
|
Zheng Y, Shang X, Sung WK. SVsearcher: A more accurate structural variation detection method in long read data. Comput Biol Med 2023; 158:106843. [PMID: 37019014 DOI: 10.1016/j.compbiomed.2023.106843] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/24/2023] [Revised: 03/03/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023]
Abstract
Structural variations (SVs) are genomic rearrangements (such as deletions, insertions, and inversions) larger than 50 bp. They play important roles in genetic diseases and evolutionary mechanisms. Owing to advances in long-read sequencing (i.e., PacBio and Oxford Nanopore (ONT) long-read sequencing), we can call SVs accurately. However, for ONT long reads, we observe that existing long-read SV callers miss many true SVs and call many false SVs in repetitive regions and in regions with multi-allelic SVs. These errors are caused by messy alignments of ONT reads due to their high error rate. Hence, we propose a novel method, SVsearcher, to solve these issues. We run SVsearcher and other callers on three real datasets and find that SVsearcher improves the F1 score by approximately 10% for high-coverage (50×) datasets and by more than 25% for low-coverage (10×) datasets. More importantly, SVsearcher can identify 81.7%-91.8% of multi-allelic SVs, while existing methods identify only 13.2% (Sniffles) to 54.0% (nanoSV) of them. SVsearcher is available at https://github.com/kensung-lab/SVsearcher.
Affiliation(s)
- Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China.
- Wing-Kin Sung
- Department of Chemical Pathology, The Chinese University of Hong Kong, Hong Kong, China; Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China; Laboratory of Computational Genomics, Li Ka Shing Institute of Health Science, The Chinese University of Hong Kong, Hong Kong, China.
|
46
|
Lu W, Zhang T, Zhang Q, Zhang N, Jia L, Ma S, Xia Q. FibH Gene Complete Sequences (FibHome) Revealed Silkworm Pedigree. Insects 2023; 14:244. [PMID: 36975929 PMCID: PMC10055898 DOI: 10.3390/insects14030244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 02/23/2023] [Accepted: 02/24/2023] [Indexed: 06/18/2023]
Abstract
The highly repetitive and variable fibroin heavy chain (FibH) gene can be used for silkworm identification; however, only a few complete FibH sequences are known. In this study, we extracted and examined 264 FibH gene complete sequences (FibHome) from a high-resolution silkworm pan-genome. The average FibH lengths of the wild silkworm, local, and improved strains were 19,698 bp, 16,427 bp, and 15,795 bp, respectively. All FibH sequences had conserved 5' and 3' terminal non-repetitive sequences (5' and 3' TNR, 99.74% and 99.99% identity, respectively) and a variable repetitive core (RC). The RCs differed greatly, but they all shared the same motif. During domestication or breeding, the FibH gene mutated with hexanucleotide (GGTGCT) as the core unit. Numerous variations existed that were not unique to wild and domesticated silkworms. However, the transcription factor binding sites, such as that of fibroin modulator-binding protein, were highly conserved and had 100% identity in the FibH gene's intron and upstream sequences. The local and improved strains with the same FibH gene were divided into four families using this gene as a marker. Family I contained a maximum of 62 strains with the optional FibH (Opti-FibH, 15,960 bp) gene. This study provides new insights into FibH variations and silkworm breeding.
Affiliation(s)
- Wei Lu
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
- Tong Zhang
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
- Quan Zhang
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
- Na Zhang
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
- Ling Jia
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
- Sanyuan Ma
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
- Qingyou Xia
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City & Southwest University, Chongqing 400715, China
|
47
|
Malekshoar M, Azimi SA, Kaki A, Mousazadeh L, Motaei J, Vatankhah M. CRISPR-Cas9 Targeted Enrichment and Next-Generation Sequencing for Mutation Detection. J Mol Diagn 2023; 25:249-262. [PMID: 36841425 DOI: 10.1016/j.jmoldx.2023.01.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 01/08/2023] [Accepted: 01/27/2023] [Indexed: 02/27/2023] Open
Abstract
Despite the rapid application of next-generation sequencing (NGS) technologies, target sequencing in regions of the genome is often required to diagnose many genetic diseases. Target enrichment can be an effective factor in reducing the cost of sequencing and the duration of sequencing. Recently, several clustered regularly interspaced short palindromic repeats (CRISPR)-based methods (amplification-free sequencing) have been developed for target enrichment in combination with one of the NGS platforms. CRISPR-based target enrichment strategies act as an auxiliary tool to improve NGS analytical performance, thereby indirectly facilitating nucleic acid detection. The direct DNA cleavage approach by CRISPR-Cas at genome-specific sites enhances the possibility of separating native large fragments from disease-related genomic regions. CRISPR-Cas can isolate the target region without any amplification; subsequently, long-read sequencing technologies were also implemented. These methods, as promising tools, have the ability to assess genetic and epigenetic composition for clinical application and treatment responses in cancer precision medicine. By modifying CRISPR-based enrichment protocols, it was possible to identify different types of mutations, including structural variants, short tandem repeats, fusion genes, and mobile elements. Cas9 can specifically eliminate wild-type sequences, and it also enables the enrichment and detection of small amounts of tumor DNA fragments among the highly heterogeneous fragments of wild-type DNA.
Affiliation(s)
- Mehrdad Malekshoar
- Anesthesiology, Critical Care and Pain Management Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
- Sajad Ataei Azimi
- Department of Hematology-Oncology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Arastoo Kaki
- Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Leila Mousazadeh
- Department of Medical Biotechnology, School of Advanced Medical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran
- Jamshid Motaei
- Department of Medical Genetics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
- Majid Vatankhah
- Anesthesiology, Critical Care and Pain Management Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran.
|
48
|
Chowdhury T, Cressiot B, Parisi C, Smolyakov G, Thiébot B, Trichet L, Fernandes FM, Pelta J, Manivet P. Circulating Tumor Cells in Cancer Diagnostics and Prognostics by Single-Molecule and Single-Cell Characterization. ACS Sens 2023; 8:406-426. [PMID: 36696289 DOI: 10.1021/acssensors.2c02308] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 01/26/2023]
Abstract
Circulating tumor cells (CTCs) represent an interesting source of biomarkers for diagnosis, prognosis, and the prediction of cancer recurrence, yet while they are extensively studied in oncobiology research, their diagnostic utility has not yet been demonstrated and validated. Their scarcity in human biological fluids impedes the identification of dangerous CTC subpopulations that may promote metastatic dissemination. In this Perspective, we discuss promising techniques that could be used for the identification of these metastatic cells. We first describe methods for isolating patient-derived CTCs and then the use of 3D biomimetic matrixes in their amplification and analysis, followed by methods for further CTC analyses at the single-cell and single-molecule levels. Finally, we discuss how the elucidation of mechanical and morphological properties using techniques such as atomic force microscopy and molecular biomarker identification using nanopore-based detection could be combined in the future to provide patients and their healthcare providers with a more accurate diagnosis.
Affiliation(s)
- Tafsir Chowdhury
- Centre de Ressources Biologiques Biobank Lariboisière (BB-0033-00064), DMU BioGem, AP-HP, 75010 Paris, France
- Cleo Parisi
- Centre de Ressources Biologiques Biobank Lariboisière (BB-0033-00064), DMU BioGem, AP-HP, 75010 Paris, France; Sorbonne Université, UMR 7574, Laboratoire de Chimie de la Matière Condensée de Paris, 75005 Paris, France
- Georges Smolyakov
- Centre de Ressources Biologiques Biobank Lariboisière (BB-0033-00064), DMU BioGem, AP-HP, 75010 Paris, France
- Léa Trichet
- Sorbonne Université, UMR 7574, Laboratoire de Chimie de la Matière Condensée de Paris, 75005 Paris, France
- Francisco M Fernandes
- Sorbonne Université, UMR 7574, Laboratoire de Chimie de la Matière Condensée de Paris, 75005 Paris, France
- Juan Pelta
- CY Cergy Paris Université, CNRS, LAMBE, 95000 Cergy, France; Université Paris-Saclay, Université d'Evry, CNRS, LAMBE, 91190 Evry, France
- Philippe Manivet
- Centre de Ressources Biologiques Biobank Lariboisière (BB-0033-00064), DMU BioGem, AP-HP, 75010 Paris, France; Université Paris Cité, Inserm, NeuroDiderot, F-75019 Paris, France
|
49
|
Chen P, Sun Z, Wang J, Liu X, Bai Y, Chen J, Liu A, Qiao F, Chen Y, Yuan C, Sha J, Zhang J, Xu LQ, Li J. Portable nanopore-sequencing technology: Trends in development and applications. Front Microbiol 2023; 14:1043967. [PMID: 36819021 PMCID: PMC9929578 DOI: 10.3389/fmicb.2023.1043967] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/14/2022] [Accepted: 01/03/2023] [Indexed: 02/04/2023] Open
Abstract
Sequencing technology is the most commonly used technology in molecular biology research and an essential pillar for the development and applications of molecular biology. Since 1977, when the first generation of sequencing technology opened the door to interpreting the genetic code, sequencing technology has been developing for three generations. It has applications in all aspects of life and scientific research, such as disease diagnosis, drug target discovery, pathological research, species protection, and SARS-CoV-2 detection. However, the first- and second-generation sequencing technology relied on fluorescence detection systems and DNA polymerization enzyme systems, which increased the cost of sequencing technology and limited its scope of applications. The third-generation sequencing technology performs PCR-free and single-molecule sequencing, but it still depends on the fluorescence detection device. To break through these limitations, researchers have made arduous efforts to develop a new advanced portable sequencing technology represented by nanopore sequencing. Nanopore technology has the advantages of small size and convenient portability, independent of biochemical reagents, and direct reading using physical methods. This paper reviews the research and development process of nanopore sequencing technology (NST) from the laboratory to commercially viable tools; discusses the main types of nanopore sequencing technologies and their various applications in solving a wide range of real-world problems. In addition, the paper collates the analysis tools necessary for performing different processing tasks in nanopore sequencing. Finally, we highlight the challenges of NST and its future research and application directions.
Affiliation(s)
- Pin Chen
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
- Zepeng Sun
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China
- Jiawei Wang
- School of Computer Science and Technology, Southeast University, Nanjing, China
- Xinlong Liu
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China
- Yun Bai
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
- Jiang Chen
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
- Anna Liu
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
- Feng Qiao
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China
- Yang Chen
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
- Chenyan Yuan
- Clinical Laboratory, Southeast University Zhongda Hospital, Nanjing, China
- Jingjie Sha
- School of Mechanical Engineering, Southeast University, Nanjing, China
- Jinghui Zhang
- School of Computer Science and Technology, Southeast University, Nanjing, China
- Li-Qun Xu
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China (corresponding author)
- Jian Li
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China (corresponding author)
|
50
|
Zhuang J, Chen C, Fu W, Wang Y, Zhuang Q, Lu Y, Xie T, Xu R, Zeng S, Jiang Y, Xie Y, Wang G. Third-Generation Sequencing as a New Comprehensive Technology for Identifying Rare α- and β-Globin Gene Variants in Thalassemia Alleles in the Chinese Population. Arch Pathol Lab Med 2023; 147:208-214. [PMID: 35639603 DOI: 10.5858/arpa.2021-0510-oa] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Accepted: 12/20/2021] [Indexed: 02/05/2023]
Abstract
CONTEXT.— Identification of rare thalassemia variants requires a combination of multiple diagnostic technologies. OBJECTIVE.— To investigate a new approach of comprehensive analysis of thalassemia alleles based on third-generation sequencing (TGS) for identification of α- and β-globin gene variants. DESIGN.— Enrolled in this study were 70 suspected carriers of rare thalassemia variants. Routine gap-polymerase chain reaction and DNA sequencing were used to detect rare thalassemia variants, and TGS technology was performed to identify α- and β-globin gene variants. RESULTS.— Twenty-three cases that carried rare variants in α- and β-globin genes were identified by the routine detection methods. TGS technology yielded a 7.14% (5 of 70) increment of rare α- and β-globin gene variants as compared with the routine methods. Among them, the rare deletional genotype of -THAI was the most common variant. In addition, rare variants of CD15 (G>A) (HBA2:c.46G>A), CD117/118(+TCA) (HBA1:c.354_355insTCA), and β-thalassemia 3.5-kilobase gene deletion were first identified in Fujian Province, China; to the best of our knowledge, this is the second report in the Chinese population. Moreover, HBA1:c.-24C>G, IVS-II-55 (G>T) (HBA1:c.300+55G>T) and hemoglobin (Hb) Maranon (HBA2:c.94A>G) were first identified in the Chinese population. We also identified rare Hb variants of HbC, HbG-Honolulu, Hb Miyashiro, and HbG-Coushatta in this study. CONCLUSIONS.— TGS technology can effectively and accurately detect deletional and nondeletional thalassemia variants simultaneously in one experiment. Our study also demonstrated the application value of TGS-based comprehensive analysis of thalassemia alleles in the detection of rare thalassemia gene variants.
Affiliation(s)
- Jianlong Zhuang
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China
- Chunnuan Chen
- From the Department of Neurology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian Province, China (Chen)
- Wanyu Fu
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China
- Yuanbai Wang
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China
- Qianmei Zhuang
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China
- Yulin Lu
- From the Third-Generation Sequencing Business Unit, Berry Genomics Corporation, Beijing, China (Lu, T. Xie, Xu)
- Tiantian Xie
- From the Third-Generation Sequencing Business Unit, Berry Genomics Corporation, Beijing, China (Lu, T. Xie, Xu); From the Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China (Y. Xie)
- Ruofan Xu
- From the Third-Generation Sequencing Business Unit, Berry Genomics Corporation, Beijing, China (Lu, T. Xie, Xu)
- Shuhong Zeng
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China
- Yuying Jiang
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China; Authors Jiang, Y. Xie and G. Wang are co-lead authors
- Yingjun Xie
- From the Third-Generation Sequencing Business Unit, Berry Genomics Corporation, Beijing, China (Lu, T. Xie, Xu); From the Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China (Y. Xie); Authors Jiang, Y. Xie and G. Wang are co-lead authors
- Gaoxiong Wang
- From the Prenatal Diagnosis Center (J. Zhuang, Fu, Y. Wang, Q. Zhuang, Zeng, Jiang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China; From the Department of Surgery (G. Wang), Quanzhou Women's and Children's Hospital, Quanzhou, Fujian Province, China; Authors Jiang, Y. Xie and G. Wang are co-lead authors
|