1
Liu Z, Xie Z, Li M. Comprehensive and deep evaluation of structural variation detection pipelines with third-generation sequencing data. Genome Biol 2024; 25:188. [PMID: 39010145 PMCID: PMC11247875 DOI: 10.1186/s13059-024-03324-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 06/26/2024] [Indexed: 07/17/2024]
Abstract
BACKGROUND Structural variation (SV) detection methods using third-generation sequencing data are widely employed, yet accurately detecting SVs remains challenging. Different methods often yield inconsistent results for certain SV types, complicating tool selection and revealing biases in detection. RESULTS This study comprehensively evaluates 53 SV detection pipelines using simulated and real data from PacBio (CLR: Continuous Long Read, CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms. We assess their performance in detecting various sizes and types of SVs, breakpoint biases, and genotyping accuracy with various sequencing depths. Notably, pipelines such as Minimap2-cuteSV2, NGMLR-SVIM, PBMM2-pbsv, Winnowmap-Sniffles2, and Winnowmap-SVision exhibit comparatively higher recall and precision. Our findings also show that combining multiple pipelines with the same aligner, like pbmm2 or winnowmap, can significantly enhance performance. The individual pipelines' detailed ranking and performance metrics can be viewed in a dynamic table: http://pmglab.top/SVPipelinesRanking. CONCLUSIONS This study comprehensively characterizes the strengths and weaknesses of numerous pipelines, providing valuable insights that can improve SV detection in third-generation sequencing data and inform SV annotation and function prediction.
Affiliation(s)
- Zhi Liu
- Program in Bioinformatics, Zhongshan School of Medicine, The Fifth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
- Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education, Guangzhou, China
- Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Miaoxin Li
- Program in Bioinformatics, Zhongshan School of Medicine, The Fifth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China.
- Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education, Guangzhou, China.
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China.
- Department of Psychiatry, The University of Hong Kong, Hong Kong, SAR, China.
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-Sen University, Zhuhai, China.
2
Yu Y, Gao R, Luo J. LcDel: deletion variation detection based on clustering and long reads. Front Genet 2024; 15:1404415. [PMID: 38798694 PMCID: PMC11116628 DOI: 10.3389/fgene.2024.1404415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 04/25/2024] [Indexed: 05/29/2024]
Abstract
Motivation: Genomic structural variation refers to chromosomal-level variations such as genome rearrangement or insertion/deletion, which typically involve larger DNA fragments than single nucleotide variations. Deletion is a common type of structural variant in the genome and may lead to many diseases, so detecting deletions can help to gain insights into the pathogenesis of diseases and provide accurate information for disease diagnosis, treatment, and prevention. Many tools exist for deletion variant detection, but they are still inadequate in some aspects, and most of them ignore the presence of chimeric variants in clustering, resulting in less precise clustering results. Results: In this paper, we present LcDel, which detects deletion variation based on clustering and long reads. LcDel first finds the candidate deletion sites and then performs the first clustering step using one of two clustering methods (sliding window-based or coverage-based), chosen according to the length of the deletion. LcDel then performs a second clustering step using hierarchical clustering to determine the location and length of the deletion. LcDel is benchmarked against other structural variation detection tools on multiple datasets, and the results show that LcDel has better detection performance for deletions. The source code is available at https://github.com/cyq1314woaini/LcDel.
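As a rough illustration of the two-stage clustering idea described in the abstract above, the following Python sketch groups deletion signatures first by position and then by length. It is not LcDel's implementation: the function names, the `(start, length)` tuple format, the thresholds, and the use of a simple gap-based single-linkage cut in place of LcDel's actual sliding-window, coverage-based, and hierarchical steps are all assumptions made for illustration.

```python
from statistics import median_low
from typing import List, Tuple

# Hypothetical input format: one (start position, deletion length) pair per read signature.
Signature = Tuple[int, int]


def window_clusters(sigs: List[Signature], window: int = 200) -> List[List[Signature]]:
    """Stage 1 (assumed): chain position-sorted signatures whose starts
    lie within `window` bp of the previous signature."""
    clusters: List[List[Signature]] = []
    for sig in sorted(sigs):
        if clusters and sig[0] - clusters[-1][-1][0] <= window:
            clusters[-1].append(sig)
        else:
            clusters.append([sig])
    return clusters


def split_by_length(cluster: List[Signature], max_gap: int = 100) -> List[List[Signature]]:
    """Stage 2 (assumed): single-linkage clustering on deletion length,
    cut at `max_gap`. In one dimension this reduces to splitting the
    sorted lengths wherever consecutive values differ by more than `max_gap`."""
    ordered = sorted(cluster, key=lambda s: s[1])
    groups = [[ordered[0]]]
    for sig in ordered[1:]:
        if sig[1] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(sig)
        else:
            groups.append([sig])
    return groups


def call_deletions(sigs: List[Signature], window: int = 200,
                   max_gap: int = 100, min_support: int = 2) -> List[Signature]:
    """Report one (start, length) call per sufficiently supported group,
    using the low median of the supporting signatures."""
    calls = []
    for cluster in window_clusters(sigs, window):
        for group in split_by_length(cluster, max_gap):
            if len(group) >= min_support:
                start = median_low([s[0] for s in group])
                length = median_low([s[1] for s in group])
                calls.append((start, length))
    return sorted(calls)


# Toy data: two overlapping deletions of different length near 1 kb, plus one unsupported signature.
example = [(1000, 300), (1010, 310), (1005, 295), (1020, 900), (1015, 910), (5000, 60)]
result = call_deletions(example)
```

In this toy input, the second stage is what separates the roughly 300 bp and roughly 900 bp deletions that share nearly the same start position; the single signature at position 5000 is dropped for lack of support.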
Affiliation(s)
- Junwei Luo
- School of Software, Henan Polytechnic University, Jiaozuo, China
3
Chen Q, Wu B, Li C, Ding L, Huang S, Wang J, Zhao J. Deciphering male influence in gynogenetic Pengze crucian carp (Carassius auratus var. pengsenensis): insights from Nanopore sequencing of structural variations. Front Genet 2024; 15:1392110. [PMID: 38784042 PMCID: PMC11111978 DOI: 10.3389/fgene.2024.1392110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 04/11/2024] [Indexed: 05/25/2024]
Abstract
In this study, we investigate gynogenetic reproduction in Pengze Crucian Carp (Carassius auratus var. pengsenensis) using third-generation Nanopore sequencing to uncover structural variations (SVs) in offspring. Our objective was to understand the role of male genetic material in gynogenesis by examining the genomes of both parents and their offspring. We discovered a notable number of male-specific structural variations (MSSVs): 1,195 to 1,709 MSSVs in homologous offspring, accounting for approximately 0.52%-0.60% of their detected SVs, and 236 to 350 MSSVs in heterologous offspring, making up about 0.10%-0.13%. These results highlight the significant influence of male genetic material on the genetic composition of offspring, particularly in homologous pairs, challenging the traditional view of asexual reproduction. The gene annotation of MSSVs revealed their presence in critical gene regions, indicating potential functional impacts. Specifically, we found 5 MSSVs in the exonic regions of protein-coding genes in homologous offspring, suggesting possible direct effects on protein structure and function. Validation of an MSSV in the exonic region of the polyunsaturated fatty acid 5-lipoxygenase gene confirmed male genetic material transmission in some offspring. This study underscores the importance of further research on the genetic diversity and gynogenesis mechanisms, providing valuable insights for reproductive biology, aquaculture, and fostering innovation in biological research and aquaculture practices.
Affiliation(s)
- Qianhui Chen
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Biyu Wu
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Chao Li
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Liyun Ding
- Jiangxi Fisheries Research Institute, Nanchang, China
- Shiting Huang
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Junjie Wang
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
- Jun Zhao
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, China
4
Fleming A, Galey M, Briggs L, Edwards M, Hogg C, John S, Wilkinson S, Quinn E, Rai R, Burgoyne T, Rogers A, Patel MP, Griffin P, Muller S, Carr SB, Loebinger MR, Lucas JS, Shah A, Jose R, Mitchison HM, Shoemark A, Miller DE, Morris-Rosendahl DJ. Combined approaches, including long-read sequencing, address the diagnostic challenge of HYDIN in primary ciliary dyskinesia. Eur J Hum Genet 2024:10.1038/s41431-024-01599-7. [PMID: 38605126 DOI: 10.1038/s41431-024-01599-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 03/08/2024] [Accepted: 03/18/2024] [Indexed: 04/13/2024]
Abstract
Primary ciliary dyskinesia (PCD), a disorder of the motile cilia, is now recognised as an underdiagnosed cause of bronchiectasis. Accurate PCD diagnosis comprises clinical assessment, analysis of cilia and the identification of biallelic variants in one of 50 known PCD-related genes, including HYDIN. HYDIN-related PCD is underdiagnosed due to the presence of a pseudogene, HYDIN2, with 98% sequence homology to HYDIN. This presents a significant challenge for Short-Read Next Generation Sequencing (SR-NGS) and analysis, and many diagnostic PCD gene panels do not include HYDIN. We have used a combined approach of SR-NGS with bioinformatic masking of HYDIN2, and state-of-the-art long-read Nanopore sequencing (LR-NGS), together with analysis of respiratory cilia including transmission electron microscopy and immunofluorescence to address the underdiagnosis of HYDIN as a cause of PCD. Bioinformatic masking of HYDIN2 after SR-NGS facilitated the detection of biallelic HYDIN variants in 15 of 437 families, but compromised the detection of copy number variants. Supplementing testing with LR-NGS detected HYDIN deletions in 2 families, where SR-NGS had detected a single heterozygous HYDIN variant. LR-NGS was also able to confirm true homozygosity in 2 families when parental testing was not possible. Utilising a combined genomic diagnostic approach, biallelic HYDIN variants were detected in 17 families from 242 genetically confirmed PCD cases, comprising 7% of our PCD cohort. This represents the largest reported HYDIN cohort to date and highlights previous underdiagnosis of HYDIN-associated PCD. Moreover, this provides further evidence for the utility of LR-NGS in diagnostic testing, particularly for regions of high genomic complexity.
Affiliation(s)
- Andrew Fleming
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Miranda Galey
- Division of Genetic Medicine, Department of Pediatrics, University of Washington and Seattle Children's Hospital, Seattle, WA, USA
- Department of Laboratory Medicine and Pathology, University of Washington and Seattle Children's Hospital, Seattle, WA, 98105, USA
- Lizi Briggs
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Matthew Edwards
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Claire Hogg
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- National Heart and Lung Institute, Imperial College London, London, SW3 6LY, UK
- Shibu John
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Sam Wilkinson
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Ellie Quinn
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Ranjit Rai
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Tom Burgoyne
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Genetics and Genomic Medicine Department, University College London, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
- Andy Rogers
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Mitali P Patel
- Genetics and Genomic Medicine Department, University College London, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
- MRC Prion Unit at UCL, Institute of Prion Diseases, UCL, London, W1W 7FF, UK
- Paul Griffin
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Steven Muller
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Siobhan B Carr
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- National Heart and Lung Institute, Imperial College London, London, SW3 6LY, UK
- Michael R Loebinger
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- National Heart and Lung Institute, Imperial College London, London, SW3 6LY, UK
- Jane S Lucas
- Primary Ciliary Dyskinesia Centre, University Hospital Southampton NHS Foundation Trust, Southampton, SO16 6YD, UK
- Clinical and Experimental Sciences Academic Unit, University of Southampton Faculty of Medicine, Southampton, SO16 6YD, UK
- Anand Shah
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- MRC Centre of Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, W2 1PG, UK
- Ricardo Jose
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Hannah M Mitchison
- Genetics and Genomic Medicine Department, University College London, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
- MRC Prion Unit at UCL, Institute of Prion Diseases, UCL, London, W1W 7FF, UK
- Amelia Shoemark
- Primary Ciliary Dyskinesia Centre, Royal Brompton and Harefield Clinical Group, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK
- Respiratory Research Group, Molecular and Cellular Medicine, University of Dundee, Dundee, DD1 9SY, UK
- Danny E Miller
- Division of Genetic Medicine, Department of Pediatrics, University of Washington and Seattle Children's Hospital, Seattle, WA, USA
- Department of Laboratory Medicine and Pathology, University of Washington and Seattle Children's Hospital, Seattle, WA, 98105, USA
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, 98195, USA
- Deborah J Morris-Rosendahl
- Clinical Genetics and Genomics Laboratory, Royal Brompton and Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, SW3 6NP, UK.
- National Heart and Lung Institute, Imperial College London, London, SW3 6LY, UK.
5
Reis ALM, Rapadas M, Hammond JM, Gamaarachchi H, Stevanovski I, Ayuputeri Kumaheri M, Chintalaphani SR, Dissanayake DSB, Siggs OM, Hewitt AW, Llamas B, Brown A, Baynam G, Mann GJ, McMorran BJ, Easteal S, Hermes A, Jenkins MR, Patel HR, Deveson IW. The landscape of genomic structural variation in Indigenous Australians. Nature 2023; 624:602-610. [PMID: 38093003 PMCID: PMC10733147 DOI: 10.1038/s41586-023-06842-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/16/2023] [Accepted: 11/07/2023] [Indexed: 12/20/2023]
Abstract
Indigenous Australians harbour rich and unique genomic diversity. However, Aboriginal and Torres Strait Islander ancestries are historically under-represented in genomics research and almost completely missing from reference datasets1-3. Addressing this representation gap is critical, both to advance our understanding of global human genomic diversity and as a prerequisite for ensuring equitable outcomes in genomic medicine. Here we apply population-scale whole-genome long-read sequencing4 to profile genomic structural variation across four remote Indigenous communities. We uncover an abundance of large insertion-deletion variants (20-49 bp; n = 136,797), structural variants (50 bp-50 kb; n = 159,912) and regions of variable copy number (>50 kb; n = 156). The majority of variants are composed of tandem repeat or interspersed mobile element sequences (up to 90%) and have not been previously annotated (up to 62%). A large fraction of structural variants appear to be exclusive to Indigenous Australians (12% lower-bound estimate) and most of these are found in only a single community, underscoring the need for broad and deep sampling to achieve a comprehensive catalogue of genomic structural variation across the Australian continent. Finally, we explore short tandem repeats throughout the genome to characterize allelic diversity at 50 known disease loci5, uncover hundreds of novel repeat expansion sites within protein-coding genes, and identify unique patterns of diversity and constraint among short tandem repeat sequences. Our study sheds new light on the dimensions and dynamics of genomic structural variation within and beyond Australia.
Affiliation(s)
- Andre L M Reis
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Melissa Rapadas
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Jillian M Hammond
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Hasindu Gamaarachchi
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales, Australia
- Igor Stevanovski
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Meutia Ayuputeri Kumaheri
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Sanjog R Chintalaphani
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Duminda S B Dissanayake
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Institute for Applied Ecology, University of Canberra, Canberra, Australian Capital Territory, Australia
- Owen M Siggs
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia
- Department of Ophthalmology, Flinders University, Bedford Park, South Australia, Australia
- Alex W Hewitt
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Bastien Llamas
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Australian Centre for Ancient DNA, School of Biological Sciences and Environment Institute, University of Adelaide, Adelaide, South Australia, Australia
- ARC Centre of Excellence for Australian Biodiversity and Heritage, University of Adelaide, Adelaide, South Australia, Australia
- Indigenous Genomics, Telethon Kids Institute, Adelaide, South Australia, Australia
- Alex Brown
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Indigenous Genomics, Telethon Kids Institute, Adelaide, South Australia, Australia
- Gareth Baynam
- Telethon Kids Institute and Division of Paediatrics, Faculty of Health and Medical Sciences, University of Western Australia, Perth, Western Australia, Australia
- Genetic Services of Western Australia, Western Australian Department of Health, Perth, Western Australia, Australia
- Western Australian Register of Developmental Anomalies, Western Australian Department of Health, Perth, Western Australia, Australia
- Graham J Mann
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Brendan J McMorran
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Simon Easteal
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Azure Hermes
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Misty R Jenkins
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Hardip R Patel
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.
- Ira W Deveson
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales, Australia.
- Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia.