1
Bai X, Chen Z, Chen K, Wu Z, Wang R, Liu J, Chang L, Wen L, Tang F. Simultaneous de novo calling and phasing of genetic variants at chromosome-scale using NanoStrand-seq. Cell Discov 2024; 10:74. [PMID: 38977679 PMCID: PMC11231365 DOI: 10.1038/s41421-024-00694-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 05/23/2024] [Indexed: 07/10/2024]
Abstract
The successful accomplishment of the first telomere-to-telomere human genome assembly, T2T-CHM13, marked a milestone in achieving completeness of the human reference genome. The upcoming era of genome study will focus on fully phased diploid genome assembly, with an emphasis on genetic differences between individual haplotypes. Most existing sequencing approaches only achieved localized haplotype phasing and relied on additional pedigree information for further whole-chromosome scale phasing. The short-read-based Strand-seq method is able to directly phase single nucleotide polymorphisms (SNPs) at whole-chromosome scale but falls short when it comes to phasing structural variations (SVs). To shed light on this issue, we developed a Nanopore sequencing platform-based Strand-seq approach, which we named NanoStrand-seq. This method allowed for de novo SNP calling with high precision (99.52%) and achieved a superior phasing accuracy (0.02% Hamming error rate) at whole-chromosome scale, a level of performance comparable to Strand-seq for haplotype phasing of the GM12878 genome. Importantly, we demonstrated that NanoStrand-seq can efficiently resolve the MHC locus, a highly polymorphic genomic region. Moreover, NanoStrand-seq enabled independent direct calling and phasing of deletions and insertions at whole-chromosome level; when applied to long genomic regions of SNP homozygosity, it outperformed the strategy that combined Strand-seq with bulk long-read sequencing. Finally, we showed that, like Strand-seq, NanoStrand-seq was also applicable to primary cultured cells. Together, here we provided a novel methodology that enabled interrogation of a full spectrum of haplotype-resolved SNPs and SVs at whole-chromosome scale, with broad applications for species with diploid or even potentially polyploid genomes.
Affiliation(s)
- Xiuzhen Bai
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Changping Laboratory, Beijing, China
- Zonggui Chen
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - Changping Laboratory, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Kexuan Chen
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - School of Life Sciences, Peking University, Beijing, China
- Zixin Wu
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Rui Wang
  - Department of Medicine, Cancer Institute, Stanford University, Stanford, CA, USA
- Jun'e Liu
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Changping Laboratory, Beijing, China
  - School of Life Sciences, Peking University, Beijing, China
- Liang Chang
  - State Key Laboratory of Female Fertility Promotion, Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, China
  - National Clinical Research Center for Obstetrics and Gynecology (Peking University Third Hospital), Beijing, China
  - Key Laboratory of Assisted Reproduction (Peking University), Ministry of Education Beijing, Beijing, China
  - Key Laboratory of Reproductive Endocrinology and Assisted Reproductive Technology, Beijing, China
- Lu Wen
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Changping Laboratory, Beijing, China
- Fuchou Tang
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
  - Changping Laboratory, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - School of Life Sciences, Peking University, Beijing, China
2
Akbari V, Hanlon VC, O’Neill K, Lefebvre L, Schrader KA, Lansdorp PM, Jones SJ. Parent-of-origin detection and chromosome-scale haplotyping using long-read DNA methylation sequencing and Strand-seq. Cell Genomics 2022; 3:100233. [PMID: 36777186 PMCID: PMC9903809 DOI: 10.1016/j.xgen.2022.100233] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/20/2022] [Revised: 09/08/2022] [Accepted: 11/29/2022] [Indexed: 12/24/2022]
Abstract
Hundreds of loci in human genomes have alleles that are methylated differentially according to their parent of origin. These imprinted loci generally show little variation across tissues, individuals, and populations. We show that such loci can be used to distinguish the maternal and paternal homologs for all human autosomes without the need for the parental DNA. We integrate methylation-detecting nanopore sequencing with the long-range phase information in Strand-seq data to determine the parent of origin of chromosome-length haplotypes for both DNA sequence and DNA methylation in five trios with diverse genetic backgrounds. The parent of origin was correctly inferred for all autosomes with an average mismatch error rate of 0.31% for SNVs and 1.89% for insertions or deletions (indels). Because our method can determine whether an inherited disease allele originated from the mother or the father, we predict that it will improve the diagnosis and management of many genetic diseases.
Affiliation(s)
- Vahid Akbari
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Kieran O’Neill
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Louis Lefebvre
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Kasmintan A. Schrader
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  - Department of Molecular Oncology, BC Cancer, Vancouver, BC, Canada
- Peter M. Lansdorp
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  - Terry Fox Laboratory, BC Cancer, Vancouver, BC, Canada
  - Corresponding author
- Steven J.M. Jones
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  - Corresponding author
3
Hanlon VCT, Lansdorp PM, Guryev V. A survey of current methods to detect and genotype inversions. Hum Mutat 2022; 43:1576-1589. [PMID: 36047337 DOI: 10.1002/humu.24458] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/02/2022] [Revised: 08/26/2022] [Accepted: 08/29/2022] [Indexed: 11/11/2022]
Abstract
Polymorphic inversions are ubiquitous in humans, and they have been linked to both adaptation and disease. Following their discovery in Drosophila more than a century ago, inversions have proved to be more elusive than other structural variants. A wide variety of methods for the detection and genotyping of inversions have recently been developed: multiple techniques based on selective amplification by PCR, short- and long-read sequencing approaches, principal component analysis of small variant haplotypes, template strand sequencing, optical mapping, and various genome assembly methods. Many methods apply complex wet lab protocols or increasingly refined bioinformatic analyses. This review is an attempt to provide a practical summary and comparison of the methods that are in current use, with a focus on metrics such as the maximum size of segmental duplications at inversion breakpoints that each method can tolerate, the size range of inversions that they recover, their throughput, and whether the locations of putative inversions must be known beforehand.
Affiliation(s)
- Peter M Lansdorp
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, V5Z 1L3, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, The Netherlands