1
|
Jia P, Dong L, Yang X, Wang B, Bush SJ, Wang T, Lin J, Wang S, Zhao X, Xu T, Che Y, Dang N, Ren L, Zhang Y, Wang X, Liang F, Wang Y, Ruan J, Xia H, Zheng Y, Shi L, Lv Y, Wang J, Ye K. Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet. Genome Biol 2023; 24:277. [PMID: 38049885 PMCID: PMC10694985 DOI: 10.1186/s13059-023-03116-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2023] [Accepted: 11/21/2023] [Indexed: 12/06/2023] Open
Abstract
BACKGROUND Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent-child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity-including those located at long repeat regions, complex structural variants, and de novo mutations-are systematically examined in this study. CONCLUSIONS In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories.
Collapse
Affiliation(s)
- Peng Jia
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Lianhua Dong
- National Institute of Metrology, Beijing, 100029, China
| | - Xiaofei Yang
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
| | - Bo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Stephen J Bush
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Tingjie Wang
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Jiadong Lin
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Songbo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Xixi Zhao
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Tun Xu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Yizhuo Che
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Ningxin Dang
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
| | - Yujing Zhang
- National Institute of Metrology, Beijing, 100029, China
| | - Xia Wang
- National Institute of Metrology, Beijing, 100029, China
| | - Fan Liang
- GrandOmics Biosciences, Beijing, 100089, China
| | - Yang Wang
- GrandOmics Biosciences, Beijing, 100089, China
| | - Jue Ruan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Han Xia
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
| | - Yi Lv
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China.
| | - Jing Wang
- National Institute of Metrology, Beijing, 100029, China.
| | - Kai Ye
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China.
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China.
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China.
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China.
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China.
- Faculty of Science, Leiden University, Leiden, 2311EZ, The Netherlands.
| |
Collapse
|
2
|
Liu Z, Zhao G, Xiao Y, Zeng S, Yuan Y, Zhou X, Fang Z, He R, Li B, Zhao Y, Pan H, Wang Y, Yu G, Peng IF, Wang D, Meng Q, Xu Q, Sun Q, Yan X, Shen L, Jiang H, Xia K, Wang J, Guo J, Liang F, Li J, Tang B. Profiling the Genome-Wide Landscape of Short Tandem Repeats by Long-Read Sequencing. Front Genet 2022; 13:810595. [PMID: 35601492 PMCID: PMC9117641 DOI: 10.3389/fgene.2022.810595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2021] [Accepted: 03/30/2022] [Indexed: 11/23/2022] Open
Abstract
Background: Short tandem repeats (STRs) are highly variable elements that play a pivotal role in multiple genetic diseases and the regulation of gene expression. Long-read sequencing (LRS) offers a potential solution to genome-wide STR analysis. However, characterizing STRs in human genomes using LRS on a large population scale has not been reported. Methods: We conducted the large LRS-based STR analysis in 193 unrelated samples of the Chinese population and performed genome-wide profiling of STR variation in the human genome. The repeat dynamic index (RDI) was introduced to evaluate the variability of STR. We sourced the expression data from the Genotype-Tissue Expression to explore the tissue specificity of highly variable STRs related genes across tissues. Enrichment analyses were also conducted to identify potential functional roles of the high variable STRs. Results: This study reports the large-scale analysis of human STR variation by LRS and offers a reference STR database based on the LRS dataset. We found that the disease-associated STRs (dSTRs) and STRs associated with the expression of nearby genes (eSTRs) were highly variable in the general population. Moreover, tissue-specific expression analysis showed that those highly variable STRs related genes presented the highest expression level in brain tissues, and enrichment pathways analysis found those STRs are involved in synaptic function-related pathways. Conclusion: Our study profiled the genome-wide landscape of STR using LRS and highlighted the highly variable STRs in the human genome, which provide a valuable resource for studying the role of STRs in human disease and complex traits.
Collapse
Affiliation(s)
- Zhenhua Liu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Guihu Zhao
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | | | - Sheng Zeng
- Department of Geriatrics, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Yanchun Yuan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Xun Zhou
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Zhenghuan Fang
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Runcheng He
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Bin Li
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Yuwen Zhao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Hongxu Pan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Yige Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | | | | | | | - Qingtuan Meng
- Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
| | - Qian Xu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Qiying Sun
- Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
| | - Xinxiang Yan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Lu Shen
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Hong Jiang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
| | - Kun Xia
- Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
| | - Junling Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Jifeng Guo
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Fan Liang
- GrandOmics Biosciences, Beijing, China
- *Correspondence: Beisha Tang, ; Jinchen Li, ; Fan Liang,
| | - Jinchen Li
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- *Correspondence: Beisha Tang, ; Jinchen Li, ; Fan Liang,
| | - Beisha Tang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
- Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
- *Correspondence: Beisha Tang, ; Jinchen Li, ; Fan Liang,
| |
Collapse
|
3
|
Haplotype-resolved Chinese male genome assembly based on high-fidelity sequencing. FUNDAMENTAL RESEARCH 2022. [DOI: 10.1016/j.fmre.2022.02.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open
|
4
|
Xu M, Guo L, Du X, Li L, Peters BA, Deng L, Wang O, Chen F, Wang J, Jiang Z, Han J, Ni M, Yang H, Xu X, Liu X, Huang J, Fan G. Accurate Haplotype-Resolved Assembly Reveals The Origin Of Structural Variants For Human Trios. Bioinformatics 2021; 37:2095-2102. [PMID: 33538292 PMCID: PMC8613828 DOI: 10.1093/bioinformatics/btab068] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2020] [Revised: 12/07/2020] [Accepted: 01/28/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Achieving a near complete understanding of how the genome of an individual affects the phenotypes of that individual requires deciphering the order of variations along homologous chromosomes in species with diploid genomes. However, true diploid assembly of long-range haplotypes remains challenging. RESULTS To address this, we have developed Haplotype-resolved Assembly for Synthetic long reads using a Trio-binning strategy, or HAST, which uses parental information to classify reads into maternal or paternal. Once sorted, these reads are used to independently de novo assemble the parent-specific haplotypes. We applied HAST to co-barcoded second-generation sequencing data from an Asian individual, resulting in a haplotype assembly covering 94.7% of the reference genome with a scaffold N50 longer than 11 Mb. The high haplotyping precision (∼99.7%) and recall (∼95.9%) represents a substantial improvement over the commonly used tool for assembling co-barcoded reads (Supernova), and is comparable to a trio-binning-based third generation long-read based assembly method (TrioCanu) but with a significantly higher single-base accuracy (up to 99.99997% (Q65)). This makes HAST a superior tool for accurate haplotyping and future haplotype-based studies. AVAILABILITY The code of the analysis is available at https://github.com/BGI-Qingdao/HAST. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Mengyang Xu
- BGI-QingDao, Qingdao, 266555, China.,BGI-Shenzhen, Shenzhen, 518083, China
| | - Lidong Guo
- BGI-QingDao, Qingdao, 266555, China.,BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, 518083, China
| | - Xiao Du
- BGI-QingDao, Qingdao, 266555, China.,BGI-Shenzhen, Shenzhen, 518083, China
| | - Lei Li
- BGI-QingDao, Qingdao, 266555, China.,School of Future Technology, University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Brock A Peters
- BGI-Shenzhen, Shenzhen, 518083, China.,Complete Genomics Inc, 2904 Orchard Pkwy, San Jose, California, 95134, USA
| | - Li Deng
- BGI-QingDao, Qingdao, 266555, China
| | - Ou Wang
- BGI-Shenzhen, Shenzhen, 518083, China
| | - Fang Chen
- MGI, BGI-Shenzhen, Shenzhen, 518083, China
| | - Jun Wang
- BGI-QingDao, Qingdao, 266555, China
| | | | | | - Ming Ni
- BGI-QingDao, Qingdao, 266555, China.,BGI-Shenzhen, Shenzhen, 518083, China
| | | | - Xun Xu
- BGI-Shenzhen, Shenzhen, 518083, China
| | - Xin Liu
- BGI-QingDao, Qingdao, 266555, China.,BGI-Shenzhen, Shenzhen, 518083, China.,State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, 518083, China
| | - Jie Huang
- National Institutes for food and drug Control (NIFDC), No.2 Tiantan Xili, Dongcheng District, Beijing, 10050, China
| | - Guangyi Fan
- BGI-QingDao, Qingdao, 266555, China.,BGI-Shenzhen, Shenzhen, 518083, China.,State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, 518083, China
| |
Collapse
|