Reference Citation Analysis

For: Yang WY, Hormozdiari F, Wang Z, He D, Pasaniuc B, Eskin E. Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data. Bioinformatics 2013;29:2245-52. [PMID: 23825370 PMCID: PMC3753566 DOI: 10.1093/bioinformatics/btt386] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/17/2012] [Revised: 06/19/2013] [Accepted: 06/28/2013] [Indexed: 01/05/2023]



Cited by Other Article(s)

Sun S, Cheng F, Han D, Wei S, Zhong A, Massoudian S, Johnson AB. Pairwise comparative analysis of six haplotype assembly methods based on users' experience. BMC Genom Data 2023;24:35. [PMID: 37386408 PMCID: PMC10311811 DOI: 10.1186/s12863-023-01134-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 05/25/2023] [Indexed: 07/01/2023]

Abstract

BACKGROUND

A haplotype is a set of DNA variants inherited together from one parent or chromosome. Haplotype information is useful for studying genetic variation and disease association. Haplotype assembly (HA) is a process of obtaining haplotypes using DNA sequencing data. Currently, there are many HA methods with their own strengths and weaknesses. This study focused on comparing six HA methods or algorithms: HapCUT2, MixSIH, PEATH, WhatsHap, SDhaP, and MAtCHap using two NA12878 datasets named hg19 and hg38. The 6 HA algorithms were run on chromosome 10 of these two datasets, each with 3 filtering levels based on sequencing depth (DP1, DP15, and DP30). Their outputs were then compared.

RESULT

Run time (CPU time) was compared to assess the efficiency of the six HA methods. HapCUT2 was the fastest method on all six datasets, with run time consistently under 2 min. WhatsHap was also relatively fast, with a run time of 21 min or less on all six datasets. The run time of the other four HA algorithms varied across datasets and coverage levels. To assess accuracy, pairwise comparisons were conducted for each pair of the six packages by generating their disagreement rates for both haplotype blocks and single nucleotide variants (SNVs). The authors also compared them using switch distance (error), i.e., the number of positions where two chromosomes of a certain phase must be switched to match the known haplotype. HapCUT2, PEATH, MixSIH, and MAtCHap generated output files with similar numbers of blocks and SNVs, and they had relatively similar performance. WhatsHap generated a much larger number of SNVs in the hg19 DP1 output, which caused it to have high disagreement percentages with the other methods. However, for the hg38 data, WhatsHap performed similarly to the other four algorithms, SDhaP excepted. The comparison analysis showed that SDhaP had a much larger disagreement rate when compared with the other algorithms in all six datasets.

CONCLUSION

The comparative analysis is important because each algorithm is different. The findings of this study provide a deeper understanding of the performance of currently available HA algorithms and useful input for other users.
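
The switch distance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one predicted haplotype and one known haplotype of a diploid pair, encoded as '0'/'1' strings over the same heterozygous sites; the function name and encoding are hypothetical and are not taken from any of the six packages.

```python
def switch_errors(predicted: str, truth: str) -> int:
    """Count phase switches needed to make `predicted` match `truth`.

    Both arguments are one haplotype of a diploid pair, written as strings
    of '0'/'1' alleles over the same heterozygous sites (an assumed encoding).
    """
    if len(predicted) != len(truth):
        raise ValueError("haplotypes must cover the same sites")
    # Per site: does the prediction agree with the truth (True), or does it
    # match the complementary haplotype instead (False)?
    agrees = [p == t for p, t in zip(predicted, truth)]
    # A switch is any point where that agreement state flips between
    # neighbouring sites.
    return sum(a != b for a, b in zip(agrees, agrees[1:]))


if __name__ == "__main__":
    print(switch_errors("0011", "0000"))  # one switch between sites 2 and 3
    print(switch_errors("1111", "0000"))  # fully flipped phase, no switch
```

A fully complemented prediction counts as zero switches here, because the labelling of the two haplotypes is arbitrary.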


Paes J, Silva GAV, Tarragô AM, Mourão LPDS. The Contribution of JAK2 46/1 Haplotype in the Predisposition to Myeloproliferative Neoplasms. Int J Mol Sci 2022;23:12582. [PMID: 36293440 PMCID: PMC9604447 DOI: 10.3390/ijms232012582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 10/13/2022] [Accepted: 10/15/2022] [Indexed: 11/17/2022]

Ura H, Togi S, Niida Y. Targeted Double-Stranded cDNA Sequencing-Based Phase Analysis to Identify Compound Heterozygous Mutations and Differential Allelic Expression. Biology 2021;10:256. [PMID: 33804940 PMCID: PMC8063809 DOI: 10.3390/biology10040256] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/05/2021] [Revised: 03/22/2021] [Accepted: 03/22/2021] [Indexed: 11/16/2022]

Abstract

Simple Summary

Phase analysis to distinguish between in cis and in trans heterozygous mutations is important for clinical diagnosis because in trans compound heterozygous mutations cause autosomal recessive diseases. However, conventional phase analysis is limited because of the large target size of genomic DNA. Here, we performed a targeted double-stranded cDNA sequencing-based phase analysis to resolve the limitation of distance using direct adapter ligation library preparation and paired-end sequencing; we elucidated that two heterozygous mutations on a patient with Wilson disease are in trans compound heterozygous mutations. Furthermore, we detected the differential allelic expression. Our results indicate that a targeted double-stranded cDNA sequencing-based phase analysis is useful for determining compound heterozygous mutations and confers information on allelic expression.

Abstract

There are two combinations of heterozygous mutation, i.e., in trans, which carries mutations on different alleles, and in cis, which carries mutations on the same allele. Because only in trans compound heterozygous mutations have been implicated in autosomal recessive diseases, it is important to distinguish them for clinical diagnosis. However, conventional phase analysis is limited because of the large target size of genomic DNA. Here, we performed a genetic analysis on a patient with Wilson disease, and we detected two heterozygous mutations chr13:51958362;G>GG (NM_000053.4:c.2304dup r.2304dup p.Met769HisfsTer26) and chr13:51964900;C>T (NM_000053.4:c.1841G>A r.1841g>a p.Gly614Asp) in the causative gene ATP7B. The distance between the two mutations was 6.5 kb in genomic DNA but 464 bp in mRNA. Targeted double-stranded cDNA sequencing-based phase analysis was performed using direct adapter ligation library preparation and paired-end sequencing, and we elucidated they are in trans compound heterozygous mutations. Trio analysis showed that the mutation (chr13:51964900;C>T) derived from the father and the other mutation from the mother, validating that the mutations are in trans composition. Furthermore, targeted double-stranded cDNA sequencing-based phase analysis detected the differential allelic expression, suggesting that the mutation (chr13:51958362;G>GG) caused downregulation of expression by nonsense-mediated mRNA decay. Our results indicate that targeted double-stranded cDNA sequencing-based phase analysis is useful for determining compound heterozygous mutations and confers information on allelic expression.
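
The cis/trans distinction described above can be sketched as a simple count over sequenced fragments that cover both variant sites. This is an illustrative assumption about how such a call might be made, not the authors' pipeline; the function name, the 'ref'/'alt' labels, and the majority rule are all hypothetical.

```python
from collections import Counter


def call_phase(fragments):
    """Call 'cis', 'trans', or 'undetermined' for two heterozygous variants.

    `fragments` is an iterable of (allele_at_site1, allele_at_site2) tuples,
    one per read pair covering both sites; each allele is 'ref' or 'alt'.
    """
    counts = Counter(fragments)
    # In cis: both mutations on the same allele, so fragments carry either
    # both alternate alleles or both reference alleles.
    cis = counts[("alt", "alt")] + counts[("ref", "ref")]
    # In trans: the mutations sit on different alleles, so each fragment
    # carries exactly one alternate allele.
    trans = counts[("alt", "ref")] + counts[("ref", "alt")]
    if cis == trans:
        return "undetermined"
    return "cis" if cis > trans else "trans"


if __name__ == "__main__":
    observed = [("alt", "ref")] * 40 + [("ref", "alt")] * 25 + [("alt", "alt")] * 2
    print(call_phase(observed))
```

In the example, the unequal counts of the two trans configurations loosely mirror the differential allelic expression the abstract reports, while the few discordant fragments stand in for sequencing error.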


Mocci E, Debeljak M, Klein AP, Eshleman JR. A New Fast Phasing Method Based On Haplotype Subtraction. J Mol Diagn 2019;21:427-436. [PMID: 30872187 PMCID: PMC6504677 DOI: 10.1016/j.jmoldx.2018.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2018] [Revised: 10/26/2018] [Accepted: 12/31/2018] [Indexed: 11/16/2022]

Tian S, Yan H, Klee EW, Kalmbach M, Slager SL. Comparative analysis of de novo assemblers for variation discovery in personal genomes. Brief Bioinform 2019;19:893-904. [PMID: 28407084 PMCID: PMC6169673 DOI: 10.1093/bib/bbx037] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/12/2016] [Accepted: 03/08/2017] [Indexed: 12/30/2022]

He D, Saha S, Finkers R, Parida L. Efficient algorithms for polyploid haplotype phasing. BMC Genomics 2018;19:110. [PMID: 29764364 PMCID: PMC5954289 DOI: 10.1186/s12864-018-4464-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 11/17/2022]

Hashemi A, Zhu B, Vikalo H. Sparse Tensor Decomposition for Haplotype Assembly of Diploids and Polyploids. BMC Genomics 2018;19:191. [PMID: 29589554 PMCID: PMC5872563 DOI: 10.1186/s12864-018-4551-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022]

Abstract

BACKGROUND

Haplotype assembly is the task of reconstructing haplotypes of an individual from a mixture of sequenced chromosome fragments. Haplotype information enables studies of the effects of genetic variations on an organism's phenotype. Most of the mathematical formulations of haplotype assembly are known to be NP-hard, and haplotype assembly becomes even more challenging as sequencing technology advances and the length of the paired-end reads and inserts increases. Assembly of haplotypes of polyploid organisms is considerably more difficult than in the case of diploids. Hence, scalable and accurate schemes with provable performance are desired for haplotype assembly of both diploid and polyploid organisms.

RESULTS

We propose a framework that formulates haplotype assembly from sequencing data as a sparse tensor decomposition. We cast the problem as that of decomposing a tensor having special structural constraints and missing a large fraction of its entries into a product of two factors, U and [Formula: see text]; tensor [Formula: see text] reveals haplotype information while U is a sparse matrix encoding the origin of erroneous sequencing reads. An algorithm, AltHap, which reconstructs haplotypes of either diploid or polyploid organisms by iteratively solving this decomposition problem is proposed. The performance and convergence properties of AltHap are theoretically analyzed and, in doing so, guarantees on the achievable minimum error correction scores and correct phasing rate are established. The developed framework is applicable to diploid, biallelic and polyallelic polyploid species. The code for AltHap is freely available from https://github.com/realabolfazl/AltHap .

CONCLUSION

AltHap was tested in a number of different scenarios and was shown to compare favorably to state-of-the-art methods in applications to haplotype assembly of diploids, and significantly outperforms existing techniques when applied to haplotype assembly of polyploids.
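
The minimum error correction (MEC) score named in the abstract above can be made concrete with a small sketch for the diploid, biallelic case. This is not AltHap's tensor decomposition and does not cover polyploids; the read encoding ('0'/'1' alleles, '-' for uncovered sites) and the function names are assumptions made for illustration.

```python
def hamming(read: str, hap: str) -> int:
    """Mismatches between a read and a haplotype, skipping uncovered sites.

    '-' in the read marks a site the read does not cover (assumed encoding).
    """
    return sum(1 for r, h in zip(read, hap) if r != "-" and r != h)


def mec_score(reads, hap: str) -> int:
    """Diploid MEC score of haplotype `hap` and its complement.

    Each read is assigned to whichever of the two haplotypes it matches
    better; the score is the total number of allele calls that would have
    to be corrected for every read to fit its assigned haplotype.
    """
    complement = hap.translate(str.maketrans("01", "10"))
    return sum(min(hamming(r, hap), hamming(r, complement)) for r in reads)


if __name__ == "__main__":
    reads = ["01--", "-10-", "10-1", "--11"]
    print(mec_score(reads, "0101"))
```

A lower score means the candidate haplotype pair explains the reads with fewer assumed sequencing errors, which is why assembly methods commonly try to minimise it.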


Castel SE, Mohammadi P, Chung WK, Shen Y, Lappalainen T. Rare variant phasing and haplotypic expression from RNA sequencing with phASER. Nat Commun 2016;7:12817. [PMID: 27605262 PMCID: PMC5025529 DOI: 10.1038/ncomms12817] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 06/04/2016] [Accepted: 08/03/2016] [Indexed: 11/09/2022]

Xie M, Wang J, Chen X. LGH: A Fast and Accurate Algorithm for Single Individual Haplotyping Based on a Two-Locus Linkage Graph. IEEE/ACM Trans Comput Biol Bioinform 2015;12:1255-1266. [PMID: 26671798 DOI: 10.1109/tcbb.2015.2430352] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]

Rhee JK, Li H, Joung JG, Hwang KB, Zhang BT, Shin SY. Survey of computational haplotype determination methods for single individual. Genes Genomics 2015. [DOI: 10.1007/s13258-015-0342-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]

Wang Y, Liang B, Tong X, Marder K, Bressman S, Orr-Urtreger A, Giladi N, Zeng D. Efficient Estimation of Nonparametric Genetic Risk Function with Censored Data. Biometrika 2015;102:515-532. [PMID: 26412864 PMCID: PMC4581539 DOI: 10.1093/biomet/asv030] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/23/2023]

Haplotype-resolved genome sequencing: experimental methods and applications. Nat Rev Genet 2015;16:344-58. [PMID: 25948246 DOI: 10.1038/nrg3903] [Citation(s) in RCA: 119] [Impact Index Per Article: 13.2] [Indexed: 12/15/2022]

Patterson M, Marschall T, Pisanti N, van Iersel L, Stougie L, Klau GW, Schönhuth A. WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads. J Comput Biol 2015;22:498-509. [PMID: 25658651 DOI: 10.1089/cmb.2014.0157] [Citation(s) in RCA: 211] [Impact Index Per Article: 23.4] [Indexed: 02/02/2023]

Glusman G, Cox HC, Roach JC. Whole-genome haplotyping approaches and genomic medicine. Genome Med 2014;6:73. [PMID: 25473435 PMCID: PMC4254418 DOI: 10.1186/s13073-014-0073-7] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Indexed: 11/10/2022]

Matsumoto H, Kiryu H. Integrating dilution-based sequencing and population genotypes for single individual haplotyping. BMC Genomics 2014;15:733. [PMID: 25167975 PMCID: PMC4162929 DOI: 10.1186/1471-2164-15-733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2013] [Accepted: 08/18/2014] [Indexed: 11/30/2022]

Abstract

Background

Haplotype information is useful for many genetic analyses, and haplotypes are usually inferred using computational approaches. Among such approaches, the importance of single individual haplotyping (SIH), which infers individual haplotypes from sequence fragments, has been increasing with the advent of novel sequencing techniques, such as dilution-based sequencing. These techniques can produce virtual long read fragments by separating DNA fragments into multiple low-concentration aliquots, sequencing and mapping each aliquot, and merging clustered short reads. Although these experimental techniques are sophisticated, they have the problem of producing chimeric fragments whose left and right parts match different chromosomes. In our previous research, we found that chimeric fragments significantly decrease the accuracy of SIH. Although chimeric fragments can be removed by using haplotypes determined from pedigree genotypes, pedigree genotypes are generally not available. The length of read clusters and heterozygous calls have also been used to detect chimeric fragments. Although some chimeric fragments will be removed with these features, a considerable number of chimeric fragments will go undetected because of the dispersion of the length and the absence of SNPs in the overlapped regions. For these reasons, a general method to detect and remove chimeric fragments is needed.

Results

In this paper, we propose a general method to detect chimeric fragments. The basis of our method is that a chimeric fragment would correspond to an artificial recombinant haplotype and would differ from biological haplotypes. To detect differences from biological haplotypes, we integrated statistical phasing, which is a haplotype inference approach from population genotypes, into our method. We applied our method to two datasets and detected chimeric fragments with high AUC. The AUC values of our method are higher than those obtained using only cluster length and heterozygous calls. We then used multiple SIH algorithms to compare the accuracy of SIH before and after removing the chimeric fragment candidates. The accuracy of assembled haplotypes increased significantly after removing chimeric fragment candidates.

Conclusions

Our method is useful for detecting chimeric fragments and improving SIH accuracy. The Ruby script is available at https://sites.google.com/site/hmatsu1226/software/csp.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-733) contains supplementary material, which is available to authorized users.
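
The AUC used above to evaluate chimeric fragment detection can be computed directly from detector scores. The sketch below uses the standard rank interpretation of AUC (the probability that a randomly chosen chimeric fragment scores higher than a randomly chosen non-chimeric one); the scores and function name are made up for illustration and do not come from the paper or its Ruby script.

```python
def auc(scores_chimeric, scores_normal) -> float:
    """Area under the ROC curve from two lists of detector scores.

    Counts, over all (chimeric, normal) pairs, how often the chimeric
    fragment receives the higher score; ties count as half a win.
    """
    if not scores_chimeric or not scores_normal:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for c in scores_chimeric:
        for n in scores_normal:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_chimeric) * len(scores_normal))


if __name__ == "__main__":
    # Hypothetical scores: higher means "more likely chimeric".
    print(auc([0.9, 0.8], [0.1, 0.85]))
```

An AUC of 0.5 corresponds to a detector no better than chance, and 1.0 to perfect separation of chimeric from non-chimeric fragments.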


Mangul S, Wu NC, Mancuso N, Zelikovsky A, Sun R, Eskin E. Accurate viral population assembly from ultra-deep sequencing data. Bioinformatics 2014;30:i329-37. [PMID: 24932001 PMCID: PMC4058922 DOI: 10.1093/bioinformatics/btu295] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Indexed: 12/17/2022]

Affiliation(s)

Serghei Mangul, Nicholas C Wu, Nicholas Mancuso, Alex Zelikovsky, Ren Sun, Eleazar Eskin: Computer Science Department, Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA, Department of Computer Science, Georgia State University, Atlanta, GA, 30303 and Department of Human Genetics, University of California, Los Angeles, CA 90095, USA


Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nat Commun 2014;5:3934. [PMID: 25653097 PMCID: PMC4338501 DOI: 10.1038/ncomms4934] [Citation(s) in RCA: 290] [Impact Index Per Article: 29.0] [Received: 08/19/2013] [Accepted: 04/23/2014] [Indexed: 12/15/2022]

Fujimoto M, Bodily PM, Okuda N, Clement MJ, Snell Q. Effects of error-correction of heterozygous next-generation sequencing data. BMC Bioinformatics 2014;15 Suppl 7:S3. [PMID: 25077414 PMCID: PMC4110727 DOI: 10.1186/1471-2105-15-s7-s3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]

Patterson M, Marschall T, Pisanti N, van Iersel L, Stougie L, Klau GW, Schönhuth A. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads. Lecture Notes in Computer Science 2014. [DOI: 10.1007/978-3-319-05269-4_19] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 01/18/2023]