Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhao YY, Wu LY, Zhang JH, Wang RS, Zhang XS. Haplotype assembly from aligned weighted SNP fragments. Comput Biol Chem 2005;29:281-7. [PMID: 16051522 DOI: 10.1016/j.compbiolchem.2005.05.001] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Received: 10/14/2004] [Revised: 05/20/2005] [Accepted: 05/20/2005] [Indexed: 11/22/2022]

Cited by Other Article(s)

Olyaee MH, Khanteymoori A, Fazli E. A fuzzy c-means clustering approach for haplotype reconstruction based on minimum error correction. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]

A chaotic viewpoint-based approach to solve haplotype assembly using hypergraph model. PLoS One 2020;15:e0241291. [PMID: 33120403 PMCID: PMC7595403 DOI: 10.1371/journal.pone.0241291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/03/2020] [Accepted: 10/12/2020] [Indexed: 12/30/2022]

Motazedi E, Finkers R, Maliepaard C, de Ridder D. Exploiting next-generation sequencing to solve the haplotyping puzzle in polyploids: a simulation study. Brief Bioinform 2019;19:387-403. [PMID: 28065918 DOI: 10.1093/bib/bbw126] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/08/2016] [Indexed: 11/12/2022]

Bracciali A, Aldinucci M, Patterson M, Marschall T, Pisanti N, Merelli I, Torquati M. pWhatsHap: efficient haplotyping for future generation sequencing. BMC Bioinformatics 2016;17:342. [PMID: 28185544 PMCID: PMC5046197 DOI: 10.1186/s12859-016-1170-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022]

Abstract

Background

Haplotype phasing is an important problem in the analysis of genomics information. Given a set of DNA fragments of an individual, it consists of determining which one of the possible alleles (alternative forms of a gene) each fragment comes from. Haplotype information is relevant to gene regulation, epigenetics, genome-wide association studies, evolutionary and population studies, and the study of mutations. Haplotyping is currently addressed as an optimisation problem aiming at solutions that minimise, for instance, error correction costs, where costs are a measure of the confidence in the accuracy of the information acquired from DNA sequencing. Solutions typically have exponential computational complexity. WhatsHap is a recent optimal approach that moves computational complexity from DNA fragment length to fragment overlap, i.e., coverage, and is hence of particular interest given current trends in sequencing technology, which are producing longer fragments.

Results

Given the potential relevance of efficient haplotyping in several analysis pipelines, we have designed and engineered pWhatsHap, a parallel, high-performance version of WhatsHap. pWhatsHap is embedded in a toolkit developed in Python and supports genomics datasets in standard file formats. Building on WhatsHap, pWhatsHap exhibits the same complexity, exploring a number of possible solutions that is exponential in the coverage of the dataset. The parallel implementation on multi-core architectures allows for a substantial reduction of the execution time for haplotyping, while the results retain the same high accuracy as that provided by WhatsHap, which increases with coverage.

Conclusions

Due to its structure and its management of large datasets, the parallelisation of WhatsHap posed demanding technical challenges, which have been addressed by exploiting a high-level parallel programming framework. The result, pWhatsHap, is a freely available toolkit that improves the efficiency of the analysis of genomics information.
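The error-correction objective described in the Background of this abstract can be illustrated with a small sketch. The code below is not WhatsHap or pWhatsHap: it is a brute-force search over all haplotypes, exponential in the number of SNPs rather than in the coverage, and it assumes an all-heterozygous diploid model. The function names (`wmec_cost`, `brute_force_wmec`), the fragment encoding, and the toy weights are hypothetical choices made for illustration only.

```python
from itertools import product


def wmec_cost(fragments, haplotype):
    """Weighted error-correction cost of the pair (haplotype, its complement).

    fragments: list of dicts mapping SNP index -> (allele, weight), allele in {0, 1}.
    haplotype: tuple of 0/1 alleles; the second haplotype is assumed to be its
    bitwise complement (all-heterozygous assumption, made for this sketch).
    """
    total = 0
    for frag in fragments:
        # Weight that must be "corrected" if the fragment is assigned to the haplotype.
        cost_h = sum(w for i, (a, w) in frag.items() if a != haplotype[i])
        # Weight that must be corrected if it is assigned to the complement instead.
        cost_c = sum(w for i, (a, w) in frag.items() if a == haplotype[i])
        # Each fragment goes to whichever haplotype is cheaper.
        total += min(cost_h, cost_c)
    return total


def brute_force_wmec(fragments, n_snps):
    """Return (minimum cost, haplotype) by enumerating every candidate haplotype."""
    best = None
    for h in product((0, 1), repeat=n_snps):
        if h[0] == 1:
            continue  # (h, complement) and (complement, h) describe the same solution
        cost = wmec_cost(fragments, h)
        if best is None or cost < best[0]:
            best = (cost, h)
    return best


# Toy input: four fragments over three SNP positions (weights are made up).
fragments = [
    {0: (0, 10), 1: (0, 10)},
    {1: (0, 10), 2: (1, 10)},
    {0: (1, 10), 1: (1, 10), 2: (0, 10)},
    {0: (0, 10), 2: (0, 2)},  # the low-confidence call at SNP 2 conflicts with the rest
]
best_cost, best_haplotype = brute_force_wmec(fragments, 3)
print(best_cost, best_haplotype)
```

In this toy input the only conflict is a low-weight allele call in the last fragment, so the cheapest solution corrects that single entry. Weighting the corrections by confidence is what lets a low-quality call be overruled by higher-quality ones.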

Chen ZZ, Deng F, Shen C, Wang Y, Wang L. Better ILP-Based Approaches to Haplotype Assembly. J Comput Biol 2016;23:537-52. [DOI: 10.1089/cmb.2015.0035] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023]

Rhee JK, Li H, Joung JG, Hwang KB, Zhang BT, Shin SY. Survey of computational haplotype determination methods for single individual. Genes Genomics 2015. [DOI: 10.1007/s13258-015-0342-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]

Pirola Y, Zaccaria S, Dondi R, Klau GW, Pisanti N, Bonizzoni P. HapCol: accurate and memory-efficient haplotype assembly from long reads. Bioinformatics 2015;32:1610-7. [PMID: 26315913 DOI: 10.1093/bioinformatics/btv495] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 04/07/2015] [Accepted: 08/10/2015] [Indexed: 12/30/2022]

Abstract

MOTIVATION

Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly benefits greatly from the advent of 'future-generation' sequencing technologies and their capability to produce long reads at increasing coverage. Existing methods are not able to deal with such data in a fully satisfactory way, either because accuracy or performance degrades as read length and sequencing coverage increase or because they are based on restrictive assumptions.

RESULTS

By exploiting a feature of future-generation technologies, namely the uniform distribution of sequencing errors, we designed an exact algorithm, called HapCol, that is exponential in the maximum number of corrections for each single-nucleotide polymorphism position and that minimizes the overall error-correction score. We performed an experimental analysis, comparing HapCol with the current state-of-the-art combinatorial methods on both real and simulated data. On a standard benchmark of real data, we show that HapCol is competitive with state-of-the-art methods, improving the accuracy and the number of phased positions. Furthermore, experiments on realistically simulated datasets revealed that HapCol requires significantly fewer computing resources, especially memory. Thanks to its computational efficiency, HapCol can overcome the limits of previous approaches, making it possible to phase datasets with higher coverage and without the traditional all-heterozygous assumption.

AVAILABILITY AND IMPLEMENTATION

Our source code is available under the terms of the GNU General Public License at http://hapcol.algolab.eu/

CONTACT

bonizzoni@disco.unimib.it

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Ahn S, Vikalo H. Joint haplotype assembly and genotype calling via sequential Monte Carlo algorithm. BMC Bioinformatics 2015;16:223. [PMID: 26178880 PMCID: PMC4503296 DOI: 10.1186/s12859-015-0651-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/25/2014] [Accepted: 06/26/2015] [Indexed: 01/01/2023]

Abstract

Background

Genetic variations predispose individuals to hereditary diseases, play an important role in the development of complex diseases, and impact drug metabolism. The full information about the DNA variations in the genome of an individual is given by haplotypes, the ordered lists of single nucleotide polymorphisms (SNPs) located on chromosomes. Affordable high-throughput DNA sequencing technologies enable routine acquisition of data needed for the assembly of single individual haplotypes. However, state-of-the-art high-throughput sequencing platforms generate data that is erroneous, which induces uncertainty in the SNP and genotype calling procedures and, ultimately, adversely affects the accuracy of haplotyping. When inferring haplotype phase information, the vast majority of the existing techniques for haplotype assembly assume that the genotype information is correct. This motivates the development of methods capable of joint genotype calling and haplotype assembly.

Results

We present a haplotype assembly algorithm, ParticleHap, that relies on a probabilistic description of the sequencing data to jointly infer genotypes and assemble the most likely haplotypes. Our method employs a deterministic sequential Monte Carlo algorithm that associates single nucleotide polymorphisms with haplotypes by exhaustively exploring all possible extensions of the partial haplotypes. The algorithm relies on genotype likelihoods rather than on often erroneously called genotypes, thus ensuring a more accurate assembly of the haplotypes. Results on both 1000 Genomes Project experimental data and simulation studies demonstrate that the proposed approach enables highly accurate solutions to the haplotype assembly problem while being computationally efficient and scalable, generally outperforming existing methods in terms of both accuracy and speed.

Conclusions

The developed probabilistic framework and sequential Monte Carlo algorithm enable joint haplotype assembly and genotyping in a computationally efficient manner. Our results demonstrate fast and highly accurate haplotype assembly aided by the re-examination of erroneously called genotypes.

A C code implementation of ParticleHap will be available for download from https://sites.google.com/site/asynoeun/particlehap.

Safonova Y, Bankevich A, Pevzner PA. dipSPAdes: Assembler for Highly Polymorphic Diploid Genomes. J Comput Biol 2015;22:528-45. [PMID: 25734602 DOI: 10.1089/cmb.2014.0153] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Indexed: 11/13/2022]

Patterson M, Marschall T, Pisanti N, van Iersel L, Stougie L, Klau GW, Schönhuth A. WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads. J Comput Biol 2015;22:498-509. [PMID: 25658651 DOI: 10.1089/cmb.2014.0157] [Citation(s) in RCA: 211] [Impact Index Per Article: 23.4] [Indexed: 02/02/2023]

An effective haplotype assembly algorithm based on hypergraph partitioning. J Theor Biol 2014;358:85-92. [DOI: 10.1016/j.jtbi.2014.05.034] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 01/13/2014] [Revised: 05/08/2014] [Accepted: 05/25/2014] [Indexed: 11/20/2022]

Patterson M, Marschall T, Pisanti N, van Iersel L, Stougie L, Klau GW, Schönhuth A. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads. Lecture Notes in Computer Science 2014. [DOI: 10.1007/978-3-319-05269-4_19] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 01/18/2023]

Wu J, Liang B. A fast and accurate algorithm for diploid individual haplotype reconstruction. J Bioinform Comput Biol 2013;11:1350010. [PMID: 23859274 DOI: 10.1142/s0219720013500108] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]

Chen ZZ, Deng F, Wang L. Exact algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics 2013;29:1938-45. [DOI: 10.1093/bioinformatics/btt349] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Indexed: 11/14/2022]

Deng F, Cui W, Wang L. A highly accurate heuristic algorithm for the haplotype assembly problem. BMC Genomics 2013;14 Suppl 2:S2. [PMID: 23445458 PMCID: PMC3582451 DOI: 10.1186/1471-2164-14-s2-s2] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/29/2023]

HMEC: A Heuristic Algorithm for Individual Haplotyping with Minimum Error Correction. ISRN Bioinformatics 2013;2013:291741. [PMID: 25969753 PMCID: PMC4393065 DOI: 10.1155/2013/291741] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/26/2012] [Accepted: 12/12/2012] [Indexed: 11/18/2022]

Aguiar D, Istrail S. HapCompass: a fast cycle basis algorithm for accurate haplotype assembly of sequence data. J Comput Biol 2012;19:577-90. [PMID: 22697235 DOI: 10.1089/cmb.2012.0084] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Indexed: 11/13/2022]

Wang TC, Taheri J, Zomaya AY. Using genetic algorithm in reconstructing single individual haplotype with minimum error correction. J Biomed Inform 2012;45:922-30. [DOI: 10.1016/j.jbi.2012.03.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/18/2011] [Revised: 12/09/2011] [Accepted: 03/19/2012] [Indexed: 11/24/2022]

Duitama J, McEwen GK, Huebsch T, Palczewski S, Schulz S, Verstrepen K, Suk EK, Hoehe MR. Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of Single Individual Haplotyping techniques. Nucleic Acids Res 2011;40:2041-53. [PMID: 22102577 PMCID: PMC3299995 DOI: 10.1093/nar/gkr1042] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Indexed: 12/26/2022]

Geraci F. A comparison of several algorithms for the single individual SNP haplotyping reconstruction problem. Bioinformatics 2010;26:2217-25. [PMID: 20624781 PMCID: PMC2935405 DOI: 10.1093/bioinformatics/btq411] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 12/10/2009] [Revised: 06/14/2010] [Accepted: 07/06/2010] [Indexed: 11/15/2022]

Kang SH, Jeong IS, Cho HG, Lim HS. HapAssembler: A web server for haplotype assembly from SNP fragments using genetic algorithm. Biochem Biophys Res Commun 2010;397:340-4. [DOI: 10.1016/j.bbrc.2010.05.125] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/14/2010] [Accepted: 05/24/2010] [Indexed: 12/27/2022]

Chen Z, Fu B, Schweller R, Yang B, Zhao Z, Zhu B. Linear time probabilistic algorithms for the singular haplotype reconstruction problem from SNP fragments. J Comput Biol 2008;15:535-46. [PMID: 18549306 DOI: 10.1089/cmb.2008.0003] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]

Genovese LM, Geraci F, Pellegrini M. SpeedHap: an accurate heuristic for the single individual SNP haplotyping problem with many gaps, high reading error rate and low coverage. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008;5:492-502. [PMID: 18989037 DOI: 10.1109/tcbb.2008.67] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 05/27/2023]

Xie M, Wang J, Chen J. A model of higher accuracy for the individual haplotyping problem based on weighted SNP fragments and genotype with errors. Bioinformatics 2008;24:i105-13. [PMID: 18586702 PMCID: PMC2718625 DOI: 10.1093/bioinformatics/btn147] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]

A Fast and Accurate Heuristic for the Single Individual SNP Haplotyping Problem with Many Gaps, High Reading Error Rate and Low Coverage. Lecture Notes in Computer Science 2007. [DOI: 10.1007/978-3-540-74126-8_6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 02/11/2023]