Reference Citation Analysis

For: Kim JH, Waterman MS, Li LM. Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi. Genome Res 2007;17:1101-10. [PMID: 17567986 PMCID: PMC1899121 DOI: 10.1101/gr.5894107] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Indexed: 01/30/2023]

Cited by Other Article(s)

Sankararaman A, Vikalo H, Baccelli F. ComHapDet: a spatial community detection algorithm for haplotype assembly. BMC Genomics 2020;21:586. [PMID: 32900369 PMCID: PMC7488034 DOI: 10.1186/s12864-020-06935-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022] Open

Igarashi K, Funakoshi M, Kato S, Moriwaki T, Kato Y, Zhang-Akiyama QM. CiApex1 has AP endonuclease activity and abrogated AP site repair disrupts early embryonic development in Ciona intestinalis. Genes Genet Syst 2019;94:81-93. [PMID: 30930342 DOI: 10.1266/ggs.18-00043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open

Hashemi A, Zhu B, Vikalo H. Sparse Tensor Decomposition for Haplotype Assembly of Diploids and Polyploids. BMC Genomics 2018;19:191. [PMID: 29589554 PMCID: PMC5872563 DOI: 10.1186/s12864-018-4551-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022] Open

Abstract

BACKGROUND

Haplotype assembly is the task of reconstructing haplotypes of an individual from a mixture of sequenced chromosome fragments. Haplotype information enables studies of the effects of genetic variations on an organism's phenotype. Most of the mathematical formulations of haplotype assembly are known to be NP-hard, and haplotype assembly becomes even more challenging as the sequencing technology advances and the length of the paired-end reads and inserts increases. Assembly of haplotypes of polyploid organisms is considerably more difficult than in the case of diploids. Hence, scalable and accurate schemes with provable performance are desired for haplotype assembly of both diploid and polyploid organisms.

RESULTS

We propose a framework that formulates haplotype assembly from sequencing data as a sparse tensor decomposition. We cast the problem as that of decomposing a tensor having special structural constraints and missing a large fraction of its entries into a product of two factors, U and [Formula: see text]; tensor [Formula: see text] reveals haplotype information while U is a sparse matrix encoding the origin of erroneous sequencing reads. An algorithm, AltHap, which reconstructs haplotypes of either diploid or polyploid organisms by iteratively solving this decomposition problem is proposed. The performance and convergence properties of AltHap are theoretically analyzed and, in doing so, guarantees on the achievable minimum error correction scores and correct phasing rate are established. The developed framework is applicable to diploid, biallelic and polyallelic polyploid species. The code for AltHap is freely available from https://github.com/realabolfazl/AltHap .

CONCLUSION

AltHap was tested in a number of different scenarios and was shown to compare favorably to state-of-the-art methods in applications to haplotype assembly of diploids, and to significantly outperform existing techniques when applied to haplotype assembly of polyploids.
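The abstract above refers to minimum error correction (MEC) scores. As a rough illustration of that metric only, and not of the AltHap algorithm itself, the sketch below computes an MEC score for a candidate set of haplotypes. The data layout (lists of alleles with `None` for uncovered sites) and the function names `hamming` and `mec_score` are assumptions made for this example.

```python
from typing import Optional, Sequence

# Hypothetical layout: a fragment holds one allele per SNP site,
# with None at sites the fragment does not cover.
Fragment = Sequence[Optional[int]]
Haplotype = Sequence[int]


def hamming(fragment: Fragment, haplotype: Haplotype) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(
        1
        for allele, true_allele in zip(fragment, haplotype)
        if allele is not None and allele != true_allele
    )


def mec_score(fragments: Sequence[Fragment], haplotypes: Sequence[Haplotype]) -> int:
    """Sum, over fragments, the fewest corrections needed to match any haplotype."""
    return sum(min(hamming(f, h) for h in haplotypes) for f in fragments)


if __name__ == "__main__":
    haplotypes = [[0, 1, 0, 1], [1, 0, 1, 0]]
    fragments = [
        [0, 1, None, None],  # consistent with the first haplotype
        [None, 0, 1, 1],     # one error relative to the second haplotype
        [1, 0, 1, None],     # consistent with the second haplotype
    ]
    print(mec_score(fragments, haplotypes))
```

Because `mec_score` takes any number of haplotypes, the same sketch applies to polyploid inputs by passing more than two candidate haplotypes.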

Ciona as a Simple Chordate Model for Heart Development and Regeneration. J Cardiovasc Dev Dis 2016;3. [PMID: 27642586 PMCID: PMC5023151 DOI: 10.3390/jcdd3030025] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 12/16/2022] Open

Zhao X, Emery SB, Myers B, Kidd JM, Mills RE. Resolving complex structural genomic rearrangements using a randomized approach. Genome Biol 2016;17:126. [PMID: 27287201 PMCID: PMC4901421 DOI: 10.1186/s13059-016-0993-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/20/2016] [Accepted: 05/25/2016] [Indexed: 12/27/2022] Open

Xing Q, Yu Q, Dou H, Wang J, Li R, Ning X, Wang R, Wang S, Zhang L, Hu X, Bao Z. Genome-wide identification, characterization and expression analyses of two TNFRs in Yesso scallop (Patinopecten yessoensis) provide insight into the disparity of responses to bacterial infections and heat stress in bivalves. Fish Shellfish Immunol 2016;52:44-56. [PMID: 26988286 DOI: 10.1016/j.fsi.2016.03.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 10/14/2015] [Revised: 01/28/2016] [Accepted: 03/10/2016] [Indexed: 05/16/2023]

Affiliation(s)

Qiang Xing: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Qian Yu: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Huaiqian Dou: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Jing Wang: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Ruojiao Li: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Xianhui Ning: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Ruijia Wang: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Shi Wang: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Lingling Zhang: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Xiaoli Hu: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China
Zhenmin Bao: Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, 5 Yushan Road, Qingdao 266003, China

Puljiz Z, Vikalo H. Decoding Genetic Variations: Communications-Inspired Haplotype Assembly. IEEE/ACM Trans Comput Biol Bioinform 2016;13:518-530. [PMID: 27295635 DOI: 10.1109/tcbb.2015.2462367] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/06/2023]

José-Edwards DS, Oda-Ishii I, Kugler JE, Passamaneck YJ, Katikala L, Nibu Y, Di Gregorio A. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord. PLoS Genet 2015;11:e1005730. [PMID: 26684323 PMCID: PMC4684326 DOI: 10.1371/journal.pgen.1005730] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/19/2015] [Accepted: 11/16/2015] [Indexed: 11/18/2022] Open

Ahn S, Vikalo H. Joint haplotype assembly and genotype calling via sequential Monte Carlo algorithm. BMC Bioinformatics 2015;16:223. [PMID: 26178880 PMCID: PMC4503296 DOI: 10.1186/s12859-015-0651-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/25/2014] [Accepted: 06/26/2015] [Indexed: 01/01/2023] Open

Abstract

Background

Genetic variations predispose individuals to hereditary diseases, play an important role in the development of complex diseases, and impact drug metabolism. The full information about the DNA variations in the genome of an individual is given by haplotypes, the ordered lists of single nucleotide polymorphisms (SNPs) located on chromosomes. Affordable high-throughput DNA sequencing technologies enable routine acquisition of data needed for the assembly of single individual haplotypes. However, state-of-the-art high-throughput sequencing platforms generate data that is erroneous, which induces uncertainty in the SNP and genotype calling procedures and, ultimately, adversely affects the accuracy of haplotyping. When inferring haplotype phase information, the vast majority of the existing techniques for haplotype assembly assume that the genotype information is correct. This motivates the development of methods capable of joint genotype calling and haplotype assembly.

Results

We present a haplotype assembly algorithm, ParticleHap, that relies on a probabilistic description of the sequencing data to jointly infer genotypes and assemble the most likely haplotypes. Our method employs a deterministic sequential Monte Carlo algorithm that associates single nucleotide polymorphisms with haplotypes by exhaustively exploring all possible extensions of the partial haplotypes. The algorithm relies on genotype likelihoods rather than on often erroneously called genotypes, thus ensuring a more accurate assembly of the haplotypes. Results on both the 1000 Genomes Project experimental data as well as simulation studies demonstrate that the proposed approach enables highly accurate solutions to the haplotype assembly problem while being computationally efficient and scalable, generally outperforming existing methods in terms of both accuracy and speed.

Conclusions

The developed probabilistic framework and sequential Monte Carlo algorithm enable joint haplotype assembly and genotyping in a computationally efficient manner. Our results demonstrate fast and highly accurate haplotype assembly aided by the re-examination of erroneously called genotypes.

A C code implementation of ParticleHap will be available for download from https://sites.google.com/site/asynoeun/particlehap.

Heterozygous genome assembly via binary classification of homologous sequence. BMC Bioinformatics 2015;16 Suppl 7:S5. [PMID: 25952609 PMCID: PMC4423727 DOI: 10.1186/1471-2105-16-s7-s5] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022] Open

Abstract

Background

Genome assemblers to date have predominantly targeted haploid reference reconstruction from homozygous data. When applied to diploid genome assembly, these assemblers perform poorly, owing to the violation of assumptions during both the contigging and scaffolding phases. Effective tools to overcome these problems are in growing demand. Increasing parameter stringency during contigging is an effective solution to obtaining haplotype-specific contigs; however, effective algorithms for scaffolding such contigs are lacking.

Methods

We present a stand-alone scaffolding algorithm, ScaffoldScaffolder, designed specifically for scaffolding diploid genomes. The algorithm identifies homologous sequences as found in "bubble" structures in scaffold graphs. Machine learning classification is used to then classify sequences in partial bubbles as homologous or non-homologous sequences prior to reconstructing haplotype-specific scaffolds. We define four new metrics for assessing diploid scaffolding accuracy: contig sequencing depth, contig homogeneity, phase group homogeneity, and heterogeneity between phase groups.

Results

We demonstrate the viability of using bubbles to identify heterozygous homologous contigs, which we term homolotigs. We show that machine learning classification trained on these homolotig pairs can be used effectively for identifying homologous sequences elsewhere in the data with high precision (assuming error-free reads).

Conclusion

More work is required to comparatively analyze this approach on real data with various parameters and classifiers against other diploid genome assembly methods. However, the initial results of ScaffoldScaffolder supply validity to the idea of employing machine learning in the difficult task of diploid genome assembly. Software is available at http://bioresearch.byu.edu/scaffoldscaffolder.

Das S, Vikalo H. SDhaP: haplotype assembly for diploids and polyploids via semi-definite programming. BMC Genomics 2015;16:260. [PMID: 25885901 PMCID: PMC4422552 DOI: 10.1186/s12864-015-1408-5] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 09/17/2014] [Accepted: 02/27/2015] [Indexed: 11/30/2022] Open

Kuleshov V. Probabilistic single-individual haplotyping. Bioinformatics 2014;30:i379-85. [PMID: 25161223 PMCID: PMC4147930 DOI: 10.1093/bioinformatics/btu484] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 12/30/2022] Open

Matsumoto H, Kiryu H. Integrating dilution-based sequencing and population genotypes for single individual haplotyping. BMC Genomics 2014;15:733. [PMID: 25167975 PMCID: PMC4162929 DOI: 10.1186/1471-2164-15-733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2013] [Accepted: 08/18/2014] [Indexed: 11/30/2022] Open

Abstract

Background

Haplotype information is useful for many genetic analyses and haplotypes are usually inferred using computational approaches. Among such approaches, the importance of single individual haplotyping (SIH), which infers individual haplotypes from sequence fragments, has been increasing with the advent of novel sequencing techniques, such as dilution-based sequencing. These techniques could produce virtual long read fragments by separating DNA fragments into multiple low-concentration aliquots, sequencing and mapping each aliquot, and merging clustered short reads. Although these experimental techniques are sophisticated, they have the problem of producing chimeric fragments whose left and right parts match different chromosomes. In our previous research, we found that chimeric fragments significantly decrease the accuracy of SIH. Although chimeric fragments can be removed by using haplotypes which are determined from pedigree genotypes, pedigree genotypes are generally not available. The length of read clusters and heterozygous calls have also been used to detect chimeric fragments. Although some chimeric fragments will be removed with these features, a considerable number of chimeric fragments will remain undetected because of the dispersion of the length and the absence of SNPs in the overlapped regions. For these reasons, a general method to detect and remove chimeric fragments is needed.

Results

In this paper, we propose a general method to detect chimeric fragments. The basis of our method is that a chimeric fragment would correspond to an artificial recombinant haplotype and would differ from biological haplotypes. To detect differences from biological haplotypes, we integrated statistical phasing, which is a haplotype inference approach from population genotypes, into our method. We applied our method to two datasets and detected chimeric fragments with high AUC. AUC values of our method are higher than those of just using cluster length and heterozygous calls. We then used multiple SIH algorithms to compare the accuracy of SIH before and after removing the chimeric fragment candidates. The accuracy of assembled haplotypes increased significantly after removing chimeric fragment candidates.
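To make the notion of a chimeric fragment concrete, the following sketch flags a fragment whose left and right parts match different haplotypes. It is not the authors' statistical-phasing method: it assumes both true haplotypes are already known (as in the pedigree-based case the abstract mentions), and the function names and the `min_gain` threshold are hypothetical choices for this example.

```python
from typing import Optional, Sequence, Tuple

Fragment = Sequence[Optional[int]]  # None marks a site the fragment does not cover
Haplotype = Sequence[int]


def mismatches(fragment: Fragment, haplotype: Haplotype, start: int, end: int) -> int:
    """Count disagreements on covered sites in the half-open range [start, end)."""
    return sum(
        1
        for i in range(start, end)
        if fragment[i] is not None and fragment[i] != haplotype[i]
    )


def best_single_switch(
    fragment: Fragment, hap_a: Haplotype, hap_b: Haplotype
) -> Tuple[Optional[int], int]:
    """Return (switch position, mismatches) for the best one-switch explanation."""
    n = len(fragment)
    best_pos, best_cost = None, n + 1
    for k in range(1, n):  # sites before k follow one haplotype, the rest the other
        for left, right in ((hap_a, hap_b), (hap_b, hap_a)):
            cost = mismatches(fragment, left, 0, k) + mismatches(fragment, right, k, n)
            if cost < best_cost:
                best_pos, best_cost = k, cost
    return best_pos, best_cost


def looks_chimeric(
    fragment: Fragment, hap_a: Haplotype, hap_b: Haplotype, min_gain: int = 2
) -> bool:
    """Flag a fragment if allowing one switch removes at least min_gain mismatches."""
    n = len(fragment)
    no_switch = min(mismatches(fragment, hap_a, 0, n), mismatches(fragment, hap_b, 0, n))
    _, with_switch = best_single_switch(fragment, hap_a, hap_b)
    return no_switch - with_switch >= min_gain
```

The `min_gain` threshold is there so that a single sequencing error near one end of a fragment is not mistaken for a switch between chromosomes.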

Conclusions

Our method is useful for detecting chimeric fragments and improving SIH accuracy. The Ruby script is available at https://sites.google.com/site/hmatsu1226/software/csp.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-733) contains supplementary material, which is available to authorized users.

Rimmer A, Phan H, Mathieson I, Iqbal Z, Twigg SRF, Wilkie AOM, McVean G, Lunter G. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat Genet 2014;46:912-918. [PMID: 25017105 PMCID: PMC4753679 DOI: 10.1038/ng.3036] [Citation(s) in RCA: 709] [Impact Index Per Article: 70.9] [Received: 11/22/2013] [Accepted: 06/23/2014] [Indexed: 12/19/2022]

Kajitani R, Toshimoto K, Noguchi H, Toyoda A, Ogura Y, Okuno M, Yabana M, Harada M, Nagayasu E, Maruyama H, Kohara Y, Fujiyama A, Hayashi T, Itoh T. Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads. Genome Res 2014;24:1384-95. [PMID: 24755901 PMCID: PMC4120091 DOI: 10.1101/gr.170720.113] [Citation(s) in RCA: 743] [Impact Index Per Article: 74.3] [Indexed: 12/20/2022]

Affiliation(s)

Rei Kajitani: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan
Kouta Toshimoto: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan; AXIOHELIX Co. Ltd., Chuo-ku, Tokyo 103-0015, Japan
Hideki Noguchi: Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
Atsushi Toyoda: Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan; Center for Information Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
Yoshitoshi Ogura: Division of Microbial Genomics, Frontier Science Research Center, University of Miyazaki, Miyazaki 889-1692, Japan; Division of Microbiology, Faculty of Medicine, University of Miyazaki, Miyazaki 889-1692, Japan
Miki Okuno: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan
Mitsuru Yabana: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan
Masayuki Harada: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan
Eiji Nagayasu: Division of Parasitology, Faculty of Medicine, University of Miyazaki, Miyazaki 889-1692, Japan
Haruhiko Maruyama: Division of Parasitology, Faculty of Medicine, University of Miyazaki, Miyazaki 889-1692, Japan
Yuji Kohara: Genetic Strains Research Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
Asao Fujiyama: Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan; Center for Information Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
Tetsuya Hayashi: Division of Microbial Genomics, Frontier Science Research Center, University of Miyazaki, Miyazaki 889-1692, Japan; Division of Microbiology, Faculty of Medicine, University of Miyazaki, Miyazaki 889-1692, Japan
Takehiko Itoh: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan

Sequencing, assembling, and correcting draft genomes using recombinant populations. G3 (Bethesda) 2014;4:669-79. [PMID: 24531727 PMCID: PMC4059239 DOI: 10.1534/g3.114.010264] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/26/2022]

Abstract

Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences—which we call recombinant population genome construction—that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross.

Holland LZ. Genomics, evolution and development of amphioxus and tunicates: The Goldilocks principle. J Exp Zool B Mol Dev Evol 2014;324:342-52. [DOI: 10.1002/jez.b.22569] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 11/25/2013] [Revised: 01/29/2014] [Accepted: 02/27/2014] [Indexed: 11/10/2022]

Parallel evolution of chordate cis-regulatory code for development. PLoS Genet 2013;9:e1003904. [PMID: 24282393 PMCID: PMC3836708 DOI: 10.1371/journal.pgen.1003904] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 01/23/2013] [Accepted: 09/09/2013] [Indexed: 12/17/2022] Open

Abstract

Urochordates are the closest relatives of vertebrates and at the larval stage, possess a characteristic bilateral chordate body plan. In vertebrates, the genes that orchestrate embryonic patterning are in part regulated by highly conserved non-coding elements (CNEs), yet these elements have not been identified in urochordate genomes. Consequently the evolution of the cis-regulatory code for urochordate development remains largely uncharacterised. Here, we use genome-wide comparisons between C. intestinalis and C. savignyi to identify putative urochordate cis-regulatory sequences. Ciona conserved non-coding elements (ciCNEs) are associated with largely the same key regulatory genes as vertebrate CNEs. Furthermore, some of the tested ciCNEs are able to activate reporter gene expression in both zebrafish and Ciona embryos, in a pattern that at least partially overlaps that of the gene they associate with, despite the absence of sequence identity. We also show that the ability of a ciCNE to up-regulate gene expression in vertebrate embryos can in some cases be localised to short sub-sequences, suggesting that functional cross-talk may be defined by small regions of ancestral regulatory logic, although functional sub-sequences may also be dispersed across the whole element. We conclude that the structure and organisation of cis-regulatory modules is very different between vertebrates and urochordates, reflecting their separate evolutionary histories. However, functional cross-talk still exists because the same repertoire of transcription factors has likely guided their parallel evolution, exploiting similar sets of binding sites but in different combinations.

Vertebrates share many aspects of early development with our closest chordate ancestors, the tunicates. However, whilst the repertoire of genes that orchestrate development is essentially the same in the two lineages, the genomic code that regulates these genes appears to be very different, even though it is highly conserved within vertebrates themselves. Using comparative genomics, we have identified a parallel developmental code in tunicates and confirmed that this code, despite a lack of sequence conservation, associates with a similar repertoire of genes. However, the organisation of the code spatially is very different in the two lineages, strongly suggesting that most of it arose independently in vertebrates and tunicates, and in most cases lacking any direct sequence ancestry. We have assayed elements of the tunicate code, and found that at least some of them can regulate gene expression in zebrafish embryos. Our results suggest that regulatory code has arisen independently in different animal lineages but possesses some common functionality because its evolution has been driven by a similar cohort of developmental transcription factors. Our work helps illuminate how complex, stable gene regulatory networks evolve and become fixed within lineages.

Cutter AD, Jovelin R, Dey A. Molecular hyperdiversity and evolution in very large populations. Mol Ecol 2013;22:2074-95. [PMID: 23506466 PMCID: PMC4065115 DOI: 10.1111/mec.12281] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 10/26/2012] [Revised: 01/24/2013] [Accepted: 01/29/2013] [Indexed: 02/06/2023]

Matsumoto H, Kiryu H. MixSIH: a mixture model for single individual haplotyping. BMC Genomics 2013;14 Suppl 2:S5. [PMID: 23445519 PMCID: PMC3582441 DOI: 10.1186/1471-2164-14-s2-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

Haplotype information is useful for various genetic analyses, including genome-wide association studies. Determining haplotypes experimentally is difficult and there are several computational approaches that infer haplotypes from genomic data. Among such approaches, single individual haplotyping or haplotype assembly, which infers two haplotypes of an individual from aligned sequence fragments, has been attracting considerable attention. To avoid incorrect results in downstream analyses, it is important not only to assemble haplotypes as long as possible but also to provide means to extract highly reliable haplotype regions. Although there are several efficient algorithms for solving haplotype assembly, there is no efficient method that allows for extracting the regions assembled with high confidence.

RESULTS

We develop a probabilistic model, called MixSIH, for solving the haplotype assembly problem. The model has two mixture components representing two haplotypes. Based on the optimized model, a quality score is defined, which we call the 'minimum connectivity' (MC) score, for each segment in the haplotype assembly. Because existing accuracy measures for haplotype assembly are designed to compare the efficiency between the algorithms and are not suitable for evaluating the quality of the set of partially assembled haplotype segments, we develop an accuracy measure based on the pairwise consistency and evaluate the accuracy on simulated and real data. By using the MC scores, our algorithm can extract highly accurate haplotype segments. We also show evidence that an existing experimental dataset contains chimeric read fragments derived from different haplotypes, which significantly degrade the quality of assembled haplotypes.

CONCLUSIONS

We develop a novel method for solving the haplotype assembly problem. We also define the quality score which is based on our model and indicates the accuracy of the haplotype segments. In our evaluation, MixSIH has successfully extracted reliable haplotype segments. The C++ source code of MixSIH is available at https://sites.google.com/site/hmatsu1226/software/mixsih.
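The abstract describes a model with two mixture components, one per haplotype. As a generic illustration of that idea, and not a reproduction of MixSIH or its MC score, the sketch below computes the posterior probability that a fragment came from each of two haplotypes. The independent per-site error model, the equal prior on the two components, and the names `fragment_likelihood` and `responsibilities` are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence, Tuple

Fragment = Sequence[Optional[int]]  # None marks a site the fragment does not cover
Haplotype = Sequence[int]


def fragment_likelihood(fragment: Fragment, haplotype: Haplotype, error_rate: float) -> float:
    """Probability of the observed alleles if the fragment was read from this haplotype.

    Assumes each covered site is miscalled independently with probability error_rate,
    where 0 < error_rate < 1.
    """
    log_like = 0.0
    for allele, true_allele in zip(fragment, haplotype):
        if allele is None:
            continue  # uncovered sites carry no information
        log_like += math.log(1.0 - error_rate if allele == true_allele else error_rate)
    return math.exp(log_like)


def responsibilities(
    fragment: Fragment, hap_a: Haplotype, hap_b: Haplotype, error_rate: float = 0.01
) -> Tuple[float, float]:
    """Posterior probability of each haplotype as the fragment's origin (equal priors)."""
    like_a = fragment_likelihood(fragment, hap_a, error_rate)
    like_b = fragment_likelihood(fragment, hap_b, error_rate)
    total = like_a + like_b
    return like_a / total, like_b / total
```

In an expectation-maximization style fit, values like these would be recomputed after each update of the haplotype estimates. A fragment whose posterior stays near 0.5 carries little phasing information.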

Stolfi A, Christiaen L. Genetic and genomic toolbox of the chordate Ciona intestinalis. Genetics 2012;192:55-66. [PMID: 22964837 PMCID: PMC3430545 DOI: 10.1534/genetics.112.140590] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 03/18/2012] [Accepted: 04/30/2012] [Indexed: 02/01/2023] Open

Beaster-Jones L. Cis-regulation and conserved non-coding elements in amphioxus. Brief Funct Genomics 2012;11:118-30. [DOI: 10.1093/bfgp/els006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open

Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet 2012;44:226-32. [PMID: 22231483 PMCID: PMC3272472 DOI: 10.1038/ng.1028] [Citation(s) in RCA: 352] [Impact Index Per Article: 29.3] [Received: 04/08/2011] [Accepted: 11/07/2011] [Indexed: 12/24/2022]

Carnevali P, Baccash J, Halpern AL, Nazarenko I, Nilsen GB, Pant KP, Ebert JC, Brownley A, Morenzoni M, Karpinchyk V, Martin B, Ballinger DG, Drmanac R. Computational techniques for human genome resequencing using mated gapped reads. J Comput Biol 2011;19:279-92. [PMID: 22175250 DOI: 10.1089/cmb.2011.0201] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Indexed: 11/12/2022] Open

Kim JH, Kim WC, Li LM, Park S. HapEdit: an accuracy assessment viewer for haplotype assembly using massively parallel DNA-sequencing technologies. Nucleic Acids Res 2011;39:W557-61. [PMID: 21576217 PMCID: PMC3125762 DOI: 10.1093/nar/gkr354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open

The importance of phase information for human genomics. Nat Rev Genet 2011;12:215-23. [PMID: 21301473 DOI: 10.1038/nrg2950] [Citation(s) in RCA: 191] [Impact Index Per Article: 14.7] [Indexed: 12/17/2022]

Veeman MT, Chiba S, Smith WC. Ciona genetics. Methods Mol Biol 2011;770:401-22. [PMID: 21805273 DOI: 10.1007/978-1-61779-210-6_15] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 12/23/2022]

Haplotype-resolved genome sequencing of a Gujarati Indian individual. Nat Biotechnol 2010;29:59-63. [PMID: 21170042 DOI: 10.1038/nbt.1740] [Citation(s) in RCA: 184] [Impact Index Per Article: 13.1] [Received: 10/26/2010] [Accepted: 11/29/2010] [Indexed: 11/08/2022]

Tsagkogeorga G, Turon X, Galtier N, Douzery EJP, Delsuc F. Accelerated evolutionary rate of housekeeping genes in tunicates. J Mol Evol 2010;71:153-67. [PMID: 20697701 DOI: 10.1007/s00239-010-9372-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 03/10/2010] [Accepted: 07/16/2010] [Indexed: 01/11/2023]

Smith JJ, Saha NR, Amemiya CT. Genome biology of the cyclostomes and insights into the evolutionary biology of vertebrate genomes. Integr Comp Biol 2010;50:130-7. [PMID: 21558194 PMCID: PMC3140258 DOI: 10.1093/icb/icq023] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 01/27/2023] Open

Haubold B, Pfaffelhuber P, Lynch M. mlRho - a program for estimating the population mutation and recombination rates from shotgun-sequenced diploid genomes. Mol Ecol 2010;19 Suppl 1:277-84. [PMID: 20331786 DOI: 10.1111/j.1365-294x.2009.04482.x] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Detection and correction of false segmental duplications caused by genome mis-assembly. Genome Biol 2010;11:R28. [PMID: 20219098 PMCID: PMC2864568 DOI: 10.1186/gb-2010-11-3-r28] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2009] [Revised: 12/11/2009] [Accepted: 03/10/2010] [Indexed: 11/23/2022] Open

Frith MC, Wan R, Horton P. Incorporating sequence quality data into alignment improves DNA read mapping. Nucleic Acids Res 2010;38:e100. [PMID: 20110255 PMCID: PMC2853142 DOI: 10.1093/nar/gkq010] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Vavouri T, Lehner B. Conserved noncoding elements and the evolution of animal body plans. Bioessays 2009;31:727-35. [PMID: 19492354 DOI: 10.1002/bies.200900014] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]

Long Q, MacArthur D, Ning Z, Tyler-Smith C. HI: haplotype improver using paired-end short reads. Bioinformatics 2009;25:2436-7. [PMID: 19570807 PMCID: PMC2735667 DOI: 10.1093/bioinformatics/btp412] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open

Kim JH, Kim WC, Waterman MS, Park S, Li LM. HAPLOWSER: a whole-genome haplotype browser for personal genome and metagenome. Bioinformatics 2009;25:2430-1. [PMID: 19561337 DOI: 10.1093/bioinformatics/btp399] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Barrière A, Yang SP, Pekarek E, Thomas CG, Haag ES, Ruvinsky I. Detecting heterozygosity in shotgun genome assemblies: Lessons from obligately outcrossing nematodes. Genome Res 2009;19:470-80. [PMID: 19204328 DOI: 10.1101/gr.081851.108] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

An MCMC algorithm for haplotype assembly from whole-genome sequence data. Genome Res 2008;18:1336-46. [PMID: 18676820 DOI: 10.1101/gr.077065.108] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]

Bansal V, Bafna V. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. Bioinformatics 2008;24:i153-9. [DOI: 10.1093/bioinformatics/btn298] [Citation(s) in RCA: 225] [Impact Index Per Article: 14.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Thakur NL, Jain R, Natalio F, Hamer B, Thakur AN, Müller WE. Marine molecular biology: An emerging field of biological sciences. Biotechnol Adv 2008;26:233-45. [DOI: 10.1016/j.biotechadv.2008.01.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2007] [Revised: 01/03/2008] [Accepted: 01/03/2008] [Indexed: 12/17/2022]

Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, Walenz BP, Axelrod N, Huang J, Kirkness EF, Denisov G, Lin Y, MacDonald JR, Pang AWC, Shago M, Stockwell TB, Tsiamouri A, Bafna V, Bansal V, Kravitz SA, Busam DA, Beeson KY, McIntosh TC, Remington KA, Abril JF, Gill J, Borman J, Rogers YH, Frazier ME, Scherer SW, Strausberg RL, Venter JC. The diploid genome sequence of an individual human. PLoS Biol 2007;5:e254. [PMID: 17803354 PMCID: PMC1964779 DOI: 10.1371/journal.pbio.0050254] [Citation(s) in RCA: 1117] [Impact Index Per Article: 69.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2007] [Accepted: 07/30/2007] [Indexed: 01/20/2023] Open

Harada Y, Takagaki Y, Sunagawa M, Saito T, Yamada L, Taniguchi H, Shoguchi E, Sawada H. Mechanism of self-sterility in a hermaphroditic chordate. Science 2008;320:548-50. [PMID: 18356489 DOI: 10.1126/science.1152488] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]

Denisov G, Walenz B, Halpern AL, Miller J, Axelrod N, Levy S, Sutton G. Consensus generation and variant detection by Celera Assembler. Bioinformatics 2008;24:1035-40. [PMID: 18321888 DOI: 10.1093/bioinformatics/btn074] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open