Reference Citation Analysis

For: Blanchette M. Computation and Analysis of Genomic Multi-Sequence Alignments. Annu Rev Genomics Hum Genet 2007;8:193-213. [PMID: 17489682 DOI: 10.1146/annurev.genom.8.080706.092300] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 11/09/2022]

Cited by Other Article(s)

Rahmani RS, Decap D, Fostier J, Marchal K. BLSSpeller to discover novel regulatory motifs in maize. DNA Res 2022;29:6651838. [PMID: 35904558 PMCID: PMC9358016 DOI: 10.1093/dnares/dsac029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Indexed: 11/13/2022]

Korotkov EV, Suvorova YM, Kostenko DO, Korotkova MA. Multiple Alignment of Promoter Sequences from the Arabidopsis thaliana L. Genome. Genes (Basel) 2021;12:135. [PMID: 33494278 PMCID: PMC7909805 DOI: 10.3390/genes12020135] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/18/2020] [Revised: 01/15/2021] [Accepted: 01/18/2021] [Indexed: 11/16/2022]

Khelik K, Lagesen K, Sandve GK, Rognes T, Nederbragt AJ. NucDiff: in-depth characterization and annotation of differences between two sets of DNA sequences. BMC Bioinformatics 2017;18:338. [PMID: 28701187 PMCID: PMC5508607 DOI: 10.1186/s12859-017-1748-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 09/23/2016] [Accepted: 07/04/2017] [Indexed: 12/05/2022]

Abstract

Background

Comparing sets of sequences is a situation frequently encountered in bioinformatics, examples being comparing an assembly to a reference genome, or two genomes to each other. The purpose of the comparison is usually to find where the two sets differ, e.g. to find where a subsequence is repeated or deleted, or where insertions have been introduced. Such comparisons can be done using whole-genome alignments. Several tools for making such alignments exist, but none of them 1) provides detailed information about the types and locations of all differences between the two sets of sequences, 2) enables visualisation of alignment results at different levels of detail, and 3) carefully takes genomic repeats into consideration.

Results

We here present NucDiff, a tool aimed at locating and categorizing differences between two sets of closely related DNA sequences. NucDiff is able to deal with very fragmented genomes, repeated sequences, and various local differences and structural rearrangements. NucDiff determines differences by a rigorous analysis of alignment results obtained by the NUCmer, delta-filter and show-snps programs in the MUMmer sequence alignment package. All differences found are categorized according to a carefully defined classification scheme covering all possible differences between two sequences. Information about the differences is made available as GFF3 files, thus enabling visualisation using genome browsers as well as usage of the results as a component in an analysis pipeline. NucDiff was tested with varying parameters for the alignment step and compared with existing alternatives, called QUAST and dnadiff.
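Because the differences are reported as GFF3, downstream tools can consume them with very little code. The following is a minimal sketch, assuming only the standard nine-column, tab-separated GFF3 layout; the function name `tally_gff3_types` and the feature type names in the example records are hypothetical and are not taken from NucDiff's actual output.

```python
from collections import Counter


def tally_gff3_types(gff3_text):
    """Count features per type (GFF3 column 3) in GFF3-formatted text."""
    counts = Counter()
    for line in gff3_text.splitlines():
        # Skip blank lines and directives/comments such as "##gff-version 3".
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[fields[2]] += 1
    return counts


# Hypothetical records: the type labels are illustrative only.
example = "\n".join([
    "##gff-version 3",
    "contig1\texample\tinsertion\t100\t105\t.\t+\t.\tID=d1",
    "contig1\texample\tdeletion\t250\t250\t.\t+\t.\tID=d2",
    "contig2\texample\tinsertion\t10\t12\t.\t+\t.\tID=d3",
])
print(tally_gff3_types(example))
```

A per-type tally like this is one simple way to use such files as a component in an analysis pipeline, alongside loading them into a genome browser.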

Conclusions

We have developed a whole genome alignment difference classification scheme together with the program NucDiff for finding such differences. The proposed classification scheme is comprehensive and can be used by other tools. NucDiff performs comparably to QUAST and dnadiff but gives much more detailed results that can easily be visualized. NucDiff is freely available on https://github.com/uio-cels/NucDiff under the MPL license.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-017-1748-z) contains supplementary material, which is available to authorized users.

Kwak D, Kam A, Becerra D, Zhou Q, Hops A, Zarour E, Kam A, Sarmenta L, Blanchette M, Waldispühl J. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment. Genome Biol 2014;14:R116. [PMID: 24148814 DOI: 10.1186/gb-2013-14-10-r116] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/11/2013] [Accepted: 10/22/2013] [Indexed: 11/10/2022]

Wang B, Kennedy MA. Principal components analysis of protein sequence clusters. J Struct Funct Genomics 2014;15:1-11. [PMID: 24496727 DOI: 10.1007/s10969-014-9173-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/30/2013] [Accepted: 01/24/2014] [Indexed: 12/21/2022]

Soto W, Becerra D. A Multi-Objective Evolutionary Algorithm for Improving Multiple Sequence Alignments. Advances in Bioinformatics and Computational Biology 2014. [DOI: 10.1007/978-3-319-12418-6_10] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 01/12/2023]

Blanchette M. Exploiting ancestral mammalian genomes for the prediction of human transcription factor binding sites. BMC Bioinformatics 2012;13 Suppl 19:S2. [PMID: 23281809 PMCID: PMC3526440 DOI: 10.1186/1471-2105-13-s19-s2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]

Sun H, Buhler JD. PhyLAT: a phylogenetic local alignment tool. Bioinformatics 2012;28:1336-44. [PMID: 22492645 PMCID: PMC3465089 DOI: 10.1093/bioinformatics/bts158] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 09/09/2011] [Revised: 03/29/2012] [Accepted: 03/30/2012] [Indexed: 11/14/2022]

Kawrykow A, Roumanis G, Kam A, Kwak D, Leung C, Wu C, Zarour E, Sarmenta L, Blanchette M, Waldispühl J. Phylo: a citizen science approach for improving multiple sequence alignment. PLoS One 2012;7:e31362. [PMID: 22412834 PMCID: PMC3296692 DOI: 10.1371/journal.pone.0031362] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 11/18/2011] [Accepted: 01/09/2012] [Indexed: 01/07/2023]

Abstract

Background

Comparative genomics, or the study of the relationships of genome structure and function across different species, offers a powerful tool for studying evolution, annotating genomes, and understanding the causes of various genetic disorders. However, aligning multiple sequences of DNA, an essential intermediate step for most types of analyses, is a difficult computational task. In parallel, citizen science, an approach that takes advantage of the fact that the human brain is exquisitely tuned to solving specific types of problems, is becoming increasingly popular. There, instances of hard computational problems are dispatched to a crowd of non-expert human game players and solutions are sent back to a central server.

Methodology/Principal Findings

We introduce Phylo, a human-based computing framework applying “crowd sourcing” techniques to solve the Multiple Sequence Alignment (MSA) problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with minimal prior knowledge of the biological context. We applied this strategy to improve the alignment of the promoters of disease-related genes from up to 44 vertebrate species. Since the launch in November 2010, we have received more than 350,000 solutions submitted by more than 12,000 registered users. Our results show that the submitted solutions contributed to improving the accuracy of up to 70% of the alignment blocks considered.
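To make "improving an alignment" concrete, any such game needs a numeric objective that rewards matching columns and penalizes gaps. The sketch below uses the textbook sum-of-pairs score as a stand-in; it is an assumption for illustration and is not claimed to be the scoring function Phylo itself uses. The function name and the match, mismatch, and gap values are hypothetical.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment (list of equal-length gapped strings) column by column."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned rows must have the same length")
    score = 0
    for column in zip(*alignment):
        # Compare every unordered pair of characters in this column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # a gap aligned with a gap carries no score
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


# A player's move corresponds to shifting gaps; a better arrangement scores higher.
before = ["ACGT-", "A-CGT", "ACGT-"]
after = ["ACGT", "ACGT", "ACGT"]
print(sum_of_pairs(before), sum_of_pairs(after))
```

Under this kind of objective, a submitted solution counts as an improvement when its score exceeds that of the alignment produced by the classical algorithm.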

Conclusions/Significance

We demonstrate that, combined with classical algorithms, crowd computing techniques can be successfully used to help improve the accuracy of MSA. More importantly, we show that an NP-hard computational problem can be embedded in a casual game that can be easily played by people without significant scientific training. This suggests that citizen science approaches can be used to exploit the billions of “human-brain peta-flops” of computation that are spent every day playing games. Phylo is available at: http://phylo.cs.mcgill.ca.

Sacan A, Ekins S, Kortagere S. Applications and limitations of in silico models in drug discovery. Methods Mol Biol 2012;910:87-124. [PMID: 22821594 DOI: 10.1007/978-1-61779-965-5_6] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 02/07/2023]

Kortagere S, Lill M, Kerrigan J. Role of computational methods in pharmaceutical sciences. Methods Mol Biol 2012;929:21-48. [PMID: 23007425 DOI: 10.1007/978-1-62703-050-2_3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/08/2022]

Kim J, Ma J. PSAR: measuring multiple sequence alignment reliability by probabilistic sampling. Nucleic Acids Res 2011;39:6359-68. [PMID: 21576232 PMCID: PMC3159474 DOI: 10.1093/nar/gkr334] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Received: 01/14/2011] [Revised: 04/18/2011] [Accepted: 04/24/2011] [Indexed: 11/14/2022]

Dewey CN. Positional orthology: putting genomic evolutionary relationships into context. Brief Bioinform 2011;12:401-12. [PMID: 21705766 PMCID: PMC3178058 DOI: 10.1093/bib/bbr040] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Indexed: 12/31/2022]

Belal NA, Heath LS. A theoretical model for whole genome alignment. J Comput Biol 2011;18:705-28. [PMID: 21210739 DOI: 10.1089/cmb.2010.0101] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]

Mahmood K, Konagurthu AS, Song J, Buckle AM, Webb GI, Whisstock JC. EGM: encapsulated gene-by-gene matching to identify gene orthologs and homologous segments in genomes. Bioinformatics 2010;26:2076-84. [DOI: 10.1093/bioinformatics/btq339] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023]

Comparative assessment of methods for aligning multiple genome sequences. Nat Biotechnol 2010;28:567-72. [PMID: 20495551 DOI: 10.1038/nbt.1637] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 12/21/2009] [Accepted: 04/27/2010] [Indexed: 01/22/2023]

Bickel PJ, Brown JB, Huang H, Li Q. An overview of recent developments in genomics and associated statistical methods. Philos Trans A Math Phys Eng Sci 2009;367:4313-37. [PMID: 19805447 DOI: 10.1098/rsta.2009.0164] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 05/21/2023]

Buschiazzo E, Gemmell NJ. Evolutionary and phylogenetic significance of platypus microsatellites conserved in mammalian and other vertebrate genomes. Aust J Zool 2009. [DOI: 10.1071/zo09038] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]

Lin MF, Deoras AN, Rasmussen MD, Kellis M. Performance and scalability of discriminative metrics for comparative gene identification in 12 Drosophila genomes. PLoS Comput Biol 2008;4:e1000067. [PMID: 18421375 PMCID: PMC2291194 DOI: 10.1371/journal.pcbi.1000067] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 10/22/2007] [Accepted: 03/20/2008] [Indexed: 01/22/2023]

Yavatkar AS, Lin Y, Ross J, Fann Y, Brody T, Odenwald WF. Rapid detection and curation of conserved DNA via enhanced-BLAT and EvoPrinterHD analysis. BMC Genomics 2008;9:106. [PMID: 18307801 PMCID: PMC2268679 DOI: 10.1186/1471-2164-9-106] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 10/17/2007] [Accepted: 02/28/2008] [Indexed: 12/04/2022]

Abstract

Background

Multi-genome comparative analysis has yielded important insights into the molecular details of gene regulation. We have developed EvoPrinter, a web-accessed genomics tool that provides a single uninterrupted view of conserved sequences as they appear in a species of interest. An EvoPrint reveals with near base-pair resolution those sequences that are essential for gene function.

Results

We describe here EvoPrinterHD, a second-generation comparative genomics tool that automatically generates from a single input sequence an enhanced view of sequence conservation between evolutionarily distant species. Currently available for 5 nematode, 3 mosquito, 12 Drosophila, 20 vertebrate, 17 Staphylococcus and 20 enteric bacteria genomes, EvoPrinterHD employs a modified BLAT algorithm [enhanced-BLAT (eBLAT)], which detects up to 75% more conserved bases than identified by the BLAT alignments used in the earlier EvoPrinter program. The new program also identifies conserved sequences within rearranged DNA, highlights repetitive DNA, and detects sequencing gaps. EvoPrinterHD currently holds over 112 billion bp of indexed genomes in memory and has the flexibility of selecting a subset of genomes for analysis. An EvoDifferences profile is also generated to portray conserved sequences that are uniquely lost in any one of the orthologs. Finally, EvoPrinterHD incorporates options that allow for (1) re-initiation of the analysis using a different genome's aligning region as the reference DNA to detect species-specific changes in less-conserved regions, (2) rapid extraction and curation of conserved sequences, and (3) for bacteria, identification of unique or uniquely shared sequences present in subsets of genomes.

Conclusion

EvoPrinterHD is a fast, high-resolution comparative genomics tool that automatically generates an uninterrupted species-centric view of sequence conservation and enables the discovery of conserved sequences within rearranged DNA. When combined with cis-Decoder, a program that discovers sequence elements shared among tissue specific enhancers, EvoPrinterHD facilitates the analysis of conserved sequences that are essential for coordinate gene regulation.
