For: Morgenstern B, Dress A, Werner T. Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc Natl Acad Sci U S A 1996;93:12098-103. [PMID: 8901539 PMCID: PMC37949 DOI: 10.1073/pnas.93.22.12098] [Citation(s) in RCA: 202] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open

Cited by Other Article(s)

Torres-Tiji Y, Sethuram H, Gupta A, McCauley J, Dutra-Molino JV, Pathania R, Saxton L, Kang K, Hillson NJ, Mayfield SP. Bioinformatic Prediction and High Throughput In Vivo Screening to Identify Cis-Regulatory Elements for the Development of Algal Synthetic Promoters. ACS Synth Biol 2024;13:2150-2165. [PMID: 38986010 PMCID: PMC11264317 DOI: 10.1021/acssynbio.4c00199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2024] [Revised: 06/21/2024] [Accepted: 06/24/2024] [Indexed: 07/12/2024]

Rajczewski AT, Jagtap PD, Griffin TJ. An overview of technologies for MS-based proteomics-centric multi-omics. Expert Rev Proteomics 2022;19:165-181. [PMID: 35466851 PMCID: PMC9613604 DOI: 10.1080/14789450.2022.2070476] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

Rehman HA, Zafar K, Khan A, Imtiaz A. Multiple sequence alignment using enhanced bird swarm align algorithm. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-210055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

Zhao Y, Broholm SK, Wang F, Rijpkema AS, Lan T, Albert VA, Teeri TH, Elomaa P. TCP and MADS-Box Transcription Factor Networks Regulate Heteromorphic Flower Type Identity in Gerbera hybrida. PLANT PHYSIOLOGY 2020;184:1455-1468. [PMID: 32900982 PMCID: PMC7608168 DOI: 10.1104/pp.20.00702] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/02/2020] [Accepted: 08/25/2020] [Indexed: 05/19/2023]

Magnus representation of genome sequences. J Theor Biol 2019;480:104-111. [DOI: 10.1016/j.jtbi.2019.08.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2019] [Revised: 07/30/2019] [Accepted: 08/05/2019] [Indexed: 11/24/2022]

Leimeister CA, Dencker T, Morgenstern B. Accurate multiple alignment of distantly related genome sequences using filtered spaced word matches as anchor points. Bioinformatics 2019;35:211-218. [PMID: 29992260 PMCID: PMC6330006 DOI: 10.1093/bioinformatics/bty592] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2017] [Accepted: 07/09/2018] [Indexed: 01/30/2023] Open

Multiple Sequence Alignment. Methods Mol Biol 2017;1525:167-189. [PMID: 27896722 DOI: 10.1007/978-1-4939-6622-6_8] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Ye Y, Lam TW, Ting HF. PnpProbs: a better multiple sequence alignment tool by better handling of guide trees. BMC Bioinformatics 2016;17 Suppl 8:285. [PMID: 27585754 PMCID: PMC5009527 DOI: 10.1186/s12859-016-1121-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Bérard S, Chateau A, Pompidor N, Guertin P, Bergeron A, Swenson KM. Aligning the unalignable: bacteriophage whole genome alignments. BMC Bioinformatics 2016;17:30. [PMID: 26757899 PMCID: PMC4711071 DOI: 10.1186/s12859-015-0869-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2015] [Accepted: 12/22/2015] [Indexed: 11/19/2022] Open

Abstract

Background

In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences.

Results

In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressiveMauve aligner – which implements a partial order strategy, but whose alignments are linearized – shows a greatly improved interactive graphic display, while avoiding misalignments.

Conclusions

Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains.

A Python implementation of Alpha, along with installation instructions for Ubuntu and OS X, is available on Bitbucket (https://bitbucket.org/thekswenson/alpha).

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0869-5) contains supplementary material, which is available to authorized users.

Zemali EA, Boukra A. Resolving the multiple sequence alignment problem using biogeography-based optimization with multiple populations. J Bioinform Comput Biol 2015;13:1550016. [PMID: 26055803 DOI: 10.1142/s021972001550016x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Kumar M. An enhanced algorithm for multiple sequence alignment of protein sequences using genetic algorithm. EXCLI JOURNAL 2015;14:1232-55. [PMID: 27065770 PMCID: PMC4820728 DOI: 10.17179/excli2015-302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/01/2015] [Accepted: 11/19/2015] [Indexed: 11/10/2022]

Morgenstern B. Multiple sequence alignment with DIALIGN. Methods Mol Biol 2014;1079:191-202. [PMID: 24170403 DOI: 10.1007/978-1-62703-646-7_12] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]

Schulze S, Mallmann J, Burscheidt J, Koczor M, Streubel M, Bauwe H, Gowik U, Westhoff P. Evolution of C4 photosynthesis in the genus flaveria: establishment of a photorespiratory CO2 pump. THE PLANT CELL 2013;25:2522-35. [PMID: 23847152 PMCID: PMC3753380 DOI: 10.1105/tpc.113.114520] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/04/2013] [Revised: 06/20/2013] [Accepted: 06/28/2013] [Indexed: 05/18/2023]

Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns. Bioinformatics 2013;29:2112-21. [DOI: 10.1093/bioinformatics/btt360] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

Al Ait L, Yamak Z, Morgenstern B. DIALIGN at GOBICS--multiple sequence alignment using various sources of external information. Nucleic Acids Res 2013;41:W3-7. [PMID: 23620293 PMCID: PMC3692126 DOI: 10.1093/nar/gkt283] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Astakhova TV, Lobanov MN, Poverennaya IV, Roytberg MA, Yacovlev VV. Verification of the PREFAB alignment database. Biophysics (Nagoya-shi) 2012. [DOI: 10.1134/s0006350912020030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment. BMC Bioinformatics 2011;12:353. [PMID: 21867510 PMCID: PMC3180391 DOI: 10.1186/1471-2105-12-353] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2011] [Accepted: 08/25/2011] [Indexed: 11/10/2022] Open

Castrillo G, Turck F, Leveugle M, Lecharny A, Carbonero P, Coupland G, Paz-Ares J, Oñate-Sánchez L. Speeding cis-trans regulation discovery by phylogenomic analyses coupled with screenings of an arrayed library of Arabidopsis transcription factors. PLoS One 2011;6:e21524. [PMID: 21738689 PMCID: PMC3124521 DOI: 10.1371/journal.pone.0021524] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2011] [Accepted: 05/31/2011] [Indexed: 01/27/2023] Open

Di Tommaso P, Moretti S, Xenarios I, Orobitg M, Montanyola A, Chang JM, Taly JF, Notredame C. T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension. Nucleic Acids Res 2011;39:W13-7. [PMID: 21558174 PMCID: PMC3125728 DOI: 10.1093/nar/gkr245] [Citation(s) in RCA: 800] [Impact Index Per Article: 61.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023] Open

Jin H, Kanthasamy A, Anantharam V, Rana A, Kanthasamy AG. Transcriptional regulation of pro-apoptotic protein kinase Cdelta: implications for oxidative stress-induced neuronal cell death. J Biol Chem 2011;286:19840-59. [PMID: 21467032 DOI: 10.1074/jbc.m110.203687] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

Abstract

We previously demonstrated that protein kinase Cδ (PKCδ; PKC delta) is an oxidative stress-sensitive kinase that plays a causal role in apoptotic cell death in neuronal cells. Although PKCδ activation has been extensively studied, relatively little is known about the molecular mechanisms controlling PKCδ expression. To characterize the regulation of PKCδ expression, we cloned an ∼2-kbp 5'-promoter segment of the mouse Prkcd gene. Deletion analysis indicated that the noncoding exon 1 region contained multiple Sp sites, including four GC boxes and one CACCC box, which directed the highest levels of transcription in neuronal cells. In addition, an upstream regulatory region containing adjacent repressive and anti-repressive elements with opposing regulatory activities was identified within the region -712 to -560. Detailed mutagenesis studies revealed that each Sp site made a positive contribution to PKCδ promoter expression. Overexpression of Sp family proteins markedly stimulated PKCδ promoter activity without any synergistic transactivating effect. Furthermore, experiments in Sp-deficient SL2 cells indicated long isoform Sp3 as the essential activator of PKCδ transcription. Importantly, both PKCδ promoter activity and endogenous PKCδ expression in NIE115 cells and primary striatal cultures were inhibited by mithramycin A. The results from chromatin immunoprecipitation and gel shift assays further confirmed the functional binding of Sp proteins to the PKCδ promoter. Additionally, we demonstrated that overexpression of p300 or CREB-binding protein increases the PKCδ promoter activity. This stimulatory effect requires intact Sp-binding sites and is independent of p300 histone acetyltransferase activity. Finally, modulation of Sp transcriptional activity or protein level profoundly altered the cell death induced by oxidative insult, demonstrating the functional significance of Sp-dependent PKCδ gene expression. Collectively, our findings may have implications for development of new translational strategies against oxidative damage.

α-Synuclein negatively regulates protein kinase Cδ expression to suppress apoptosis in dopaminergic neurons by reducing p300 histone acetyltransferase activity. J Neurosci 2011;31:2035-51. [PMID: 21307242 DOI: 10.1523/jneurosci.5634-10.2011] [Citation(s) in RCA: 115] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Bessadok A, Garcia E, Jacquet H, Martin S, Garrigues A, Loiseau N, André F, Orlowski S, Vivaudou M. Recognition of sulfonylurea receptor (ABCC8/9) ligands by the multidrug resistance transporter P-glycoprotein (ABCB1): functional similarities based on common structural features between two multispecific ABC proteins. J Biol Chem 2010;286:3552-69. [PMID: 21098040 DOI: 10.1074/jbc.m110.155200] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023] Open

Simmons MP, Müller KF, Norton AP. Alignment of, and phylogenetic inference from, random sequences: the susceptibility of alternative alignment methods to creating artifactual resolution and support. Mol Phylogenet Evol 2010;57:1004-16. [PMID: 20849963 DOI: 10.1016/j.ympev.2010.09.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2009] [Revised: 04/05/2010] [Accepted: 09/06/2010] [Indexed: 10/19/2022]

Bustos R, Castrillo G, Linhares F, Puga MI, Rubio V, Pérez-Pérez J, Solano R, Leyva A, Paz-Ares J. A central regulatory system largely controls transcriptional activation and repression responses to phosphate starvation in Arabidopsis. PLoS Genet 2010;6:e1001102. [PMID: 20838596 PMCID: PMC2936532 DOI: 10.1371/journal.pgen.1001102] [Citation(s) in RCA: 462] [Impact Index Per Article: 33.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2009] [Accepted: 07/29/2010] [Indexed: 01/22/2023] Open

Abstract

Plants respond to different stresses by inducing or repressing transcription of partially overlapping sets of genes. In Arabidopsis, the PHR1 transcription factor (TF) has an important role in the control of phosphate (Pi) starvation stress responses. Using transcriptomic analysis of Pi starvation in phr1, and phr1 phr1-like (phl1) mutants and in wild type plants, we show that PHR1 in conjunction with PHL1 controls most transcriptional activation and repression responses to phosphate starvation, regardless of the Pi starvation specificity of these responses. Induced genes are enriched in PHR1 binding sequences (P1BS) in their promoters, whereas repressed genes do not show such enrichment, suggesting that PHR1(-like) control of transcriptional repression responses is indirect. In agreement with this, transcriptomic analysis of a transgenic plant expressing PHR1 fused to the hormone ligand domain of the glucocorticoid receptor showed that PHR1 direct targets (i.e., displaying altered expression after GR:PHR1 activation by dexamethasone in the presence of cycloheximide) corresponded largely to Pi starvation-induced genes that are highly enriched in P1BS. A minimal promoter containing a multimerised P1BS recapitulates Pi starvation-specific responsiveness. Likewise, mutation of P1BS in the promoter of two Pi starvation-responsive genes impaired their responsiveness to Pi starvation, but not to other stress types. Phylogenetic footprinting confirmed the importance of P1BS and PHR1 in Pi starvation responsiveness and indicated that P1BS acts in concert with other cis motifs. All together, our data show that PHR1 and PHL1 are partially redundant TF acting as central integrators of Pi starvation responses, both specific and generic. In addition, they indicate that transcriptional repression responses are an integral part of adaptive responses to stress.

As sessile organisms, plants are often exposed to stress conditions, and have evolved adaptive responses to protect themselves from different types of stress. Some responses are stress type-specific whereas others are common to different stress types. Understanding how these responses are controlled is crucial for rational improvement of stress tolerance, a limiting factor in crop productivity. Here we examined the physiological and molecular responses to phosphate starvation and found that a single transcription factor family, represented by PHOSPHATE STARVATION RESPONSE REGULATOR 1 (PHR1), has a central role in the control of specific and shared phosphate starvation stress responses. In consonance with the importance of PHR1, we found that the PHR1-binding sequence, present in most PHR1 direct targets, is a crucial cis motif for Pi starvation responsiveness. An artificial promoter controlled by PHR1 recapitulates responsiveness to Pi starvation and to modulators of this response, qualifying PHR1 family members as central integrators in Pi starvation signalling. This central integrator system also controls most transcriptional repression responses to Pi starvation, indicating that they are an integral part of the adaptive response, and not a consequence of plant malfunction due to stress.

Pitschi F, Devauchelle C, Corel E. Automatic detection of anchor points for multiple sequence alignment. BMC Bioinformatics 2010;11:445. [PMID: 20813050 PMCID: PMC2942857 DOI: 10.1186/1471-2105-11-445] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2009] [Accepted: 09/02/2010] [Indexed: 11/18/2022] Open

Finn S, Civetta A. Sexual selection and the molecular evolution of ADAM proteins. J Mol Evol 2010;71:231-40. [PMID: 20730583 DOI: 10.1007/s00239-010-9382-7] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2010] [Accepted: 08/09/2010] [Indexed: 12/12/2022]

Frölich D, Giesecke C, Mei HE, Reiter K, Daridon C, Lipsky PE, Dörner T. Secondary immunization generates clonally related antigen-specific plasma cells and memory B cells. THE JOURNAL OF IMMUNOLOGY 2010;185:3103-10. [PMID: 20693426 DOI: 10.4049/jimmunol.1000911] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]

Simmons MP, Müller KF, Webb CT. The deterministic effects of alignment bias in phylogenetic inference. Cladistics 2010;27:402-416. [DOI: 10.1111/j.1096-0031.2010.00333.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

Subramanian AR, Hiran S, Steinkamp R, Meinicke P, Corel E, Morgenstern B. DIALIGN-TX and multiple protein alignment using secondary structure information at GOBICS. Nucleic Acids Res 2010;38:W19-22. [PMID: 20497995 PMCID: PMC2896137 DOI: 10.1093/nar/gkq442] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2010] [Revised: 05/04/2010] [Accepted: 05/09/2010] [Indexed: 12/29/2022] Open

Affiliation(s)

Amarendran R. Subramanian, Suvrat Hiran, Rasmus Steinkamp, Peter Meinicke, Eduardo Corel, Burkhard Morgenstern: Wilhelm-Schickard-Institut für Informatik, University of Tübingen, Sand 13, 72076 Tübingen; Institute of Microbiology and Genetics, University of Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany; and Department of Mathematics, Indian Institute of Technology, Kharagpur, 721 302, India

A min-cut algorithm for the consistency problem in multiple sequence alignment. Bioinformatics 2010;26:1015-21. [DOI: 10.1093/bioinformatics/btq082] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Mount DW. Comparing programs and methods to use for global multiple sequence alignment. Cold Spring Harb Protoc 2010;2009:pdb.ip61. [PMID: 20147201 DOI: 10.1101/pdb.ip61] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Kemena C, Notredame C. Upcoming challenges for multiple sequence alignment methods in the high-throughput era. Bioinformatics 2009;25:2455-65. [PMID: 19648142 PMCID: PMC2752613 DOI: 10.1093/bioinformatics/btp452] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2009] [Revised: 06/24/2009] [Accepted: 07/16/2009] [Indexed: 12/22/2022] Open

Taheri J, Zomaya AY. RBT-GA: a novel metaheuristic for solving the Multiple Sequence Alignment problem. BMC Genomics 2009;10 Suppl 1:S10. [PMID: 19594869 PMCID: PMC2709253 DOI: 10.1186/1471-2164-10-s1-s10] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Fast statistical alignment. PLoS Comput Biol 2009;5:e1000392. [PMID: 19478997 PMCID: PMC2684580 DOI: 10.1371/journal.pcbi.1000392] [Citation(s) in RCA: 230] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2008] [Accepted: 04/20/2009] [Indexed: 02/01/2023] Open

Abstract

We describe a new program for the alignment of multiple biological sequences that is both statistically motivated and fast enough for problem sizes that arise in practice. Our Fast Statistical Alignment program is based on pair hidden Markov models which approximate an insertion/deletion process on a tree and uses a sequence annealing algorithm to combine the posterior probabilities estimated from these models into a multiple alignment. FSA uses its explicit statistical model to produce multiple alignments which are accompanied by estimates of the alignment accuracy and uncertainty for every column and character of the alignment—previously available only with alignment programs which use computationally-expensive Markov Chain Monte Carlo approaches—yet can align thousands of long sequences. Moreover, FSA utilizes an unsupervised query-specific learning procedure for parameter estimation which leads to improved accuracy on benchmark reference alignments in comparison to existing programs. The centroid alignment approach taken by FSA, in combination with its learning procedure, drastically reduces the amount of false-positive alignment on biological data in comparison to that given by other methods. The FSA program and a companion visualization tool for exploring uncertainty in alignments can be used via a web interface at http://orangutan.math.berkeley.edu/fsa/, and the source code is available at http://fsa.sourceforge.net/.

Biological sequence alignment is one of the fundamental problems in comparative genomics, yet it remains unsolved. Over sixty sequence alignment programs are listed on Wikipedia, and many new programs are published every year. However, many popular programs suffer from pathologies such as aligning unrelated sequences and producing discordant alignments in protein (amino acid) and codon (nucleotide) space, casting doubt on the accuracy of the inferred alignments. Inaccurate alignments can introduce large and unknown systematic biases into downstream analyses such as phylogenetic tree reconstruction and substitution rate estimation. We describe a new program for multiple sequence alignment which can align protein, RNA and DNA sequence and improves on the accuracy of existing approaches on benchmarks of protein and RNA structural alignments and simulated mammalian and fly genomic alignments. Our approach, which seeks to find the alignment which is closest to the truth under our statistical model, leaves unrelated sequences largely unaligned and produces concordant alignments in protein and codon space. It is fast enough for difficult problems such as aligning orthologous genomic regions or aligning hundreds or thousands of proteins. It furthermore has a companion GUI for visualizing the estimated alignment reliability.

Misof B, Misof K. A Monte Carlo approach successfully identifies randomness in multiple sequence alignments: a more objective means of data exclusion. Syst Biol 2009;58:21-34. [PMID: 20525566 DOI: 10.1093/sysbio/syp006] [Citation(s) in RCA: 233] [Impact Index Per Article: 15.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Katoh K, Asimenos G, Toh H. Multiple alignment of DNA sequences with MAFFT. Methods Mol Biol 2009;537:39-64. [PMID: 19378139 DOI: 10.1007/978-1-59745-251-9_3] [Citation(s) in RCA: 868] [Impact Index Per Article: 57.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/14/2023]

Wagner H, Morgenstern B, Dress A. Stability of multiple alignments and phylogenetic trees: an analysis of ABC-transporter proteins family. Algorithms Mol Biol 2008;3:15. [PMID: 18990223 PMCID: PMC2637874 DOI: 10.1186/1748-7188-3-15] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2008] [Accepted: 11/06/2008] [Indexed: 11/17/2022] Open

Banerjee N, Sarani R, Ranjani CV, Sowmiya G, Michael D, Balakrishnan N, Sekar K. Algorithm to find distant repeats in a single protein sequence. Bioinformation 2008;3:28-32. [PMID: 19052663 PMCID: PMC2586129 DOI: 10.6026/97320630003028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2008] [Accepted: 07/24/2008] [Indexed: 11/23/2022] Open

Lu Y, Sze SH. Multiple Sequence Alignment Based on Profile Alignment of Intermediate Sequences. J Comput Biol 2008;15:767-77. [PMID: 18662101 DOI: 10.1089/cmb.2007.0132] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Multiple sequence alignment by conformational space annealing. Biophys J 2008;95:4813-9. [PMID: 18689453 DOI: 10.1529/biophysj.108.129684] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Subramanian AR, Kaufmann M, Morgenstern B. DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. Algorithms Mol Biol 2008;3:6. [PMID: 18505568 PMCID: PMC2430965 DOI: 10.1186/1748-7188-3-6] [Citation(s) in RCA: 139] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2008] [Accepted: 05/27/2008] [Indexed: 11/10/2022] Open

Abstract

Background

DIALIGN-T is a reimplementation of the multiple-alignment program DIALIGN. Due to several algorithmic improvements, it produces significantly better alignments on locally and globally related sequence sets than previous versions of DIALIGN. However, like the original implementation of the program, DIALIGN-T uses a straightforward greedy approach to assemble multiple alignments from local pairwise sequence similarities. Such greedy approaches may be vulnerable to spurious random similarities and can therefore lead to suboptimal results. In this paper, we present DIALIGN-TX, a substantial improvement of DIALIGN-T that combines our previous greedy algorithm with a progressive alignment approach.

Results

Our new heuristic produces significantly better alignments, especially on globally related sequences, without excessively increasing CPU time and memory consumption. The new method is based on a guide tree; to detect possible spurious sequence similarities, it employs a vertex-cover approximation on a conflict graph. We performed benchmarking tests on a large set of nucleic acid and protein sequences. For protein benchmarks we used the benchmark database BALIBASE 3 and an updated release of the database IRMBASE 2 for assessing the quality on globally and locally related sequences, respectively. For alignment of nucleic acid sequences, we used BRAliBase II for global alignment and a newly developed database of locally related sequences called DIRMBASE 1. IRMBASE 2 and DIRMBASE 1 are constructed by implanting highly conserved motifs at random positions in long unalignable sequences.

Conclusion

On BALIBASE 3, our new program performs significantly better than the previous program DIALIGN-T and outperforms the popular global aligner CLUSTAL W, though it is still outperformed by programs that focus on global alignment like MAFFT, MUSCLE and T-COFFEE. On the locally related test sets in IRMBASE 2 and DIRMBASE 1, our method outperforms all other programs, while MAFFT E-INS-i is the only method that comes close to the performance of DIALIGN-TX.
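
The abstract mentions a vertex-cover approximation on a conflict graph but does not say which approximation is used. As a minimal sketch, assuming the textbook maximal-matching 2-approximation (not necessarily the DIALIGN-TX procedure), the idea can be illustrated as follows. Vertices stand for local similarities (fragments), an edge joins two fragments that cannot both appear in one alignment, and removing a vertex cover leaves a conflict-free set. The function names and the toy data are hypothetical.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def approx_vertex_cover(edges: Iterable[Edge]) -> Set[Hashable]:
    """Maximal-matching 2-approximation of a minimum vertex cover."""
    cover: Set[Hashable] = set()
    for u, v in edges:
        # An edge with neither endpoint covered joins the matching,
        # and both of its endpoints go into the cover.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


def consistent_fragments(fragments: List[Hashable],
                         conflicts: List[Edge]) -> List[Hashable]:
    """Drop every fragment in the approximate cover; the rest are conflict-free."""
    cover = approx_vertex_cover(conflicts)
    return [f for f in fragments if f not in cover]


# Hypothetical toy data: five fragments, three pairwise conflicts.
fragments = ["f1", "f2", "f3", "f4", "f5"]
conflicts = [("f1", "f2"), ("f2", "f3"), ("f3", "f4")]

cover = approx_vertex_cover(conflicts)
kept = consistent_fragments(fragments, conflicts)
print(sorted(cover), kept)
```

On this toy input the sketch removes four fragments although removing `f2` and `f3` would suffice, which shows the factor-of-two bound of this particular approximation. A production aligner would presumably weight fragments by their scores before deciding which to discard.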

Simossis V, Kleinjung J, Heringa J. An overview of multiple sequence alignment. CURRENT PROTOCOLS IN BIOINFORMATICS 2008;Chapter 3:3.7.1-3.7.26. [PMID: 18428699 DOI: 10.1002/0471250953.bi0307s03] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Huang W, Nevins JR, Ohler U. Phylogenetic simulation of promoter evolution: estimation and modeling of binding site turnover events and assessment of their impact on alignment tools. Genome Biol 2008;8:R225. [PMID: 17956628 PMCID: PMC2246299 DOI: 10.1186/gb-2007-8-10-r225] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2007] [Revised: 10/20/2007] [Accepted: 10/24/2007] [Indexed: 02/07/2023] Open

Simmons MP, Cappa JJ, Archer RH, Ford AJ, Eichstedt D, Clevinger CC. Phylogeny of the Celastreae (Celastraceae) and the relationships of Catha edulis (qat) inferred from morphological characters and nuclear and plastid genes. Mol Phylogenet Evol 2008;48:745-57. [PMID: 18550389 DOI: 10.1016/j.ympev.2008.04.039] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2007] [Revised: 04/29/2008] [Accepted: 04/30/2008] [Indexed: 11/16/2022]

Pirovano W, Heringa J. Multiple sequence alignment. Methods Mol Biol 2008;452:143-61. [PMID: 18566763 DOI: 10.1007/978-1-60327-159-2_7] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]

Do CB, Katoh K. Protein multiple sequence alignment. Methods Mol Biol 2008;484:379-413. [PMID: 18592193 DOI: 10.1007/978-1-59745-398-1_25] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]

Notredame C. Recent evolutions of multiple sequence alignment algorithms. PLoS Comput Biol 2007;3:e123. [PMID: 17784778 PMCID: PMC1963500 DOI: 10.1371/journal.pcbi.0030123] [Citation(s) in RCA: 153] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

van Nimwegen E. Finding regulatory elements and regulatory motifs: a general probabilistic framework. BMC Bioinformatics 2007;8 Suppl 6:S4. [PMID: 17903285 PMCID: PMC1995539 DOI: 10.1186/1471-2105-8-s6-s4] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Jangam S, Chakraborti N. A novel method for alignment of two nucleic acid sequences using ant colony optimization and genetic algorithms. Appl Soft Comput 2007. [DOI: 10.1016/j.asoc.2006.11.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Chung YS, Lee WH, Tang CY, Lu CL. RE-MuSiC: a tool for multiple sequence alignment with regular expression constraints. Nucleic Acids Res 2007;35:W639-44. [PMID: 17488842 PMCID: PMC1933182 DOI: 10.1093/nar/gkm275] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open