Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Andreeva A, Prlić A, Hubbard TJP, Murzin AG. SISYPHUS--structural alignments for proteins with non-trivial relationships. Nucleic Acids Res 2006;35:D253-9. [PMID: 17068077 PMCID: PMC1635320 DOI: 10.1093/nar/gkl746] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

For:	Andreeva A, Prlić A, Hubbard TJP, Murzin AG. SISYPHUS--structural alignments for proteins with non-trivial relationships. Nucleic Acids Res 2006;35:D253-9. [PMID: 17068077 PMCID: PMC1635320 DOI: 10.1093/nar/gkl746] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Number

Cited by Other Article(s)

Cretin G, Périn C, Zimmermann N, Galochkina T, Gelly JC. ICARUS: flexible protein structural alignment based on Protein Units. Bioinformatics 2023;39:btad459. [PMID: 37498544 PMCID: PMC10400377 DOI: 10.1093/bioinformatics/btad459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Revised: 07/04/2023] [Accepted: 07/26/2023] [Indexed: 07/28/2023] Open

Jayaraman V, Toledo‐Patiño S, Noda‐García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022;31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]

Daniluk P, Oleniecki T, Lesyng B. DAMA: a method for computing multiple alignments of protein structures using local structure descriptors. Bioinformatics 2021;38:80-85. [PMID: 34396393 PMCID: PMC8696102 DOI: 10.1093/bioinformatics/btab571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2020] [Revised: 05/31/2021] [Accepted: 08/12/2021] [Indexed: 02/03/2023] Open

Abstract

MOTIVATION

The well-known fact that protein structures are more conserved than their sequences forms the basis of several areas of computational structural biology. Methods based on the structure analysis provide more complete information on residue conservation in evolutionary processes. This is crucial for the determination of evolutionary relationships between proteins and for the identification of recurrent structural patterns present in biomolecules involved in similar functions. However, algorithmic structural alignment is much more difficult than multiple sequence alignment. This study is devoted to the development and applications of DAMA-a novel effective environment capable to compute and analyze multiple structure alignments.

RESULTS

DAMA is based on local structural similarities, using local 3D structure descriptors and thus accounts for nearest-neighbor molecular environments of aligned residues. It is constrained neither by protein topology nor by its global structure. DAMA is an extension of our previous study (DEDAL) which demonstrated the applicability of local descriptors to pairwise alignment problems. Since the multiple alignment problem is NP-complete, an effective heuristic approach has been developed without imposing any artificial constraints. The alignment algorithm searches for the largest, consistent ensemble of similar descriptors. The new method is capable to capture most of the biologically significant similarities present in canonical test sets and is discriminatory enough to prevent the emergence of larger, but meaningless, solutions. Tests performed on the test sets, including protein kinases, demonstrate DAMA's capability of identifying equivalent residues, which should be very useful in discovering the biological nature of proteins similarity. Performance profiles show the advantage of DAMA over other methods, in particular when using a strict similarity measure QC, which is the ratio of correctly aligned columns, and when applying the methods to more difficult cases.

AVAILABILITY AND IMPLEMENTATION

DAMA is available online at http://dworkowa.imdik.pan.pl/EP/DAMA. Linux binaries of the software are available upon request.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Talibart H, Coste F. PPalign: optimal alignment of Potts models representing proteins with direct coupling information. BMC Bioinformatics 2021;22:317. [PMID: 34112081 PMCID: PMC8191105 DOI: 10.1186/s12859-021-04222-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2020] [Accepted: 05/25/2021] [Indexed: 11/29/2022] Open

Abstract

Background

To assign structural and functional annotations to the ever increasing amount of sequenced proteins, the main approach relies on sequence-based homology search methods, e.g. BLAST or the current state-of-the-art methods based on profile Hidden Markov Models, which rely on significant alignments of query sequences to annotated proteins or protein families. While powerful, these approaches do not take coevolution between residues into account. Taking advantage of recent advances in the field of contact prediction, we propose here to represent proteins by Potts models, which model direct couplings between positions in addition to positional composition, and to compare proteins by aligning these models. Due to non-local dependencies, the problem of aligning Potts models is hard and remains the main computational bottleneck for their use.

Methods

We introduce here an Integer Linear Programming formulation of the problem and PPalign, a program based on this formulation, to compute the optimal pairwise alignment of Potts models representing proteins in tractable time. The approach is assessed with respect to a non-redundant set of reference pairwise sequence alignments from SISYPHUS benchmark which have lowest sequence identity (between \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$3\%$$\end{document}3% and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$20\%$$\end{document}20%) and enable to build reliable Potts models for each sequence to be aligned. This experimentation confirms that Potts models can be aligned in reasonable time (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1'37''$$\end{document}1′37′′ in average on these alignments). The contribution of couplings is evaluated in comparison with HHalign and independent-site PPalign. Although Potts models were not fully optimized for alignment purposes and simple gap scores were used, PPalign yields a better mean \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F_1$$\end{document}F1 score and finds significantly better alignments than HHalign and PPalign without couplings in some cases.

Conclusions

These results show that pairwise couplings from protein Potts models can be used to improve the alignment of remotely related protein sequences in tractable time. Our experimentation suggests yet that new research on the inference of Potts models is now needed to make them more comparable and suitable for homology search. We think that PPalign’s guaranteed optimality will be a powerful asset to perform unbiased investigations in this direction.

Collapse

Le HT, Do PC, Le L. Grafting Methionine on 1F1 Ab Increases the Broad-Activity on HA Structural-Conserved Residues of H1, H2, and H3 Influenza a Viruses. Evol Bioinform Online 2021;17:11769343211003082. [PMID: 33795930 PMCID: PMC7975486 DOI: 10.1177/11769343211003082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Accepted: 02/24/2021] [Indexed: 11/27/2022] Open

Searching protein space for ancient sub-domain segments. Curr Opin Struct Biol 2021;68:105-112. [PMID: 33476896 DOI: 10.1016/j.sbi.2020.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2020] [Accepted: 11/29/2020] [Indexed: 01/08/2023]

Carpentier M, Chomilier J. Protein multiple alignments: sequence-based versus structure-based programs. Bioinformatics 2020;35:3970-3980. [PMID: 30942864 DOI: 10.1093/bioinformatics/btz236] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2018] [Revised: 03/05/2019] [Accepted: 04/02/2019] [Indexed: 11/14/2022] Open

Rozewicki J, Li S, Amada KM, Standley DM, Katoh K. MAFFT-DASH: integrated protein sequence and structural alignment. Nucleic Acids Res 2019;47:W5-W10. [PMID: 31062021 PMCID: PMC6602451 DOI: 10.1093/nar/gkz342] [Citation(s) in RCA: 242] [Impact Index Per Article: 48.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2019] [Revised: 04/07/2019] [Accepted: 04/25/2019] [Indexed: 12/22/2022] Open

Nute M, Saleh E, Warnow T. Evaluating Statistical Multiple Sequence Alignment in Comparison to Other Alignment Methods on Protein Data Sets. Syst Biol 2019;68:396-411. [PMID: 30329135 PMCID: PMC6472439 DOI: 10.1093/sysbio/syy068] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2018] [Revised: 09/27/2018] [Accepted: 10/11/2018] [Indexed: 01/15/2023] Open

RUPEE: A fast and accurate purely geometric protein structure search. PLoS One 2019;14:e0213712. [PMID: 30875409 PMCID: PMC6420038 DOI: 10.1371/journal.pone.0213712] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2018] [Accepted: 02/27/2019] [Indexed: 11/19/2022] Open

Alva V, Söding J, Lupas AN. A vocabulary of ancient peptides at the origin of folded proteins. eLife 2015;4:e09410. [PMID: 26653858 PMCID: PMC4739770 DOI: 10.7554/elife.09410] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2015] [Accepted: 12/13/2015] [Indexed: 01/01/2023] Open

Terashi G, Takeda-Shitaka M. CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area. PLoS One 2015;10:e0141440. [PMID: 26502070 PMCID: PMC4621035 DOI: 10.1371/journal.pone.0141440] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2015] [Accepted: 10/08/2015] [Indexed: 12/26/2022] Open

Abstract

Proteins are flexible, and this flexibility has an essential functional role. Flexibility can be observed in loop regions, rearrangements between secondary structure elements, and conformational changes between entire domains. However, most protein structure alignment methods treat protein structures as rigid bodies. Thus, these methods fail to identify the equivalences of residue pairs in regions with flexibility. In this study, we considered that the evolutionary relationship between proteins corresponds directly to the residue–residue physical contacts rather than the three-dimensional (3D) coordinates of proteins. Thus, we developed a new protein structure alignment method, contact area-based alignment (CAB-align), which uses the residue–residue contact area to identify regions of similarity. The main purpose of CAB-align is to identify homologous relationships at the residue level between related protein structures. The CAB-align procedure comprises two main steps: First, a rigid-body alignment method based on local and global 3D structure superposition is employed to generate a sufficient number of initial alignments. Then, iterative dynamic programming is executed to find the optimal alignment. We evaluated the performance and advantages of CAB-align based on four main points: (1) agreement with the gold standard alignment, (2) alignment quality based on an evolutionary relationship without 3D coordinate superposition, (3) consistency of the multiple alignments, and (4) classification agreement with the gold standard classification. Comparisons of CAB-align with other state-of-the-art protein structure alignment methods (TM-align, FATCAT, and DaliLite) using our benchmark dataset showed that CAB-align performed robustly in obtaining high-quality alignments and generating consistent multiple alignments with high coverage and accuracy rates, and it performed extremely well when discriminating between homologous and nonhomologous pairs of proteins in both single and multi-domain comparisons. The CAB-align software is freely available to academic users as stand-alone software at http://www.pharm.kitasato-u.ac.jp/bmd/bmd/Publications.html.

Collapse

Goncearenco A, Berezovsky IN. Protein function from its emergence to diversity in contemporary proteins. Phys Biol 2015;12:045002. [PMID: 26057563 DOI: 10.1088/1478-3975/12/4/045002] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]

Global view of the protein universe. Proc Natl Acad Sci U S A 2014;111:11691-6. [PMID: 25071170 DOI: 10.1073/pnas.1403395111] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Kopec KO, Lupas AN. β-Propeller blades as ancestral peptides in protein evolution. PLoS One 2013;8:e77074. [PMID: 24143202 PMCID: PMC3797127 DOI: 10.1371/journal.pone.0077074] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2013] [Accepted: 09/05/2013] [Indexed: 12/04/2022] Open

Abstract

Proteins of the β-propeller fold are ubiquitous in nature and widely used as structural scaffolds for ligand binding and enzymatic activity. This fold comprises between four and twelve four-stranded β-meanders, the so called blades that are arranged circularly around a central funnel-shaped pore. Despite the large size range of β-propellers, their blades frequently show sequence similarity indicative of a common ancestry and it has been proposed that the majority of β-propellers arose divergently by amplification and diversification of an ancestral blade. Given the structural versatility of β-propellers and the hypothesis that the first folded proteins evolved from a simpler set of peptides, we investigated whether this blade may have given rise to other folds as well. Using sequence comparisons, we identified proteins of four other folds as potential homologs of β-propellers: the luminal domain of inositol-requiring enzyme 1 (IRE1-LD), type II β-prisms, β-pinwheels, and WW domains. Because, with increasing evolutionary distance and decreasing sequence length, the statistical significance of sequence comparisons becomes progressively harder to distinguish from the background of convergent similarities, we complemented our analyses with a new method that evaluates possible homology based on the correlation between sequence and structure similarity. Our results indicate a homologous relationship of IRE1-LD and type II β-prisms with β-propellers, and an analogous one for β-pinwheels and WW domains. Whereas IRE1-LD most likely originated by fold-changing mutations from a fully formed PQQ motif β-propeller, type II β-prisms originated by amplification and differentiation of a single blade, possibly also of the PQQ type. We conclude that both β-propellers and type II β-prisms arose by independent amplification of a blade-sized fragment, which represents a remnant of an ancient peptide world.

Collapse

Topham CM, Rouquier M, Tarrat N, André I. Adaptive Smith-Waterman residue match seeding for protein structural alignment. Proteins 2013;81:1823-39. [DOI: 10.1002/prot.24327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2013] [Revised: 04/22/2013] [Accepted: 05/15/2013] [Indexed: 12/30/2022]

Kolodny R, Pereyaslavets L, Samson AO, Levitt M. On the Universe of Protein Folds. Annu Rev Biophys 2013;42:559-82. [DOI: 10.1146/annurev-biophys-083012-130432] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Minami S, Sawada K, Chikenji G. MICAN: a protein structure alignment algorithm that can handle Multiple-chains, Inverse alignments, C(α) only models, Alternative alignments, and Non-sequential alignments. BMC Bioinformatics 2013;14:24. [PMID: 23331634 PMCID: PMC3637537 DOI: 10.1186/1471-2105-14-24] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2012] [Accepted: 01/08/2013] [Indexed: 11/10/2022] Open

Abstract

Background

Protein pairs that have the same secondary structure packing arrangement but have different topologies have attracted much attention in terms of both evolution and physical chemistry of protein structures. Further investigation of such protein relationships would give us a hint as to how proteins can change their fold in the course of evolution, as well as a insight into physico-chemical properties of secondary structure packing. For this purpose, highly accurate sequence order independent structure comparison methods are needed.

Results

We have developed a novel protein structure alignment algorithm, MICAN (a structure alignment algorithm that can handle Multiple-chain complexes, Inverse direction of secondary structures, C_α only models, Alternative alignments, and Non-sequential alignments). The algorithm was designed so as to identify the best structural alignment between protein pairs by disregarding the connectivity between secondary structure elements (SSE). One of the key feature of the algorithm is utilizing the multiple vector representation for each SSE, which enables us to correctly treat bent or twisted nature of long SSE. We compared MICAN with other 9 publicly available structure alignment programs, using both reference-dependent and reference-independent evaluation methods on a variety of benchmark test sets which include both sequential and non-sequential alignments. We show that MICAN outperforms the other existing methods for reproducing reference alignments of non-sequential test sets. Further, although MICAN does not specialize in sequential structure alignment, it showed the top level performance on the sequential test sets. We also show that MICAN program is the fastest non-sequential structure alignment program among all the programs we examined here.

Conclusions

MICAN is the fastest and the most accurate program among non-sequential alignment programs we examined here. These results suggest that MICAN is a highly effective tool for automatically detecting non-trivial structural relationships of proteins, such as circular permutations and segment-swapping, many of which have been identified manually by human experts so far. The source code of MICAN is freely download-able at http://www.tbp.cse.nagoya-u.ac.jp/MICAN.

Collapse

Wohlers I, Andonov R, Klau GW. DALIX: optimal DALI protein structure alignment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013;10:26-36. [PMID: 23702541 DOI: 10.1109/tcbb.2012.143] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]

Arriagada M, Poleksic A. On the difference in quality between current heuristic and optimal solutions to the protein structure alignment problem. BIOMED RESEARCH INTERNATIONAL 2012;2013:459248. [PMID: 23509725 PMCID: PMC3591119 DOI: 10.1155/2013/459248] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/10/2012] [Accepted: 11/02/2012] [Indexed: 11/17/2022]

Shih HH, Tu C, Cao W, Klein A, Ramsey R, Fennell BJ, Lambert M, Ní Shúilleabháin D, Autin B, Kouranova E, Laxmanan S, Braithwaite S, Wu L, Ait-Zahra M, Milici AJ, Dumin JA, LaVallie ER, Arai M, Corcoran C, Paulsen JE, Gill D, Cunningham O, Bard J, Mosyak L, Finlay WJJ. An ultra-specific avian antibody to phosphorylated tau protein reveals a unique mechanism for phosphoepitope recognition. J Biol Chem 2012;287:44425-34. [PMID: 23148212 DOI: 10.1074/jbc.m112.415935] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

Wohlers I, Malod-Dognin N, Andonov R, Klau GW. CSA: comprehensive comparison of pairwise protein structure alignments. Nucleic Acids Res 2012;40:W303-9. [PMID: 22553365 PMCID: PMC3394275 DOI: 10.1093/nar/gks362] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2012] [Revised: 03/29/2012] [Accepted: 04/10/2012] [Indexed: 11/23/2022] Open

Bliven S, Prlić A. Circular permutation in proteins. PLoS Comput Biol 2012;8:e1002445. [PMID: 22496628 PMCID: PMC3320104 DOI: 10.1371/journal.pcbi.1002445] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open

Kovalchuk N, Smith J, Bazanova N, Pyvovarenko T, Singh R, Shirley N, Ismagul A, Johnson A, Milligan AS, Hrmova M, Langridge P, Lopato S. Characterization of the wheat gene encoding a grain-specific lipid transfer protein TdPR61, and promoter activity in wheat, barley and rice. JOURNAL OF EXPERIMENTAL BOTANY 2012;63:2025-40. [PMID: 22213809 DOI: 10.1093/jxb/err409] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2023]

Andreeva A. Classification of proteins: available structural space for molecular modeling. Methods Mol Biol 2012;857:1-31. [PMID: 22323215 DOI: 10.1007/978-1-61779-588-6_1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]

Sun H, Sacan A, Ferhatosmanoglu H, Wang Y. Smolign: a spatial motifs-based protein multiple structural alignment method. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012;9:249-261. [PMID: 21464513 DOI: 10.1109/tcbb.2011.67] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Use of comparative genomics approaches to characterize interspecies differences in response to environmental chemicals: challenges, opportunities, and research needs. Toxicol Appl Pharmacol 2011;271:372-85. [PMID: 22142766 DOI: 10.1016/j.taap.2011.11.011] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2011] [Revised: 11/11/2011] [Accepted: 11/16/2011] [Indexed: 01/12/2023]

Abstract

A critical challenge for environmental chemical risk assessment is the characterization and reduction of uncertainties introduced when extrapolating inferences from one species to another. The purpose of this article is to explore the challenges, opportunities, and research needs surrounding the issue of how genomics data and computational and systems level approaches can be applied to inform differences in response to environmental chemical exposure across species. We propose that the data, tools, and evolutionary framework of comparative genomics be adapted to inform interspecies differences in chemical mechanisms of action. We compare and contrast existing approaches, from disciplines as varied as evolutionary biology, systems biology, mathematics, and computer science, that can be used, modified, and combined in new ways to discover and characterize interspecies differences in chemical mechanism of action which, in turn, can be explored for application to risk assessment. We consider how genetic, protein, pathway, and network information can be interrogated from an evolutionary biology perspective to effectively characterize variations in biological processes of toxicological relevance among organisms. We conclude that comparative genomics approaches show promise for characterizing interspecies differences in mechanisms of action, and further, for improving our understanding of the uncertainties inherent in extrapolating inferences across species in both ecological and human health risk assessment. To achieve long-term relevance and consistent use in environmental chemical risk assessment, improved bioinformatics tools, computational methods robust to data gaps, and quantitative approaches for conducting extrapolations across species are critically needed. Specific areas ripe for research to address these needs are recommended.

Collapse

SALEM SAEED, ZAKI MOHAMMEDJ, BYSTROFF CHRISTOPHER. ITERATIVE NON-SEQUENTIAL PROTEIN STRUCTURAL ALIGNMENT. J Bioinform Comput Biol 2011;7:571-96. [PMID: 19507290 DOI: 10.1142/s0219720009004205] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2008] [Revised: 11/05/2008] [Accepted: 11/06/2008] [Indexed: 11/18/2022]

Daniluk P, Lesyng B. A novel method to compare protein structures using local descriptors. BMC Bioinformatics 2011;12:344. [PMID: 21849047 PMCID: PMC3179968 DOI: 10.1186/1471-2105-12-344] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2011] [Accepted: 08/17/2011] [Indexed: 11/15/2022] Open

Nguyen MN, Tan KP, Madhusudhan MS. CLICK--topology-independent comparison of biomolecular 3D structures. Nucleic Acids Res 2011;39:W24-8. [PMID: 21602266 PMCID: PMC3125785 DOI: 10.1093/nar/gkr393] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2011] [Revised: 04/19/2011] [Accepted: 05/03/2011] [Indexed: 01/28/2023] Open

Rocha J, Alberich R. The significance of the ProtDeform score for structure prediction and alignment. PLoS One 2011;6:e20889. [PMID: 21738592 PMCID: PMC3125161 DOI: 10.1371/journal.pone.0020889] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2011] [Accepted: 05/12/2011] [Indexed: 11/21/2022] Open

Abstract

BACKGROUND

When a researcher uses a program to align two proteins and gets a score, one of her main concerns is how often the program gives a similar score to pairs that are or are not in the same fold. This issue was analysed in detail recently for the program TM-align with its associated TM-score. It was shown that because the TM-score is length independent, it allows a P-value and a hit probability to be defined depending only on the score. Also, it was found that the TM-scores of gapless alignments closely follow an Extreme Value Distribution (EVD). The program ProtDeform for structural protein alignment was developed recently and is characterised by the ability to propose different transformations of different protein regions. Our goal is to analyse its associated score to allow a researcher to have objective reasons to prefer one aligner over another, and carry out a better interpretation of the output.

RESULTS

The study on the ProtDeform score reveals that it is length independent in a wider score range than TM-scores and that PD-scores of gapless (random) alignments also approximately follow an EVD. On the CASP8 predictions, PD-scores and TM-scores, with respect to native structures, are highly correlated (0.95), and show that around a fifth of the predictions have a quality as low as 99.5% of the random scores. Using the Gold Standard benchmark, ProtDeform has lower probabilities of error than TM-align both at a similar speed. The analysis is extended to homology discrimination showing that, again, ProtDeform offers higher hit probabilities than TM-align. Finally, we suggest using three different P-values according to the three different contexts: Gapless alignments, optimised alignments for fold discrimination and that for superfamily discrimination. In conclusion, PD-scores are at the very least as valuable for prediction scoring as TM-scores, and on the protein classification problem, even more reliable.

Collapse

Nguyen MN, Madhusudhan MS. Biological insights from topology independent comparison of protein 3D structures. Nucleic Acids Res 2011;39:e94. [PMID: 21596786 PMCID: PMC3152366 DOI: 10.1093/nar/gkr348] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Abstract

Comparing and classifying the three-dimensional (3D) structures of proteins is of crucial importance to molecular biology, from helping to determine the function of a protein to determining its evolutionary relationships. Traditionally, 3D structures are classified into groups of families that closely resemble the grouping according to their primary sequence. However, significant structural similarities exist at multiple levels between proteins that belong to these different structural families. In this study, we propose a new algorithm, CLICK, to capture such similarities. The method optimally superimposes a pair of protein structures independent of topology. Amino acid residues are represented by the Cartesian coordinates of a representative point (usually the C^α atom), side chain solvent accessibility, and secondary structure. Structural comparison is effected by matching cliques of points. CLICK was extensively benchmarked for alignment accuracy on four different sets: (i) 9537 pair-wise alignments between two structures with the same topology; (ii) 64 alignments from set (i) that were considered to constitute difficult alignment cases; (iii) 199 pair-wise alignments between proteins with similar structure but different topology; and (iv) 1275 pair-wise alignments of RNA structures. The accuracy of CLICK alignments was measured by the average structure overlap score and compared with other alignment methods, including HOMSTRAD, MUSTANG, Geometric Hashing, SALIGN, DALI, GANGSTA⁺, FATCAT, ARTS and SARA. On average, CLICK produces pair-wise alignments that are either comparable or statistically significantly more accurate than all of these other methods. We have used CLICK to uncover relationships between (previously) unrelated proteins. These new biological insights include: (i) detecting hinge regions in proteins where domain or sub-domains show flexibility; (ii) discovering similar small molecule binding sites from proteins of different folds and (iii) discovering topological variants of known structural/sequence motifs. Our method can generally be applied to compare any pair of molecular structures represented in Cartesian coordinates as exemplified by the RNA structure superimposition benchmark.

Collapse

Venkateswaran JG, Song B, Kahveci T, Jermaine C. TRIAL: a tool for finding distant structural similarities. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011;8:819-831. [PMID: 21393655 DOI: 10.1109/tcbb.2009.28] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Goncearenco A, Berezovsky IN. Prototypes of elementary functional loops unravel evolutionary connections between protein functions. ACTA ACUST UNITED AC 2010;26:i497-503. [PMID: 20823313 PMCID: PMC2935408 DOI: 10.1093/bioinformatics/btq374] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Prlic A, Bliven S, Rose PW, Bluhm WF, Bizon C, Godzik A, Bourne PE. Pre-calculated protein structure alignments at the RCSB PDB website. Bioinformatics 2010;26:2983-5. [PMID: 20937596 PMCID: PMC3003546 DOI: 10.1093/bioinformatics/btq572] [Citation(s) in RCA: 149] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

Andreeva A, Murzin AG. Structural classification of proteins and structural genomics: new insights into protein folding and evolution. Acta Crystallogr Sect F Struct Biol Cryst Commun 2010;66:1190-7. [PMID: 20944210 PMCID: PMC2954204 DOI: 10.1107/s1744309110007177] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2010] [Accepted: 02/24/2010] [Indexed: 11/10/2022]

Wohlers I, Domingues FS, Klau GW. Towards optimal alignment of protein structure distance matrices. Bioinformatics 2010;26:2273-80. [PMID: 20639543 DOI: 10.1093/bioinformatics/btq420] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Konagurthu AS, Reboul CF, Schmidberger JW, Irving JA, Lesk AM, Stuckey PJ, Whisstock JC, Buckle AM. MUSTANG-MR structural sieving server: applications in protein structural analysis and crystallography. PLoS One 2010;5:e10048. [PMID: 20386610 PMCID: PMC2850368 DOI: 10.1371/journal.pone.0010048] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2009] [Accepted: 03/16/2010] [Indexed: 11/19/2022] Open

Abstract

BACKGROUND

A central tenet of structural biology is that related proteins of common function share structural similarity. This has key practical consequences for the derivation and analysis of protein structures, and is exploited by the process of "molecular sieving" whereby a common core is progressively distilled from a comparison of two or more protein structures. This paper reports a novel web server for "sieving" of protein structures, based on the multiple structural alignment program MUSTANG.

METHODOLOGY/PRINCIPAL FINDINGS

"Sieved" models are generated from MUSTANG-generated multiple alignment and superpositions by iteratively filtering out noisy residue-residue correspondences, until the resultant correspondences in the models are optimally "superposable" under a threshold of RMSD. This residue-level sieving is also accompanied by iterative elimination of the poorly fitting structures from the input ensemble. Therefore, by varying the thresholds of RMSD and the cardinality of the ensemble, multiple sieved models are generated for a given multiple alignment and superposition from MUSTANG. To aid the identification of structurally conserved regions of functional importance in an ensemble of protein structures, Lesk-Hubbard graphs are generated, plotting the number of residue correspondences in a superposition as a function of its corresponding RMSD. The conserved "core" (or typically active site) shows a linear trend, which becomes exponential as divergent parts of the structure are included into the superposition.

CONCLUSIONS

The application addresses two fundamental problems in structural biology: first, the identification of common substructures among structurally related proteins--an important problem in characterization and prediction of function; second, generation of sieved models with demonstrated uses in protein crystallographic structure determination using the technique of Molecular Replacement.

Collapse

Alva V, Remmert M, Biegert A, Lupas AN, Söding J. A galaxy of folds. Protein Sci 2010;19:124-30. [PMID: 19937658 DOI: 10.1002/pro.297] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]

Joerger AC, Fersht AR. The tumor suppressor p53: from structures to drug discovery. Cold Spring Harb Perspect Biol 2010;2:a000919. [PMID: 20516128 DOI: 10.1101/cshperspect.a000919] [Citation(s) in RCA: 233] [Impact Index Per Article: 16.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Berbalk C, Schwaiger CS, Lackner P. Accuracy analysis of multiple structure alignments. Protein Sci 2009;18:2027-35. [PMID: 19621383 DOI: 10.1002/pro.213] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

PAUL: protein structural alignment using integer linear programming and Lagrangian relaxation. BMC Bioinformatics 2009. [PMCID: PMC2764133 DOI: 10.1186/1471-2105-10-s13-p2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Structural relationships among proteins with different global topologies and their implications for function annotation strategies. Proc Natl Acad Sci U S A 2009;106:17377-82. [PMID: 19805138 DOI: 10.1073/pnas.0907971106] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Poleksic A. Algorithms for optimal protein structure alignment. ACTA ACUST UNITED AC 2009;25:2751-6. [PMID: 19734152 DOI: 10.1093/bioinformatics/btp530] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Micheletti C, Orland H. MISTRAL: a tool for energy-based multiple structural alignment of proteins. ACTA ACUST UNITED AC 2009;25:2663-9. [PMID: 19692555 DOI: 10.1093/bioinformatics/btp506] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Sippl MJ. Fold space unlimited. Curr Opin Struct Biol 2009;19:312-20. [DOI: 10.1016/j.sbi.2009.03.010] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2009] [Revised: 02/16/2009] [Accepted: 03/16/2009] [Indexed: 11/25/2022]

Hasegawa H, Holm L. Advances and pitfalls of protein structural alignment. Curr Opin Struct Biol 2009;19:341-8. [PMID: 19481444 DOI: 10.1016/j.sbi.2009.04.003] [Citation(s) in RCA: 303] [Impact Index Per Article: 20.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2009] [Accepted: 04/16/2009] [Indexed: 11/30/2022]

Rocha J, Segura J, Wilson RC, Dasgupta S. Flexible structural protein alignment by a sequence of local transformations. ACTA ACUST UNITED AC 2009;25:1625-31. [PMID: 19417057 PMCID: PMC2940242 DOI: 10.1093/bioinformatics/btp296] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Tidow H, Andreeva A, Rutherford TJ, Fersht AR. Solution structure of the U11-48K CHHC zinc-finger domain that specifically binds the 5' splice site of U12-type introns. Structure 2009;17:294-302. [PMID: 19217400 DOI: 10.1016/j.str.2008.11.013] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2008] [Revised: 11/24/2008] [Accepted: 11/26/2008] [Indexed: 10/21/2022]

Cradle-loop barrels and the concept of metafolds in protein classification by natural descent. Curr Opin Struct Biol 2008;18:358-65. [DOI: 10.1016/j.sbi.2008.02.006] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2007] [Accepted: 02/14/2008] [Indexed: 11/19/2022]