1
Daniluk P, Oleniecki T, Lesyng B. DAMA: a method for computing multiple alignments of protein structures using local structure descriptors. Bioinformatics 2021; 38:80-85. [PMID: 34396393 PMCID: PMC8696102 DOI: 10.1093/bioinformatics/btab571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2020] [Revised: 05/31/2021] [Accepted: 08/12/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION The well-known fact that protein structures are more conserved than their sequences forms the basis of several areas of computational structural biology. Methods based on structure analysis provide more complete information on residue conservation in evolutionary processes. This is crucial for determining evolutionary relationships between proteins and for identifying recurrent structural patterns present in biomolecules involved in similar functions. However, algorithmic structural alignment is much more difficult than multiple sequence alignment. This study is devoted to the development and applications of DAMA, a novel and effective environment for computing and analyzing multiple structure alignments. RESULTS DAMA is based on local structural similarities, using local 3D structure descriptors, and thus accounts for the nearest-neighbor molecular environments of aligned residues. It is constrained neither by protein topology nor by its global structure. DAMA is an extension of our previous study (DEDAL), which demonstrated the applicability of local descriptors to pairwise alignment problems. Since the multiple alignment problem is NP-complete, an effective heuristic approach has been developed without imposing any artificial constraints. The alignment algorithm searches for the largest consistent ensemble of similar descriptors. The new method is capable of capturing most of the biologically significant similarities present in canonical test sets and is discriminatory enough to prevent the emergence of larger, but meaningless, solutions. Tests performed on the test sets, including protein kinases, demonstrate DAMA's capability of identifying equivalent residues, which should be very useful in discovering the biological nature of protein similarity. Performance profiles show the advantage of DAMA over other methods, in particular when using a strict similarity measure QC, which is the ratio of correctly aligned columns, and when applying the methods to more difficult cases. AVAILABILITY AND IMPLEMENTATION DAMA is available online at http://dworkowa.imdik.pan.pl/EP/DAMA. Linux binaries of the software are available upon request. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
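The QC measure mentioned in this abstract (the ratio of correctly aligned columns) can be illustrated with a minimal sketch. The abstract gives no further detail, so the representation below is an assumption: each alignment is a list of columns, each column is a tuple holding one residue index per structure (None for a gap), and a column counts as correct only if it matches a reference column exactly. The function name `column_score` and the toy data are hypothetical.

```python
from typing import List, Optional, Tuple

# One alignment column: a residue index (or None for a gap) per structure.
Column = Tuple[Optional[int], ...]


def column_score(test: List[Column], reference: List[Column]) -> float:
    """Fraction of reference columns reproduced exactly in the test alignment."""
    if not reference:
        raise ValueError("reference alignment has no columns")
    test_columns = set(test)
    matched = sum(1 for col in reference if col in test_columns)
    return matched / len(reference)


# Hypothetical three-structure alignments; they differ in the third column only.
reference = [(0, 0, 0), (1, 1, 1), (2, 2, 3), (3, 4, 4)]
test = [(0, 0, 0), (1, 1, 1), (2, 3, 3), (3, 4, 4)]

score = column_score(test, reference)  # 3 of 4 reference columns are reproduced
```

Because a single misplaced residue invalidates the whole column, a score of this kind is stricter than pairwise measures, which is consistent with the abstract calling QC a strict similarity measure.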
Affiliation(s)
- Paweł Daniluk
  - Bioinformatics Laboratory, Mossakowski Medical Research Centre, Polish Academy of Sciences, 02-106 Warsaw, Poland
- Tymoteusz Oleniecki
  - College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, 02-089 Warsaw, Poland
2
Le HT, Do PC, Le L. Grafting Methionine on 1F1 Ab Increases the Broad-Activity on HA Structural-Conserved Residues of H1, H2, and H3 Influenza A Viruses. Evol Bioinform Online 2021; 17:11769343211003082. [PMID: 33795930 PMCID: PMC7975486 DOI: 10.1177/11769343211003082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2020] [Accepted: 02/24/2021] [Indexed: 11/27/2022]
Abstract
A high level of mutation enables the influenza A virus to resist antibiotics
previously effective against the influenza A virus. A portion of the structure
of hemagglutinin HA is assumed to be well-conserved to maintain its role in
cellular fusion, and the structure tends to be more conserved than sequence. We
designed peptide inhibitors to target the conserved residues on the HA surface,
which were identified based on structural alignment. Most of the conserved and
strongly similar residues are located in the receptor-binding and esterase
regions on the HA1 domain. In a later step, fragments of anti-HA antibodies were
gathered and screened for the binding ability to the found conserved residues.
As a result, Methionine amino acid got the best docking score within the −2.8 Å
radius of Van der Waals when it is interacting with Tyrosine, Arginine, and
Glutamic acid. Then, the binding affinity and spectrum of the fragments were
enhanced by grafting hotspot amino acid into the fragments to form peptide
inhibitors. Our peptide inhibitor was able to form in silico contact with a
structurally conserved region across H1, H2, and H3 HA, with the binding site at
the boundary between HA1 and HA2 domains, spreading across different monomers,
suggesting a new target for designing broad-spectrum antibody and vaccine. This
research presents an affordable method to design broad-spectrum peptide
inhibitors using fragments of an antibody as a scaffold.
Affiliation(s)
- Hoa Thanh Le
  - School of Biotechnology, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Phuc-Chau Do
  - School of Biotechnology, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Ly Le
  - School of Biotechnology, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
  - Vingroup Big Data Institute, Hanoi, Vietnam
3
Carpentier M, Chomilier J. Protein multiple alignments: sequence-based versus structure-based programs. Bioinformatics 2020; 35:3970-3980. [PMID: 30942864 DOI: 10.1093/bioinformatics/btz236] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/04/2018] [Revised: 03/05/2019] [Accepted: 04/02/2019] [Indexed: 11/14/2022]
Abstract
MOTIVATION Multiple sequence alignment programs have proved to be very useful and have already been evaluated in the literature, but alignment programs based on structure, or on both sequence and structure, have not. In the present article we wish to evaluate the added value provided through considering structures. RESULTS We compared the multiple alignments resulting from 25 programs based on sequence, structure or both, to reference alignments deposited in five databases (BALIBASE 2 and 3, HOMSTRAD, OXBENCH and SISYPHUS). On the whole, the structure-based methods compute more reliable alignments than the sequence-based ones, and even than the sequence+structure-based programs, whatever the database. Two programs lead, MAMMOTH and MATRAS; nevertheless, the performance of MUSTANG, MATT, 3DCOMB, TCOFFEE+TM_ALIGN and TCOFFEE+SAP is better for some alignments. The advantage of structure-based methods increases at low levels of sequence identity, and for residues that are buried or in regular secondary structures. Concerning gap management, sequence-based programs set fewer gaps than structure-based programs. Concerning the databases, the alignments of the manually built databases are more challenging for the programs. AVAILABILITY AND IMPLEMENTATION All data and results presented in this study are available at: http://wwwabi.snv.jussieu.fr/people/mathilde/download/AliMulComp/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mathilde Carpentier
  - Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France
- Jacques Chomilier
  - Sorbonne Université, MNHN, CNRS, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), BiBiP, Paris, France
4
Dong R, Peng Z, Zhang Y, Yang J. mTM-align: an algorithm for fast and accurate multiple protein structure alignment. Bioinformatics 2019; 34:1719-1725. [PMID: 29281009 DOI: 10.1093/bioinformatics/btx828] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 07/29/2017] [Accepted: 12/20/2017] [Indexed: 12/22/2022]
Abstract
Motivation As protein structure is more conserved than sequence during evolution, multiple structure alignment can be more informative than multiple sequence alignment, especially for distantly related proteins. With the rapid increase of the number of protein structures in the Protein Data Bank, it becomes urgent to develop efficient algorithms for multiple structure alignment. Results A new multiple structure alignment algorithm (mTM-align) was proposed, which is an extension of the highly efficient pairwise structure alignment program TM-align. The algorithm was benchmarked on four widely used datasets, HOMSTRAD, SABmark_sup, SABmark_twi and SISY-multiple, showing that mTM-align consistently outperforms other algorithms. In addition, the comparison with the manually curated alignments in the HOMSTRAD database shows that the automated alignments built by mTM-align are in general more accurate. Therefore, mTM-align may be used as a reliable complement to construct multiple structure alignments for real-world applications. Availability and implementation http://yanglab.nankai.edu.cn/mTM-align. Contact zhng@umich.edu or yangjy@nankai.edu.cn. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Runze Dong
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Zhenling Peng
  - Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Jianyi Yang
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
5
Ritchie DW. Calculating and scoring high quality multiple flexible protein structure alignments. Bioinformatics 2016; 32:2650-8. [PMID: 27187202 DOI: 10.1093/bioinformatics/btw300] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/21/2015] [Accepted: 05/07/2016] [Indexed: 11/14/2022]
Abstract
MOTIVATION Calculating multiple protein structure alignments (MSAs) is important for understanding functional and evolutionary relationships between protein families, and for modeling protein structures by homology. While incorporating backbone flexibility promises to circumvent many of the limitations of rigid MSA algorithms, very few flexible MSA algorithms exist today. This article describes several novel improvements to the Kpax algorithm which allow high quality flexible MSAs to be calculated. This article also introduces a new Gaussian-based MSA quality measure called 'M-score', which circumvents the pitfalls of RMSD-based quality measures. RESULTS As well as calculating flexible MSAs, the new version of Kpax can also score MSAs from other aligners and from previously aligned reference datasets. Results are presented for a large-scale evaluation of the Homstrad, SABmark and SISY benchmark sets using Kpax and Matt as examples of state-of-the-art flexible aligners and 3DCOMB as an example of a state-of-the-art rigid aligner. These results demonstrate the utility of the M-score as a measure of MSA quality and show that high quality MSAs may be achieved when structural flexibility is properly taken into account. AVAILABILITY AND IMPLEMENTATION Kpax 5.0 may be downloaded for academic use at http://kpax.loria.fr/ CONTACT dave.ritchie@inria.fr SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
6
Tiwari SP, Reuter N. Similarity in Shape Dictates Signature Intrinsic Dynamics Despite No Functional Conservation in TIM Barrel Enzymes. PLoS Comput Biol 2016; 12:e1004834. [PMID: 27015412 PMCID: PMC4807811 DOI: 10.1371/journal.pcbi.1004834] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 09/03/2015] [Accepted: 02/25/2016] [Indexed: 11/19/2022]
Abstract
The conservation of the intrinsic dynamics of proteins emerges as we attempt to understand the relationship between sequence, structure and functional conservation. We characterise the conservation of such dynamics in a case where the structure is conserved but function differs greatly. The triosephosphate isomerase barrel fold (TBF), renowned for its 8 β-strand-α-helix repeats that close to form a barrel, is one of the most diverse and abundant folds found in known protein structures. Proteins with this fold have diverse enzymatic functions spanning five of six Enzyme Commission classes, and we have picked five different superfamily candidates for our analysis using elastic network models. We find that the overall shape is a large determinant in the similarity of the intrinsic dynamics, regardless of function. In particular, the β-barrel core is highly rigid, while the α-helices that flank the β-strands have greater relative mobility, allowing for the many possibilities for placement of catalytic residues. We find that these elements correlate with each other via the loops that link them, as opposed to being directly correlated. We are also able to analyse the types of motions encoded by the normal mode vectors of the α-helices. We suggest that the global conservation of the intrinsic dynamics in the TBF contributes greatly to its success as an enzymatic scaffold both through evolution and enzyme design.
Affiliation(s)
- Sandhya P. Tiwari
  - Department of Molecular Biology, University of Bergen, Pb. 7803, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Pb. 7803, Bergen, Norway
- Nathalie Reuter
  - Department of Molecular Biology, University of Bergen, Pb. 7803, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Pb. 7803, Bergen, Norway
7
Gandhimathi A, Ghosh P, Hariharaputran S, Mathew OK, Sowdhamini R. PASS2 database for the structure-based sequence alignment of distantly related SCOP domain superfamilies: update to version 5 and added features. Nucleic Acids Res 2016; 44:D410-4. [PMID: 26553811 PMCID: PMC4702857 DOI: 10.1093/nar/gkv1205] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/03/2015] [Revised: 10/16/2015] [Accepted: 10/24/2015] [Indexed: 11/12/2022]
Abstract
Structure-based sequence alignment is an essential step in assessing and analysing the relationship of distantly related proteins. PASS2 is a database that records such alignments for protein domain superfamilies and has been updated periodically. This update of PASS2, named PASS2.5, directly corresponds to the SCOPe 2.04 release. All SCOPe structural domains that share less than 40% sequence identity, as defined by the ASTRAL compendium of protein structures, are included. The current version includes 1977 superfamilies and has been assembled utilizing the structure-based sequence alignment protocol. Such an alignment is obtained initially through MATT, followed by a refinement through the COMPARER program. The JOY program has been used for structural annotations of such alignments. In this update, we have automated the protocol and focused on the inclusion of new features such as mapping of GO terms, absolutely conserved residues among the domains in a superfamily, and inclusion of PDB entries that are absent from SCOPe 2.04, using the HMM profiles from the alignments of the superfamily members, which are provided as a separate list. We have also implemented a more user-friendly manner of data presentation and options for downloading more features. The PASS2.5 version is available at http://caps.ncbs.res.in/pass2/.
Affiliation(s)
- Arumugam Gandhimathi
  - National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore 560065, Karnataka, India
- Pritha Ghosh
  - National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore 560065, Karnataka, India
- Sridhar Hariharaputran
  - National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore 560065, Karnataka, India
  - Bharathidasan University, Palkalainagar, Tiruchirapalli 620024, Tamilnadu, India
- Oommen K Mathew
  - National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore 560065, Karnataka, India
  - SASTRA University, Tirumalaisamudram, Thanjavur 613401, Tamil Nadu, India
- R Sowdhamini
  - National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore 560065, Karnataka, India
8
Brown P, Pullan W, Yang Y, Zhou Y. Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic. Bioinformatics 2015; 32:370-7. [PMID: 26454279 DOI: 10.1093/bioinformatics/btv580] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 06/30/2015] [Accepted: 10/04/2015] [Indexed: 01/24/2023]
Abstract
MOTIVATION The three dimensional tertiary structure of a protein at near atomic level resolution provides insight alluding to its function and evolution. As protein structure decides its functionality, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classifications of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. RESULTS The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. AVAILABILITY AND IMPLEMENTATION SPalignNS was implemented in C++. The source code, binary executable, and a web server version is freely available at: http://sparks-lab.org CONTACT yaoqi.zhou@griffith.edu.au.
Affiliation(s)
- Peter Brown
  - School of ICT, Griffith University, Gold Coast, QLD 4222, Australia
- Wayne Pullan
  - School of ICT, Griffith University, Gold Coast, QLD 4222, Australia
- Yuedong Yang
  - Institute for Glycomics, Griffith University, Gold Coast, QLD 4222, Australia
- Yaoqi Zhou
  - School of ICT, Griffith University, Gold Coast, QLD 4222, Australia
  - Institute for Glycomics, Griffith University, Gold Coast, QLD 4222, Australia
9
Stamm M, Forrest LR. Structure alignment of membrane proteins: Accuracy of available tools and a consensus strategy. Proteins 2015; 83:1720-32. [PMID: 26178143 DOI: 10.1002/prot.24857] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 02/25/2015] [Revised: 05/07/2015] [Accepted: 06/07/2015] [Indexed: 12/31/2022]
Abstract
Protein structure alignment methods are used for the detection of evolutionary and functionally related positions in proteins. A wide array of different methods are available, but the choice of the best method is often not apparent to the user. Several studies have assessed the alignment accuracy and consistency of structure alignment methods, but none of these explicitly considered membrane proteins, which are important targets for drug development and have distinct structural features. Here, we compared 13 widely used pairwise structural alignment methods on a test set of homologous membrane protein structures (called HOMEP3). Each pair of structures was aligned and the corresponding sequence alignment was used to construct homology models. The model accuracy compared to the known structures was assessed using scoring functions not incorporated in the tested structural alignment methods. The analysis shows that fragment-based approaches such as FR-TM-align are the most useful for aligning structures of membrane proteins. Moreover, fragment-based approaches are more suitable for comparison of protein structures that have undergone large conformational changes. Nevertheless, no method was clearly superior to all other methods. Additionally, all methods lack a measure to rate the reliability of a position within a structure alignment. To solve both of these problems, we propose a consensus-type approach, combining alignments from four different methods, namely FR-TM-align, DaliLite, MATT, and FATCAT. Agreement between the methods is used to assign confidence values to each position of the alignment. Overall, we conclude that there remains scope for the improvement of structural alignment methods for membrane proteins.
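The consensus idea described in this abstract, in which agreement between methods assigns a confidence value to each aligned position, can be sketched as follows. This is only an illustration under stated assumptions: each method's pairwise alignment is represented as a mapping from residue index in structure A to residue index in structure B, and confidence is taken to be the fraction of methods proposing a given residue pair. The function name and the four toy alignments are hypothetical, and the paper's actual scoring scheme may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple


def consensus_confidence(
    alignments: List[Dict[int, int]],
) -> Dict[Tuple[int, int], float]:
    """Map each aligned residue pair (a, b) to the fraction of methods proposing it."""
    if not alignments:
        raise ValueError("at least one alignment is required")
    counts: Counter = Counter()
    for alignment in alignments:
        counts.update(alignment.items())  # each item is an (a, b) residue pair
    n_methods = len(alignments)
    return {pair: count / n_methods for pair, count in counts.items()}


# Hypothetical outputs of four aligners for the same pair of structures.
method_outputs = [
    {0: 0, 1: 1, 2: 2},
    {0: 0, 1: 1, 2: 3},
    {0: 0, 1: 1, 2: 2},
    {0: 0, 1: 2, 2: 3},
]

confidence = consensus_confidence(method_outputs)
# Pair (0, 0) is proposed by all four methods, pair (1, 2) by only one.
```

A downstream step could then keep only pairs above a chosen confidence threshold; the threshold itself would be a further assumption not stated in the abstract.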
Affiliation(s)
- Marcus Stamm
  - Computational Structural Biology Group, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
- Lucy R Forrest
  - Computational Structural Biology Group, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
  - Computational Structural Biology Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland
10
Tiwari SP, Fuglebakk E, Hollup SM, Skjærven L, Cragnolini T, Grindhaug SH, Tekle KM, Reuter N. WEBnm@ v2.0: Web server and services for comparing protein flexibility. BMC Bioinformatics 2014; 15:427. [PMID: 25547242 PMCID: PMC4339738 DOI: 10.1186/s12859-014-0427-6] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 09/13/2014] [Accepted: 12/11/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and by extension, their dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionary related proteins. RESULTS We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. CONCLUSION WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
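The abstract names the Root Mean Squared Inner Product (RMSIP) without defining it. A minimal sketch of the commonly used definition is given below, RMSIP = sqrt((1/n) * sum over i, j of (u_i . v_j)^2), assuming both mode sets contain n orthonormal vectors defined over the same aligned coordinates. The function names and the toy four-dimensional modes are hypothetical, and this is not a reproduction of the WEBnm@ implementation.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rmsip(modes_a: Sequence[Vector], modes_b: Sequence[Vector]) -> float:
    """RMSIP of two equally sized sets of orthonormal mode vectors."""
    if not modes_a or len(modes_a) != len(modes_b):
        raise ValueError("mode sets must be non-empty and of equal size")
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return math.sqrt(total / len(modes_a))


# Toy orthonormal mode sets in a four-dimensional coordinate space.
s = 1.0 / math.sqrt(2.0)
plane_xy = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)]
plane_xy_rotated = [(s, s, 0.0, 0.0), (s, -s, 0.0, 0.0)]  # same subspace, rotated
plane_zw = [(0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]   # orthogonal subspace

same = rmsip(plane_xy, plane_xy_rotated)
disjoint = rmsip(plane_xy, plane_zw)
```

The measure compares the subspaces spanned by the modes rather than individual vectors, so two mode sets spanning the same subspace score 1 even when the individual modes are rotated, and fully orthogonal subspaces score 0.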
Affiliation(s)
- Sandhya P Tiwari
  - Department of Molecular Biology, University of Bergen, Bergen, Norway.
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
- Edvin Fuglebakk
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
- Siv M Hollup
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
- Lars Skjærven
  - Department of Biomedicine, University of Bergen, Bergen, Norway.
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
- Tristan Cragnolini
  - Department of Molecular Biology, University of Bergen, Bergen, Norway.
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
  - Present address: University Chemical Laboratories, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
- Svenn H Grindhaug
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
- Kidane M Tekle
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
- Nathalie Reuter
  - Department of Molecular Biology, University of Bergen, Bergen, Norway.
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
11
Fuglebakk E, Tiwari SP, Reuter N. Comparing the intrinsic dynamics of multiple protein structures using elastic network models. Biochim Biophys Acta Gen Subj 2014; 1850:911-922. [PMID: 25267310 DOI: 10.1016/j.bbagen.2014.09.021] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 06/30/2014] [Revised: 09/15/2014] [Accepted: 09/16/2014] [Indexed: 12/15/2022]
Abstract
BACKGROUND Elastic network models (ENMs) are based on the simple idea that a protein can be described as a set of particles connected by springs, which can then be used to describe its intrinsic flexibility using, for example, normal mode analysis. Since the introduction of the first ENM by Monique Tirion in 1996, several variants using coarser protein models have been proposed and their reliability for the description of protein intrinsic dynamics has been widely demonstrated. Lately an increasing number of studies have focused on the meaning of slow dynamics for protein function and its potential conservation through evolution. This leads naturally to comparisons of the intrinsic dynamics of multiple protein structures with varying levels of similarity. SCOPE OF REVIEW We describe computational strategies for calculating and comparing intrinsic dynamics of multiple proteins using elastic network models, as well as a selection of examples from the recent literature. MAJOR CONCLUSIONS The increasing interest for comparing dynamics across protein structures with various levels of similarity, has led to the establishment and validation of reliable computational strategies using ENMs. Comparing dynamics has been shown to be a viable way for gaining greater understanding for the mechanisms employed by proteins for their function. Choices of ENM parameters, structure alignment or similarity measures will likely influence the interpretation of the comparative analysis of protein motion. GENERAL SIGNIFICANCE Understanding the relation between protein function and dynamics is relevant to the fundamental understanding of protein structure-dynamics-function relationship. This article is part of a Special Issue entitled Recent developments of molecular dynamics.
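The "particles connected by springs" picture in this abstract can be made concrete with a small sketch. It builds the connectivity (Kirchhoff) matrix used by one well-known ENM variant, the Gaussian network model: two particles are joined by a spring when they lie within a distance cutoff. The choice of this variant, the cutoff of 7.0 Å, and the idealised coordinates are assumptions made for illustration; the review covers several ENM formulations and parameter choices.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def kirchhoff_matrix(coords: Sequence[Point], cutoff: float) -> List[List[int]]:
    """Connectivity matrix: -1 for each pair within the cutoff, node degree on the diagonal."""
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1  # one spring between i and j
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Hypothetical chain of four particles spaced 3.8 Å apart along the x axis.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(chain, cutoff=7.0)  # only neighbouring particles are in contact
```

In a full analysis, the eigenvectors of such a matrix with the smallest non-zero eigenvalues describe the slowest collective motions; computing them needs an eigensolver, which is left out here to keep the sketch dependency-free.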
Affiliation(s)
- Edvin Fuglebakk
  - Department of Molecular Biology, University of Bergen, Pb. 7803, N-5020 Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Pb. 7803, N-5020 Bergen, Norway
- Sandhya P Tiwari
  - Department of Molecular Biology, University of Bergen, Pb. 7803, N-5020 Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Pb. 7803, N-5020 Bergen, Norway
- Nathalie Reuter
  - Department of Molecular Biology, University of Bergen, Pb. 7803, N-5020 Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Pb. 7803, N-5020 Bergen, Norway
12
Wang HW, Chu CH, Wang WC, Pai TW. A local average distance descriptor for flexible protein structure comparison. BMC Bioinformatics 2014; 15:95. [PMID: 24694083 PMCID: PMC3992163 DOI: 10.1186/1471-2105-15-95] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 09/21/2013] [Accepted: 03/22/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Protein structures are flexible and often show conformational changes upon binding to other molecules to exert biological functions. As protein structures correlate with characteristic functions, structure comparison allows classification and prediction of proteins of undefined functions. However, most comparison methods treat proteins as rigid bodies and cannot retrieve similarities of proteins with large conformational changes effectively. RESULTS In this paper, we propose a novel descriptor, local average distance (LAD), based on either the geodesic distances (GDs) or Euclidean distances (EDs) for pairwise flexible protein structure comparison. The proposed method was compared with 7 structural alignment methods and 7 shape descriptors on two datasets comprising hinge bending motions from the MolMovDB, and the results have shown that our method outperformed all other methods regarding retrieving similar structures in terms of precision-recall curve, retrieval success rate, R-precision, mean average precision and F1-measure. CONCLUSIONS Both ED- and GD-based LAD descriptors are effective to search deformed structures and overcome the problems of self-connection caused by a large bending motion. We have also demonstrated that the ED-based LAD is more robust than the GD-based descriptor. The proposed algorithm provides an alternative approach for blasting structure database, discovering previously unknown conformational relationships, and reorganizing protein structure classification.
Affiliation(s)
- Tun-Wen Pai
  - Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan.
13
Carvalho CS, Vlachakis D, Tsiliki G, Megalooikonomou V, Kossida S. Protein signatures using electrostatic molecular surfaces in harmonic space. PeerJ 2013; 1:e185. [PMID: 24167780 PMCID: PMC3807749 DOI: 10.7717/peerj.185] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/07/2013] [Accepted: 10/02/2013] [Indexed: 11/20/2022]
Abstract
We developed a novel method based on the Fourier analysis of protein molecular surfaces to speed up the analysis of the vast structural data generated in the post-genomic era. This method computes the power spectrum of surfaces of the molecular electrostatic potential, whose three-dimensional coordinates have been either experimentally or theoretically determined. Thus we reduce the initial three-dimensional information on the molecular surface to one-dimensional information on pairs of points separated by a fixed scale. Consequently, the similarity search in our method is computationally less demanding and significantly faster than shape comparison methods. As proof of principle, we applied our method to a training set of viral proteins that are involved in major diseases such as Hepatitis C, Dengue fever, Yellow fever, Bovine viral diarrhea and West Nile fever. The training set contains proteins of four different protein families, as well as a mammalian representative enzyme. We found that the power spectrum successfully assigns a unique signature to each protein included in our training set, thus providing a direct probe of functional similarity among proteins. The results agree with established biological data from conventional structural biochemistry analyses.
Affiliation(s)
- C. Sofia Carvalho
- Centro de Astronomia e Astrofísica da Universidade de Lisboa, Tapada da Ajuda, Lisbon, Portugal
- Research Center for Astronomy and Applied Mathematics, Academy of Athens, Athens, Greece
- Dimitrios Vlachakis
- Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Georgia Tsiliki
- Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Vasileios Megalooikonomou
- Computer Engineering and Informatics Department, School of Engineering, University of Patras, Patras, Greece
- Sophia Kossida
- Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
|
14
|
Guhsl EE, Hofstetter G, Hemmer W, Ebner C, Vieths S, Vogel L, Breiteneder H, Radauer C. Vig r 6, the cytokinin-specific binding protein from mung bean (Vigna radiata) sprouts, cross-reacts with Bet v 1-related allergens and binds IgE from birch pollen allergic patients' sera. Mol Nutr Food Res 2013; 58:625-34. [PMID: 23996905 PMCID: PMC4135424 DOI: 10.1002/mnfr.201300153] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 02/26/2013] [Revised: 07/22/2013] [Accepted: 07/24/2013] [Indexed: 11/05/2022]
Abstract
SCOPE Birch pollen associated allergy to mung bean sprouts is caused by cross-reactivity between the birch pollen allergen Bet v 1 and the mung bean allergen Vig r 1. We aimed to determine the allergenicity of the cytokinin-specific binding protein from mung bean (Vig r 6), another allergen related to Bet v 1 with only 31% sequence identity. METHODS AND RESULTS Bet v 1, Gly m 4, Vig r 1, and Vig r 6 were produced in Escherichia coli. In an ELISA, 73 and 32% of Bet v 1-sensitized birch-allergic patients' sera (n = 60) showed IgE binding to Vig r 1 and Vig r 6, respectively. Of 19 patients who reported allergic reactions or had positive prick-to-prick tests to mung bean sprouts, 79% showed IgE binding to Vig r 1 and 63% showed IgE binding to Vig r 6. Bet v 1 completely inhibited IgE binding to both mung bean allergens. Vig r 6 showed partial cross-reactivity with Vig r 1 and activated basophils sensitized with mung bean allergic patients' sera. CONCLUSION We demonstrated IgE cross-reactivity despite low sequence identity between Vig r 6 and other Bet v 1-related allergens. Thus, IgE binding to Vig r 6 may contribute to birch pollinosis-associated mung bean sprout allergy.
Affiliation(s)
- Eva Elisabeth Guhsl
- Department of Pathophysiology and Allergy Research, Medical University of Vienna, Austria
|
15
|
Topham CM, Rouquier M, Tarrat N, André I. Adaptive Smith-Waterman residue match seeding for protein structural alignment. Proteins 2013; 81:1823-39. [DOI: 10.1002/prot.24327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/23/2013] [Revised: 04/22/2013] [Accepted: 05/15/2013] [Indexed: 12/30/2022]
Affiliation(s)
- Christopher M. Topham
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Mickaël Rouquier
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Nathalie Tarrat
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Isabelle André
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
|
16
|
Minami S, Sawada K, Chikenji G. MICAN: a protein structure alignment algorithm that can handle Multiple-chains, Inverse alignments, C(α) only models, Alternative alignments, and Non-sequential alignments. BMC Bioinformatics 2013; 14:24. [PMID: 23331634 PMCID: PMC3637537 DOI: 10.1186/1471-2105-14-24] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 09/11/2012] [Accepted: 01/08/2013] [Indexed: 11/10/2022] Open
Abstract
Background Protein pairs that have the same secondary structure packing arrangement but different topologies have attracted much attention in terms of both the evolution and the physical chemistry of protein structures. Further investigation of such protein relationships would give us a hint as to how proteins can change their fold in the course of evolution, as well as an insight into the physico-chemical properties of secondary structure packing. For this purpose, highly accurate sequence-order-independent structure comparison methods are needed. Results We have developed a novel protein structure alignment algorithm, MICAN (a structure alignment algorithm that can handle Multiple-chain complexes, Inverse direction of secondary structures, Cα only models, Alternative alignments, and Non-sequential alignments). The algorithm was designed to identify the best structural alignment between protein pairs by disregarding the connectivity between secondary structure elements (SSEs). One of the key features of the algorithm is its use of a multiple vector representation for each SSE, which enables us to correctly treat the bent or twisted nature of long SSEs. We compared MICAN with 9 other publicly available structure alignment programs, using both reference-dependent and reference-independent evaluation methods on a variety of benchmark test sets that include both sequential and non-sequential alignments. We show that MICAN outperforms the other existing methods at reproducing reference alignments of non-sequential test sets. Further, although MICAN does not specialize in sequential structure alignment, it showed top-level performance on the sequential test sets. We also show that the MICAN program is the fastest non-sequential structure alignment program among all the programs we examined here. Conclusions MICAN is the fastest and the most accurate program among the non-sequential alignment programs we examined here. These results suggest that MICAN is a highly effective tool for automatically detecting non-trivial structural relationships of proteins, such as circular permutations and segment-swapping, many of which have so far been identified manually by human experts. The source code of MICAN is freely downloadable at http://www.tbp.cse.nagoya-u.ac.jp/MICAN.
Affiliation(s)
- Shintaro Minami
- Department of Computational Science and Engineering, Nagoya University, Nagoya 464-8603, Japan
|
17
|
Wohlers I, Andonov R, Klau GW. DALIX: optimal DALI protein structure alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:26-36. [PMID: 23702541 DOI: 10.1109/tcbb.2012.143] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 06/02/2023]
Abstract
We present a mathematical model and exact algorithm for optimally aligning protein structures using the DALI scoring model. This scoring model is based on comparing the interresidue distance matrices of proteins and is used in the popular DALI software tool, a heuristic method for protein structure alignment. Our model and algorithm extend an integer linear programming approach that has been previously applied to the related, but simpler, contact map overlap problem. To this end, we introduce a novel type of constraint that handles negative score values and relax it in a Lagrangian fashion. The new algorithm, which we call DALIX, is applicable to any distance matrix-based scoring scheme. We also review options that allow fewer pairs of interresidue distances to be considered explicitly, because their large number hinders the optimization process. Using four known data sets of varying structural similarity, we compute many provably score-optimal DALI alignments. This allowed us, for the first time, to evaluate the DALI heuristic in sound mathematical terms. The results indicate that DALI usually computes optimal or close to optimal alignments. However, we detect a subset of small proteins for which DALI fails to generate any significant alignment, although such alignments do exist.
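To make the distance-matrix scoring idea in this abstract concrete, the following is a minimal sketch of a DALI-style elastic similarity score for a given alignment. It is not taken from the DALIX paper: the function names are hypothetical, and the constants (a similarity threshold of 0.2 and a 20 Å distance envelope) are the values commonly quoted for the original DALI score and are assumed here. Note that individual terms can be negative, which is the property the abstract says needs a dedicated constraint.

```python
import math


def distance_matrix(coords):
    """All-against-all Euclidean distances between C-alpha coordinates."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def dali_style_score(coords_a, coords_b, alignment, theta=0.2, alpha=20.0):
    """Sum a DALI-style elastic score over all pairs of aligned residues.

    alignment is a list of (i, k) pairs: residue i of A matched to residue k of B.
    theta and alpha are assumed constants, not values confirmed by the abstract.
    Assumes distinct residues never share identical coordinates.
    """
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    score = 0.0
    for i, k in alignment:
        for j, l in alignment:
            if i == j:
                score += theta  # a matched residue paired with itself
                continue
            mean = 0.5 * (da[i][j] + db[k][l])
            deviation = abs(da[i][j] - db[k][l])
            # Relative deviation above theta gives a negative contribution;
            # the exponential envelope down-weights long-range pairs.
            score += (theta - deviation / mean) * math.exp(-((mean / alpha) ** 2))
    return score


straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
identity = [(0, 0), (1, 1), (2, 2)]
print(dali_style_score(straight, straight, identity))
print(dali_style_score(straight, bent, identity))
```

The score is evaluated here for a fixed alignment only; finding the alignment that maximizes it is the optimization problem the abstract addresses.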
Affiliation(s)
- Inken Wohlers
- Genominformatik, Universität Duisburg-Essen/Universitätsklinikum, Germany.
|
18
|
Wohlers I, Malod-Dognin N, Andonov R, Klau GW. CSA: comprehensive comparison of pairwise protein structure alignments. Nucleic Acids Res 2012; 40:W303-9. [PMID: 22553365 PMCID: PMC3394275 DOI: 10.1093/nar/gks362] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 01/30/2012] [Revised: 03/29/2012] [Accepted: 04/10/2012] [Indexed: 11/23/2022] Open
Abstract
CSA is a web server for the computation, evaluation and comprehensive comparison of pairwise protein structure alignments. Its exact alignment engine computes either optimal, top-scoring alignments or heuristic alignments with a quality guarantee for the inter-residue distance-based scorings of contact map overlap, PAUL, DALI and MATRAS. These and additional, uploaded alignments are compared using a number of quality measures and intuitive visualizations. CSA brings new insight into the structural relationship of the protein pairs under investigation and is a valuable tool for studying structural similarities. It is available at http://csa.project.cwi.nl.
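Of the scoring schemes named in this abstract, contact map overlap is the simplest to illustrate: it counts the residue contacts in one structure that are mapped by the alignment onto contacts in the other. The sketch below is an illustration only, not CSA's implementation; the function names and the 7.5 Å contact cutoff are assumptions.

```python
import math


def contact_map(coords, cutoff=7.5):
    """Set of residue index pairs (i < j) whose distance is within the cutoff."""
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(coords[i], coords[j]) <= cutoff
    }


def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that the alignment maps onto contacts of B.

    alignment is a list of (i, k) pairs: residue i of A matched to residue k of B.
    """
    mapping = dict(alignment)
    shared = 0
    for i, j in contacts_a:
        if i in mapping and j in mapping:
            # Sort so the lookup also works for non-sequential alignments.
            k, l = sorted((mapping[i], mapping[j]))
            if (k, l) in contacts_b:
                shared += 1
    return shared


chain = [(3.8 * t, 0.0, 0.0) for t in range(4)]
contacts = contact_map(chain)
print(contact_map_overlap(contacts, contacts, [(0, 0), (1, 1), (2, 2), (3, 3)]))
```

As with any such score, evaluating it for one alignment is cheap; the hard part, which exact engines of the kind described above address, is searching over alignments.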
Affiliation(s)
- Inken Wohlers
- Life Sciences Group, Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, The Netherlands.
|
19
|
Daniels NM, Kumar A, Cowen LJ, Menke M. Touring protein space with Matt. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:286-93. [PMID: 21464511 PMCID: PMC3355523 DOI: 10.1109/tcbb.2011.70] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 05/30/2023]
Abstract
Using the Matt structure alignment program, we take a tour of protein space, producing a hierarchical clustering scheme that divides protein structural domains into clusters based on geometric dissimilarity. While it was known that purely structural, geometric, distance-based measures of structural similarity, such as Dali/FSSP, could largely replicate hand-curated schemes such as SCOP at the family level, it was an open question as to whether any such scheme could approximate SCOP at the more distant superfamily and fold levels. We partially answer this question in the affirmative, by designing a clustering scheme based on Matt that approximately matches SCOP at the superfamily level, and demonstrates qualitative differences in performance between Matt and DaliLite. Implications for the debate over the organization of protein fold space are discussed. Based on our clustering of protein space, we introduce the Mattbench benchmark set, a new collection of structural alignments useful for testing sequence aligners on more distantly homologous proteins.
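The clustering step described in this abstract can be pictured with a generic agglomerative procedure over a matrix of pairwise structural dissimilarities. The sketch below uses average linkage with a distance cutoff; this is a swapped-in textbook technique chosen for illustration, since the abstract does not specify the linkage rule or thresholds used with Matt. The function name and the toy distances are hypothetical.

```python
def average_linkage_clusters(dist, cutoff):
    """Agglomerative clustering with average linkage.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Clusters are merged greedily until the closest pair exceeds the cutoff.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean dissimilarity over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if d > cutoff:
            break
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


# Toy dissimilarities: domains 0 and 1 are similar, as are domains 2 and 3.
toy = [
    [0.0, 1.0, 5.0, 5.0],
    [1.0, 0.0, 5.0, 5.0],
    [5.0, 5.0, 0.0, 1.0],
    [5.0, 5.0, 1.0, 0.0],
]
print(average_linkage_clusters(toy, cutoff=2.0))
```

Varying the cutoff yields coarser or finer groupings, which is loosely analogous to comparing a clustering against different levels of a hierarchy such as SCOP family and superfamily.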
Affiliation(s)
- Noah M. Daniels
- The authors are with Tufts University, 161 College Avenue, Halligan Hall Room 102, Medford, MA 02155
- Anoop Kumar
- The authors are with Tufts University, 161 College Avenue, Halligan Hall Room 102, Medford, MA 02155
- Lenore J. Cowen
- The authors are with Tufts University, 161 College Avenue, Halligan Hall Room 102, Medford, MA 02155
- Matt Menke
- The authors are with Tufts University, 161 College Avenue, Halligan Hall Room 102, Medford, MA 02155
|
20
|
Liu W, Srivastava A, Zhang J. A mathematical framework for protein structure comparison. PLoS Comput Biol 2011; 7:e1001075. [PMID: 21304929 PMCID: PMC3033361 DOI: 10.1371/journal.pcbi.1001075] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 07/14/2010] [Accepted: 01/04/2011] [Indexed: 11/29/2022] Open
Abstract
Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem, rigorous mathematical or statistical frameworks have seldom been pursued for general protein structure comparison. One notable issue in this field is that with many different distances used to measure the similarity between protein structures, none of them are proper distances when protein structures of different sequences are compared. Statistical approaches based on those non-proper distances or similarity scores as random variables are thus not mathematically rigorous. In this work, we develop a mathematical framework for protein structure comparison by treating protein structures as three-dimensional curves. Using an elastic Riemannian metric on spaces of curves, geodesic distance, a proper distance on spaces of curves, can be computed for any two protein structures. In this framework, protein structures can be treated as random variables on the shape manifold, and means and covariance can be computed for populations of protein structures. Furthermore, these moments can be used to build Gaussian-type probability distributions of protein structures for use in hypothesis testing. The covariance of a population of protein structures can reveal the population-specific variations and be helpful in improving structure classification. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions. We show that our method performs comparably with commonly used methods in protein structure classification on a large manually annotated data set. 
Protein structure comparison is important for understanding the evolutionary relationships among proteins, predicting protein functions, and predicting protein structures. Despite its importance, there have been no rigorous mathematical or statistical frameworks for protein structure comparison. One notable issue in this field is that with many different similarity measures used in comparing protein structures, none of them are proper distances when protein structures of different sequences are compared. In this study, we develop a mathematical framework for protein structure comparison by treating protein structures as three dimensional curves. A formal distance, geodesic distance, can be computed for any two protein structures. In this framework, population-specific variations within protein families can be characterized through building probability distributions for structures of protein families. The mean and covariance computed from groups of protein structures can also help to improve the classifications of protein structures. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions.
Affiliation(s)
- Wei Liu
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- Anuj Srivastava
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- * E-mail: (AS); (JZ)
- Jinfeng Zhang
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- * E-mail: (AS); (JZ)
|
21
|
Wohlers I, Domingues FS, Klau GW. Towards optimal alignment of protein structure distance matrices. Bioinformatics 2010; 26:2273-80. [PMID: 20639543 DOI: 10.1093/bioinformatics/btq420] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022] Open
Affiliation(s)
- Inken Wohlers
- CWI, Life Sciences Group, Amsterdam, The Netherlands.
|