1
Feidakis CP, Krivak R, Hoksza D, Novotny M. AHoJ-DB: A PDB-wide Assignment of apo & holo Relationships Based on Individual Protein-Ligand Interactions. J Mol Biol 2024; 436:168545. [PMID: 38508305 DOI: 10.1016/j.jmb.2024.168545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/12/2024] [Accepted: 03/14/2024] [Indexed: 03/22/2024]
Abstract
A single protein structure is rarely sufficient to capture the conformational variability of a protein. Both bound and unbound (holo and apo) forms of a protein are essential for understanding its geometry and making meaningful comparisons. Nevertheless, docking or drug design studies often still consider only single protein structures in their holo form, which are for the most part rigid. With the recent explosion in the field of structural biology, large, curated datasets are urgently needed. Here, we use a previously developed application (AHoJ) to perform a comprehensive search for apo-holo pairs for 468,293 biologically relevant protein-ligand interactions across 27,983 proteins. In each search, the binding pocket is captured and mapped across existing structures within the same UniProt, and the mapped pockets are annotated as apo or holo, based on the presence or absence of ligands. We assemble the results into a database, AHoJ-DB (www.apoholo.cz/db), that captures the variability of proteins with identical sequences, thereby exposing the agents responsible for the observed differences in geometry. We report several metrics for each annotated pocket, and we also include binding pockets that form at the interface of multiple chains. Analysis of the database shows that about 24% of the binding sites occur at the interface of two or more chains and that less than 50% of the total binding sites processed have an apo form in the PDB. These results can be used to train and evaluate predictors, discover potentially druggable proteins, and reveal protein- and ligand-specific relationships that were previously obscured by intermittent or partial data. Availability: www.apoholo.cz/db.
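The apo/holo annotation step described in this abstract can be illustrated with a minimal sketch. It is not the AHoJ implementation: the function name `annotate_pocket`, the 4.5 Å distance cutoff, and the plain coordinate tuples are all assumptions made for illustration. A mapped pocket is labelled holo when any ligand atom lies close to any pocket atom, and apo otherwise.

```python
from math import dist

def annotate_pocket(pocket_atoms, ligand_atoms, cutoff=4.5):
    """Label a mapped pocket 'holo' or 'apo'.

    pocket_atoms, ligand_atoms: lists of (x, y, z) coordinates in angstroms.
    cutoff: hypothetical contact distance, not a value taken from AHoJ.
    """
    for pocket_atom in pocket_atoms:
        for ligand_atom in ligand_atoms:
            if dist(pocket_atom, ligand_atom) <= cutoff:
                return "holo"  # at least one ligand atom contacts the pocket
    return "apo"  # no ligand atom within the cutoff

# Toy pocket made of two atoms on the x axis.
pocket = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
near_ligand = [(3.0, 0.0, 0.0)]   # 1.5 angstroms from the second pocket atom
far_ligand = [(20.0, 0.0, 0.0)]   # well outside the cutoff

print(annotate_pocket(pocket, near_ligand))
print(annotate_pocket(pocket, far_ligand))
```

A real pipeline would also have to decide which heteroatoms count as ligands and how pockets spanning several chains are handled, both of which this sketch leaves out.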
Affiliation(s)
- Christos P Feidakis
  - Department of Cell Biology, Faculty of Science, Charles University, Prague 12843, Czech Republic
- Radoslav Krivak
  - Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague 12116, Czech Republic; Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 16000, Czech Republic
- David Hoksza
  - Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague 12116, Czech Republic
- Marian Novotny
  - Department of Cell Biology, Faculty of Science, Charles University, Prague 12843, Czech Republic
2
D3PM: a comprehensive database for protein motions ranging from residue to domain. BMC Bioinformatics 2022; 23:70. [PMID: 35164668 PMCID: PMC8845362 DOI: 10.1186/s12859-022-04595-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 02/01/2022] [Indexed: 11/24/2022] Open
Abstract
Background: Knowledge of protein motions is significant to understand its functions. While currently available databases for protein motions are mostly focused on overall domain motions, little attention is paid on local residue motions. Albeit with relatively small scale, the local residue motions, especially those residues in binding pockets, may play crucial roles in protein functioning and ligands binding. Results: A comprehensive protein motion database, namely D3PM, was constructed in this study to facilitate the analysis of protein motions. The protein motions in the D3PM range from overall structural changes of macromolecule to local flip motions of binding pocket residues. Currently, the D3PM has collected 7679 proteins with overall motions and 3513 proteins with pocket residue motions. The motion patterns are classified into 4 types of overall structural changes and 5 types of pocket residue motions. Impressively, we found that less than 15% of protein pairs have obvious overall conformational adaptations induced by ligand binding, while more than 50% of protein pairs have significant structural changes in ligand binding sites, indicating that ligand-induced conformational changes are drastic and mainly confined around ligand binding sites. Based on the residue preference in binding pocket, we classified amino acids into “pocketphilic” and “pocketphobic” residues, which should be helpful for pocket prediction and drug design. Conclusion: D3PM is a comprehensive database about protein motions ranging from residue to domain, which should be useful for exploring diverse protein motions and for understanding protein function and drug design. The D3PM is available on www.d3pharma.com/D3PM/index.php. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04595-0.
3
Inokuma Y, Inaba Y. Polyketone-Based Molecular Ropes as Versatile Components for Functional Materials. Bulletin of the Chemical Society of Japan 2021. [DOI: 10.1246/bcsj.20210223] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Affiliation(s)
- Yasuhide Inokuma
  - Division of Applied Chemistry, Faculty of Engineering, Hokkaido University, Kita-ku, Sapporo, Hokkaido 060-8628, Japan
  - Institute for Chemical Reaction Design and Discovery (ICReDD), Hokkaido University, Kita-ku, Sapporo, Hokkaido 001-0021, Japan
- Yuya Inaba
  - Division of Applied Chemistry, Faculty of Engineering, Hokkaido University, Kita-ku, Sapporo, Hokkaido 060-8628, Japan
4
Masrati G, Landau M, Ben-Tal N, Lupas A, Kosloff M, Kosinski J. Integrative Structural Biology in the Era of Accurate Structure Prediction. J Mol Biol 2021; 433:167127. [PMID: 34224746 DOI: 10.1016/j.jmb.2021.167127] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/17/2021] [Revised: 06/28/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022]
Abstract
Characterizing the three-dimensional structure of macromolecules is central to understanding their function. Traditionally, structures of proteins and their complexes have been determined using experimental techniques such as X-ray crystallography, NMR, or cryo-electron microscopy, applied individually or in an integrative manner. Meanwhile, however, computational methods for protein structure prediction have been improving their accuracy, gradually, then suddenly, with the breakthrough advance by AlphaFold2, whose models of monomeric proteins are often as accurate as experimental structures. This breakthrough foreshadows a new era of computational methods that can build accurate models for most monomeric proteins. Here, we envision how such accurate modeling methods can combine with experimental structural biology techniques, enhancing integrative structural biology. We highlight the challenges that arise when considering multiple structural conformations, protein complexes, and polymorphic assemblies. These challenges will motivate further developments, both in modeling programs and in methods to solve experimental structures, towards better and quicker investigation of structure-function relationships.
Affiliation(s)
- Gal Masrati
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Meytal Landau
  - Department of Biology, Technion-Israel Institute of Technology, Haifa 3200003, Israel; European Molecular Biology Laboratory (EMBL), Hamburg 22607, Germany
- Nir Ben-Tal
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Andrei Lupas
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Mickey Kosloff
  - Department of Human Biology, Faculty of Natural Sciences, University of Haifa, 199 Aba Khoushy Ave., Mt. Carmel, 3498838 Haifa, Israel
- Jan Kosinski
  - European Molecular Biology Laboratory (EMBL), Hamburg 22607, Germany; Centre for Structural Systems Biology (CSSB), Hamburg 22607, Germany; Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
5
Madhurima K, Nandi B, Sekhar A. Metamorphic proteins: the Janus proteins of structural biology. Open Biol 2021; 11:210012. [PMID: 33878950 PMCID: PMC8059507 DOI: 10.1098/rsob.210012] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 01/05/2023] Open
Abstract
The structural paradigm that the sequence of a protein encodes for a unique three-dimensional native fold does not acknowledge the intrinsic plasticity encapsulated in conformational free energy landscapes. Metamorphic proteins are a recently discovered class of biomolecules that illustrate this plasticity by folding into at least two distinct native state structures of comparable stability in the absence of ligands or cofactors to facilitate fold-switching. The expanding list of metamorphic proteins clearly shows that these proteins are not mere aberrations in protein evolution, but may have actually been a consequence of distinctive patterns in selection pressure such as those found in virus–host co-evolution. In this review, we describe the structure–function relationships observed in well-studied metamorphic protein systems, with specific focus on how functional residues are sequestered or exposed in the two folds of the protein. We also discuss the implications of metamorphosis for protein evolution and the efforts that are underway to predict metamorphic systems from sequence properties alone.
Affiliation(s)
- Kulkarni Madhurima
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India
- Bodhisatwa Nandi
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India
- Ashok Sekhar
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India
6
Sedova M, Jaroszewski L, Iyer M, Li Z, Godzik A. ModFlex: Towards Function Focused Protein Modeling. J Mol Biol 2021; 433:166828. [PMID: 33972023 DOI: 10.1016/j.jmb.2021.166828] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/29/2020] [Revised: 01/07/2021] [Accepted: 01/09/2021] [Indexed: 11/19/2022]
Abstract
There is a wide, and continuously widening, gap between the number of proteins known only by their amino acid sequence versus those structurally characterized by direct experiment. To close this gap, we mostly rely on homology-based inference and modeling to reason about the structures of the uncharacterized proteins by using structures of homologous proteins as templates. With the rapidly growing size of the Protein Data Bank, there are often multiple choices of templates, including multiple sets of coordinates from the same protein. The substantial conformational differences observed between different experimental structures of the same protein often reflect function related structural flexibility. Thus, depending on the questions being asked, using distant homologs, or coordinate sets with lower resolution but solved in the appropriate functional form, as templates may be more informative. The ModFlex server (https://modflex.org/) addresses this seldom mentioned gap in the standard homology modeling approach by providing the user with an interface with multiple options and tools to select the most relevant template and explore the range of structural diversity in the available templates. ModFlex is closely integrated with a range of other programs and servers developed in our group for the analysis and visualization of protein structural flexibility and divergence.
Affiliation(s)
- Mayya Sedova
  - University of California Riverside School of Medicine, Biosciences Division, Riverside, CA, United States
- Lukasz Jaroszewski
  - University of California Riverside School of Medicine, Biosciences Division, Riverside, CA, United States
- Mallika Iyer
  - Graduate School of Biomedical Sciences, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, United States
- Zhanwen Li
  - University of California Riverside School of Medicine, Biosciences Division, Riverside, CA, United States
- Adam Godzik
  - University of California Riverside School of Medicine, Biosciences Division, Riverside, CA, United States
7
De Las Rivas J, Bonavides-Martínez C, Campos-Laborie FJ. Bioinformatics in Latin America and SoIBio impact, a tale of spin-off and expansion around genomes and protein structures. Brief Bioinform 2019; 20:390-397. [PMID: 28981567 PMCID: PMC6433739 DOI: 10.1093/bib/bbx064] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/20/2017] [Revised: 04/18/2017] [Indexed: 11/30/2022] Open
Abstract
Owing to the emerging impact of bioinformatics and computational biology, in this article, we present an overview of the history and current state of the research on this field in Latin America (LA). It will be difficult to cover without inequality all the efforts, initiatives and works that have happened for the past two decades in this vast region (that includes >19 million km² and >600 million people). Despite the difficulty, we have done an analytical search looking for publications in the field made by researchers from 19 LA countries in the past 25 years. In this way, we find that research in bioinformatics in this region should develop twice to approach the average world scientific production in the field. We also found some of the pioneering scientists who initiated and led bioinformatics in the region and were promoters of this new scientific field. Our analysis also reveals that spin-off began around some specific areas within the biomolecular sciences: studies on genomes (anchored in the new generation of deep sequencing technologies, followed by developments in proteomics) and studies on protein structures (supported by three-dimensional structural determination technologies and their computational advancement). Finally, we show that the contribution to this endeavour of the Iberoamerican Society for Bioinformatics, founded in Mexico in 2009, has been significant, as it is a leading forum to join efforts of many scientists from LA interested in promoting research, training and education in bioinformatics.
Affiliation(s)
- Javier De Las Rivas
  - CSIC and Universidad de Salamanca, Bioinformatics and Functional Genomics Group, Cancer Research Center (IMBCC, CSIC/USAL/IBSAL), Salamanca, Spain
  - Corresponding author. Javier De Las Rivas, Bioinformatics and Functional Genomics Group, Cancer Research Center (IMBCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Científicas (CSIC) and Universidad de Salamanca (USAL), Campus Miguel de Unamuno s/n, Salamanca 37007, Spain. Tel.: +34 923294819; Fax: +34923294743; E-mail:
- Cesar Bonavides-Martínez
  - Universidad Nacional Autonoma de Mexico, Computational Genomics, Centro de Ciencias Genómicas, Cuernavaca, Morelos, Mexico
- Francisco Jose Campos-Laborie
  - CSIC and Universidad de Salamanca, Bioinformatics and Functional Genomics Group, Cancer Research Center (IMBCC, CSIC/USAL/IBSAL), Salamanca, Spain
8
Marks C, Shi J, Deane CM. Predicting loop conformational ensembles. Bioinformatics 2018; 34:949-956. [PMID: 29136084 DOI: 10.1093/bioinformatics/btx718] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/13/2017] [Accepted: 11/09/2017] [Indexed: 12/23/2022] Open
Abstract
Motivation: Protein function is often facilitated by the existence of multiple stable conformations. Structure prediction algorithms need to be able to model these different conformations accurately and produce an ensemble of structures that represent a target's conformational diversity rather than just a single state. Here, we investigate whether current loop prediction algorithms are capable of this. We use the algorithms to predict the structures of loops with multiple experimentally determined conformations, and the structures of loops with only one conformation, and assess their ability to generate and select decoys that are close to any, or all, of the observed structures. Results: We find that while loops with only one known conformation are predicted well, conformationally diverse loops are modelled poorly, and in most cases the predictions returned by the methods do not resemble any of the known conformers. Our results contradict the often-held assumption that multiple native conformations will be present in the decoy set, making the production of accurate conformational ensembles impossible, and hence indicating that current methodologies are not well suited to prediction of conformationally diverse, often functionally important protein regions. Contact: marks@stats.ox.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Jiye Shi
  - Department of Chemistry, UCB Pharma, Slough SL1 3WE, UK
9
Abriata LA. Structural database resources for biological macromolecules. Brief Bioinform 2017; 18:659-669. [PMID: 27273290 DOI: 10.1093/bib/bbw049] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/11/2016] [Indexed: 12/30/2022] Open
Abstract
This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often including also complex metrics precomputed by experts or external programs, and connections to sequence and functional annotation databases. Importantly, updates of most of these databases involves steps of curation and error checks based on specific expertise about the subject molecules or interactions, and removal of sequence redundancy, both leading to better data sets for mining studies compared with the full list of raw PDB entries. The article presents the databases in groups such as those aimed to facilitate browsing through PDB entries, their molecules and their general information, those built to link protein structure with sequence and dynamics, those specific for transmembrane proteins, nucleic acids, interactions of biomacromolecules with each other and with small molecules or metal ions, and those concerning specific structural features or specific protein families. A few webservers directly connected to active databases, and a few databases that have been discontinued but would be important to have back, are also briefly commented on. Along the Briefing, sample cases where these databases have been used to aid structural studies or advance our knowledge about biological macromolecules are referenced. A few specific examples are also given where using these databases is easier and more informative than using raw PDB data.
10
Chang CW, Chou CW, Chang DTH. CCProf: exploring conformational change profile of proteins. Database: The Journal of Biological Databases and Curation 2016; 2016:baw029. [PMID: 27016699 PMCID: PMC4808249 DOI: 10.1093/database/baw029] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/14/2015] [Accepted: 02/23/2016] [Indexed: 12/18/2022]
Abstract
In many biological processes, proteins have important interactions with various molecules such as proteins, ions or ligands. Many proteins undergo conformational changes upon these interactions, where regions with large conformational changes are critical to the interactions. This work presents the CCProf platform, which provides conformational changes of entire proteins, named conformational change profile (CCP) in the context. CCProf aims to be a platform where users can study potential causes of novel conformational changes. It provides 10 biological features, including conformational change, potential binding target site, secondary structure, conservation, disorder propensity, hydropathy propensity, sequence domain, structural domain, phosphorylation site and catalytic site. All these information are integrated into a well-aligned view, so that researchers can capture important relevance between different biological features visually. The CCProf contains 986 187 protein structure pairs for 3123 proteins. In addition, CCProf provides a 3D view in which users can see the protein structures before and after conformational changes as well as binding targets that induce conformational changes. All information (e.g. CCP, binding targets and protein structures) shown in CCProf, including intermediate data are available for download to expedite further analyses. Database URL: http://zoro.ee.ncku.edu.tw/ccprof/
Affiliation(s)
- Che-Wei Chang
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
- Chai-Wei Chou
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
- Darby Tien-Hao Chang
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
11
Bietz S, Rarey M. SIENA: Efficient Compilation of Selective Protein Binding Site Ensembles. J Chem Inf Model 2016; 56:248-59. [PMID: 26759067 DOI: 10.1021/acs.jcim.5b00588] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 01/13/2023]
Abstract
Structural flexibility of proteins has an important influence on molecular recognition and enzymatic function. In modeling, structure ensembles are therefore often applied as a valuable source of alternative protein conformations. However, their usage is often complicated by structural artifacts and inconsistent data annotation. Here, we present SIENA, a new computational approach for the automated assembly and preprocessing of protein binding site ensembles. Starting with an arbitrarily defined binding site in a single protein structure, SIENA searches for alternative conformations of the same or sequentially closely related binding sites. The method is based on an indexed database for identifying perfect k-mer matches and a recently published algorithm for the alignment of protein binding site conformations. Furthermore, SIENA provides a new algorithm for the interaction-based selection of binding site conformations which aims at covering all known ligand-binding geometries. Various experiments highlight that SIENA is able to generate comprehensive and well selected binding site ensembles improving the compatibility to both known and unconsidered ligand molecules. Starting with the whole PDB as data source, the computation time of the whole ensemble generation takes only a few seconds. SIENA is available via a Web service at www.zbh.uni-hamburg.de/siena.
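The idea of an indexed lookup for perfect k-mer matches can be sketched as follows. This is a generic inverted index and not SIENA's actual data structure: the function names `build_kmer_index` and `find_exact_matches`, the toy sequences, and the choice of k = 5 are assumptions made for illustration.

```python
from collections import defaultdict

def build_kmer_index(sequences, k):
    """Map every k-mer to the (sequence id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((seq_id, i))
    return index

def find_exact_matches(index, query, k):
    """Return ids of sequences sharing at least one perfect k-mer with the query."""
    hits = set()
    for i in range(len(query) - k + 1):
        for seq_id, _offset in index.get(query[i:i + k], []):
            hits.add(seq_id)
    return sorted(hits)

# Hypothetical chain sequences keyed by an arbitrary identifier.
sequences = {
    "chain_A": "MKTAYIAKQR",
    "chain_B": "GGGGTAYIAG",
    "chain_C": "PPPPPPPP",
}
index = build_kmer_index(sequences, k=5)
print(find_exact_matches(index, "TAYIA", k=5))
```

Building the index once and answering each query with dictionary lookups is what lets this kind of search scale to a large structure collection; the binding site alignment step mentioned in the abstract is outside the scope of this sketch.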
Affiliation(s)
- Stefan Bietz
  - Center for Bioinformatics, University of Hamburg, Bundesstrasse 43, 20146 Hamburg, Germany
- Matthias Rarey
  - Center for Bioinformatics, University of Hamburg, Bundesstrasse 43, 20146 Hamburg, Germany
12
Hrabe T, Li Z, Sedova M, Rotkiewicz P, Jaroszewski L, Godzik A. PDBFlex: exploring flexibility in protein structures. Nucleic Acids Res 2015; 44:D423-8. [PMID: 26615193 PMCID: PMC4702920 DOI: 10.1093/nar/gkv1316] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 08/15/2015] [Accepted: 11/10/2015] [Indexed: 12/16/2022] Open
Abstract
The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, performs analysis of their structural differences and clusters them according to their structural similarities for easy analysis. The PDBFlex contains tools and viewers enabling in-depth examination of structural variability including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software.
Affiliation(s)
- Thomas Hrabe
  - Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Zhanwen Li
  - Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Mayya Sedova
  - Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Piotr Rotkiewicz
  - Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Lukasz Jaroszewski
  - Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Adam Godzik
  - Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
13
ConTemplate Suggests Possible Alternative Conformations for a Query Protein of Known Structure. Structure 2015; 23:2162-70. [DOI: 10.1016/j.str.2015.08.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 03/31/2015] [Revised: 07/31/2015] [Accepted: 08/24/2015] [Indexed: 10/22/2022]
14
Li W, Kinch LN, Karplus PA, Grishin NV. ChSeq: A database of chameleon sequences. Protein Sci 2015; 24:1075-86. [PMID: 25970262 DOI: 10.1002/pro.2689] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 02/20/2015] [Revised: 04/15/2015] [Accepted: 04/24/2015] [Indexed: 11/11/2022]
Abstract
Chameleon sequences (ChSeqs) refer to sequence strings of identical amino acids that can adopt different conformations in protein structures. Researchers have detected and studied ChSeqs to understand the interplay between local and global interactions in protein structure formation. The different secondary structures adopted by one ChSeq challenge sequence-based secondary structure predictors. With increasing numbers of available Protein Data Bank structures, we here identify a large set of ChSeqs ranging from 6 to 10 residues in length. The homologous ChSeqs discovered highlight the structural plasticity involved in biological function. When compared with previous studies, the set of unrelated ChSeqs found represents an about 20-fold increase in the number of detected sequences, as well as an increase in the longest ChSeq length from 8 to 10 residues. We applied secondary structure predictors on our ChSeqs and found that methods based on a sequence profile outperformed methods based on a single sequence. For the unrelated ChSeqs, the evolutionary information provided by the sequence profile typically allows successful prediction of the prevailing secondary structure adopted in each protein family. Our dataset will facilitate future studies of ChSeqs, as well as interpretations of the interplay between local and nonlocal interactions. A user-friendly web interface for this ChSeq database is available at prodata.swmed.edu/chseq.
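A simplified sketch of how chameleon sequences might be detected is given below. It is not the ChSeq pipeline: the function name `find_chameleons`, the three-letter secondary structure alphabet (H for helix, E for strand, C for coil), and the requirement that a window be uniformly helix in one chain and uniformly strand in another are assumptions made for illustration.

```python
from collections import defaultdict

def find_chameleons(chains, k):
    """Return k-mers seen as all-helix in one chain and all-strand in another.

    chains: list of (sequence, secondary_structure) string pairs of equal length.
    """
    states_by_kmer = defaultdict(set)
    for seq, ss in chains:
        for i in range(len(seq) - k + 1):
            window = ss[i:i + k]
            # Keep only windows with a single, regular secondary structure state.
            if len(set(window)) == 1 and window[0] in "HE":
                states_by_kmer[seq[i:i + k]].add(window[0])
    return sorted(
        kmer for kmer, states in states_by_kmer.items() if states == {"H", "E"}
    )

# Two hypothetical chains sharing the 6-mer VLKEAI in different conformations.
chains = [
    ("GVLKEAIG", "CHHHHHHC"),
    ("TVLKEAIS", "CEEEEEEC"),
]
print(find_chameleons(chains, k=6))
```

Running the same function for each k from 6 to 10 would cover the length range mentioned in the abstract; separating homologous from unrelated chains, which the abstract also discusses, is not attempted here.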
Affiliation(s)
- Wenlin Li
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050; Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
- Lisa N Kinch
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
- P Andrew Karplus
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon, 97331
- Nick V Grishin
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050; Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050; Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
15
Acharya C, Kufareva I, Ilatovskiy AV, Abagyan R. PeptiSite: a structural database of peptide binding sites in 4D. Biochem Biophys Res Commun 2014; 445:717-23. [PMID: 24406170 DOI: 10.1016/j.bbrc.2013.12.132] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 11/28/2013] [Accepted: 12/26/2013] [Indexed: 12/11/2022]
Abstract
We developed PeptiSite, a comprehensive and reliable database of biologically and structurally characterized peptide-binding sites, in which each site is represented by an ensemble of its complexes with protein, peptide and small molecule partners. The unique features of the database include: (1) the ensemble site representation that provides a fourth dimension to the otherwise three dimensional data, (2) comprehensive characterization of the binding site architecture that may consist of a multimeric protein assembly with cofactors and metal ions and (3) analysis of consensus interaction motifs within the ensembles and identification of conserved determinants of these interactions. Currently the database contains 585 proteins with 650 peptide-binding sites. http://peptisite.ucsd.edu/ link allows searching for the sites of interest and interactive visualization of the ensembles using the ActiveICM web-browser plugin. This structural database for protein-peptide interactions enables understanding of structural principles of these interactions and may assist the development of an efficient peptide docking benchmark.
Affiliation(s)
- Chayan Acharya
- UCSD, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA 92093, USA
| | - Irina Kufareva
- UCSD, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA 92093, USA
| | - Andrey V Ilatovskiy
- UCSD, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA 92093, USA; Division of Molecular and Radiation Biophysics, Petersburg Nuclear Physics Institute, Gatchina 188300, Russia; Research and Education Center "Biophysics", PNPI and St. Petersburg State Polytechnical University, St. Petersburg 195251, Russia
| | - Ruben Abagyan
- UCSD, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA 92093, USA.
| |
|
16
|
Monzon AM, Juritz E, Fornasari MS, Parisi G. CoDNaS: a database of conformational diversity in the native state of proteins. Bioinformatics 2013; 29:2512-4. [DOI: 10.1093/bioinformatics/btt405] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022] Open
|
17
|
Palopoli N, Lanzarotti E, Parisi G. BeEP Server: Using evolutionary information for quality assessment of protein structure models. Nucleic Acids Res 2013; 41:W398-405. [PMID: 23729471 PMCID: PMC3692104 DOI: 10.1093/nar/gkt453] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022] Open
Abstract
The BeEP Server (http://www.embnet.qb.fcen.uba.ar/embnet/beep.php) is an online resource aimed to help in the endgame of protein structure prediction. It is able to rank submitted structural models of a protein through an explicit use of evolutionary information, a criterion differing from structural or energetic considerations commonly used in other assessment programs. The idea behind BeEP (Best Evolutionary Pattern) is to benefit from the substitution pattern derived from structural constraints present in a set of homologous proteins adopting a given protein conformation. The BeEP method uses a model of protein evolution that takes into account the structure of a protein to build site-specific substitution matrices. The suitability of these substitution matrices is assessed through maximum likelihood calculations from which position-specific and global scores can be derived. These scores estimate how well the structural constraints derived from each structural model are represented in a sequence alignment of homologous proteins. Our assessment on a subset of proteins from the Critical Assessment of techniques for protein Structure Prediction (CASP) experiment has shown that BeEP is capable of discriminating the models and selecting one or more native-like structures. Moreover, BeEP is not explicitly parameterized to find structural similarities between models and given targets, potentially helping to explore the conformational ensemble of the native state.
Affiliation(s)
- Nicolas Palopoli
- Departamento de Ciencia y Tecnologia, Universidad Nacional de Quilmes, B1876BXD, Bernal, Buenos Aires, Argentina, Centre for Biological Sciences, University of Southampton, SO17 1BJ, Southampton, UK and Departamento de Quimica Biologica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EHA, Buenos Aires, Argentina
| | - Esteban Lanzarotti
- Departamento de Ciencia y Tecnologia, Universidad Nacional de Quilmes, B1876BXD, Bernal, Buenos Aires, Argentina, Centre for Biological Sciences, University of Southampton, SO17 1BJ, Southampton, UK and Departamento de Quimica Biologica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EHA, Buenos Aires, Argentina
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnologia, Universidad Nacional de Quilmes, B1876BXD, Bernal, Buenos Aires, Argentina, Centre for Biological Sciences, University of Southampton, SO17 1BJ, Southampton, UK and Departamento de Quimica Biologica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EHA, Buenos Aires, Argentina
- *To whom correspondence should be addressed. Tel: +54 011 43657100 (ext. 4135); Fax: +54 011 437657101;
| |
|
18
|
Sikosek T, Bornberg-Bauer E, Chan HS. Evolutionary dynamics on protein bi-stability landscapes can potentially resolve adaptive conflicts. PLoS Comput Biol 2012; 8:e1002659. [PMID: 23028272 PMCID: PMC3441461 DOI: 10.1371/journal.pcbi.1002659] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 02/08/2012] [Accepted: 07/12/2012] [Indexed: 11/18/2022] Open
Abstract
Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed.
Proteins are essential molecules for performing a majority of functions in all biological systems. These functions often depend on the three-dimensional structures of proteins. Here, we investigate a fundamental question in molecular evolution: how can proteins acquire new advantageous structures via mutations while not sacrificing their existing structures that are still needed? Some authors have suggested that the same protein may adopt two or more alternative structures, switch between them and thus perform different functions with each of the alternative structures. Intuitively, such a protein could provide an evolutionary compromise between conflicting demands for existing and new protein structures. Yet no theoretical study has systematically tackled the biophysical basis of such compromises during evolutionary processes. Here we devise a model of evolution that specifically recognizes protein molecules that can exist in several different stable structures. Our model demonstrates that proteins can indeed utilize multiple structures to satisfy conflicting evolutionary requirements. In light of these results, we identify data from known protein structures that are consistent with our predictions and suggest novel directions for future investigation.
Affiliation(s)
- Tobias Sikosek
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
| | | | | |
|
19
|
Juritz E, Fornasari MS, Martelli PL, Fariselli P, Casadio R, Parisi G. On the effect of protein conformation diversity in discriminating among neutral and disease related single amino acid substitutions. BMC Genomics 2012; 13 Suppl 4:S5. [PMID: 22759653 PMCID: PMC3303731 DOI: 10.1186/1471-2164-13-s4-s5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Non-synonymous coding SNPs (nsSNPs) that are associated with disease can also be related to alterations in protein stability. Computational methods are available to predict the effect of single amino acid substitutions (SASs) on protein stability based on a single folded structure. However, the native state of a protein is not unique and is better represented by the ensemble of its conformers in dynamic equilibrium. The maintenance of the ensemble is essential for protein function. In this work we investigated how protein conformational diversity can affect the discrimination of neutral and disease related SASs based on protein stability estimations. For this purpose, we used 119 proteins with 803 associated SASs, 60% of which are disease related. Each protein was associated with its corresponding set of available conformers as found in the Protein Conformational Database (PCDB). Our dataset contains proteins with different extents of conformational diversity, totaling 1023 conformers.
RESULTS The existence of different conformers for a given protein introduces great variability in the estimation of the protein stability (ΔΔG) after a single amino acid substitution (SAS) as computed with FoldX. Indeed, in 35% of our protein set at least one SAS can be described as stabilizing, destabilizing or neutral when a cutoff value of ±2 kcal/mol is adopted for discriminating neutral from perturbing SASs. However, when the ΔΔG variability among conformers is taken into account, the correlation between the perturbation of protein stability and the corresponding disease or neutral phenotype increases as compared with the same analysis on single protein structures. At the conformer level, we also found that the different conformers correlate in a different way with the corresponding phenotype.
CONCLUSIONS Our results suggest that the consideration of conformational diversity can improve the discrimination of neutral and disease related protein SASs based on the evaluation of the corresponding Gibbs free energy change.
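The ±2 kcal/mol cutoff mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the conformer identifiers and the ΔΔG values are hypothetical, and the sign convention (positive ΔΔG means destabilizing) is an assumption rather than something stated in the abstract.

```python
from typing import Dict

CUTOFF = 2.0  # kcal/mol, the cutoff value quoted in the abstract


def classify_sas(ddg: float, cutoff: float = CUTOFF) -> str:
    """Label one ΔΔG estimate; assumes positive ΔΔG means destabilizing."""
    if ddg > cutoff:
        return "destabilizing"
    if ddg < -cutoff:
        return "stabilizing"
    return "neutral"


def labels_by_conformer(ddg_by_conformer: Dict[str, float]) -> Dict[str, str]:
    """Apply the cutoff to the same substitution evaluated on each conformer."""
    return {name: classify_sas(ddg) for name, ddg in ddg_by_conformer.items()}


def is_conformer_dependent(ddg_by_conformer: Dict[str, float]) -> bool:
    """True when the label of a substitution changes from one conformer to another."""
    return len(set(labels_by_conformer(ddg_by_conformer).values())) > 1


# Hypothetical ΔΔG values (kcal/mol) for one substitution on three conformers.
example = {"conformer_A": 2.6, "conformer_B": 0.3, "conformer_C": -2.4}
print(labels_by_conformer(example))
print(is_conformer_dependent(example))
```

With these made-up values the same substitution receives all three labels, which is the kind of conformer-dependent variability the RESULTS paragraph describes.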
Affiliation(s)
- Ezequiel Juritz
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| | - Maria Silvina Fornasari
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| | | | - Piero Fariselli
- Biocomputing Group, Department of Computer Science, University of Bologna, Italy
| | - Rita Casadio
- Biocomputing Group, Department of Biology, University of Bologna, Italy
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| |
|
20
|
Swapna LS, Mahajan S, de Brevern AG, Srinivasan N. Comparison of tertiary structures of proteins in protein-protein complexes with unbound forms suggests prevalence of allostery in signalling proteins. BMC Struct Biol 2012; 12:6. [PMID: 22554255 PMCID: PMC3427047 DOI: 10.1186/1472-6807-12-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 11/27/2011] [Accepted: 04/05/2012] [Indexed: 12/31/2022]
Abstract
BACKGROUND Most signalling and regulatory proteins participate in transient protein-protein interactions during biological processes. They usually serve as key regulators of various cellular processes and are often stable in both protein-bound and unbound forms. Availability of high-resolution structures of their unbound and bound forms provides an opportunity to understand the molecular mechanisms involved. In this work, we have addressed the question "What is the nature, extent, location and functional significance of structural changes which are associated with formation of protein-protein complexes?"
RESULTS A database of 76 non-redundant sets of high resolution 3-D structures of protein-protein complexes, representing diverse functions, and corresponding unbound forms, has been used in this analysis. Structural changes associated with protein-protein complexation have been investigated using structural measures and Protein Blocks description. Our study highlights that significant structural rearrangement occurs on binding at the interface as well as at regions away from the interface to form a highly specific, stable and functional complex. Notably, predominantly unaltered interfaces interact mainly with interfaces undergoing substantial structural alterations, revealing the presence of at least one structural regulatory component in every complex. Interestingly, about one-half of the complexes, consisting largely of signalling proteins, show substantial localized structural change at surfaces away from the interface. Normal mode analysis and available information on the functions of some of these complexes suggest that many of these changes are allosteric. This change is largely manifest in the proteins whose interfaces are altered upon binding, implicating structural change as the possible trigger of allosteric effect. Although large-scale studies of allostery induced by small-molecule effectors are available in the literature, this is, to our knowledge, the first study indicating the prevalence of allostery induced by protein effectors.
CONCLUSIONS The enrichment of allosteric sites in signalling proteins, whose mutations commonly lead to diseases such as cancer, provides support for the usage of allosteric modulators in combating these diseases.
Affiliation(s)
| | - Swapnil Mahajan
- Univ de la Réunion, UMR_S 665, F-97715, Saint-Denis, France
- INSERM, U 665, Saint-Denis, F-97715, France
| | - Alexandre G de Brevern
- INSERM, U 665 DSIMB, Paris, F-75739, France
- Univ Paris Diderot, Sorbonne Paris Cité, Paris, F- 75739, France
- INTS, F-75739, Paris, France
| | | |
|
21
|
Juritz E, Palopoli N, Fornasari MS, Fernandez-Alberti S, Parisi G. Protein Conformational Diversity Modulates Sequence Divergence. Mol Biol Evol 2012; 30:79-87. [DOI: 10.1093/molbev/mss080] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 02/06/2023] Open
|
22
|
Kufareva I, Ilatovskiy AV, Abagyan R. Pocketome: an encyclopedia of small-molecule binding sites in 4D. Nucleic Acids Res 2012; 40:D535-40. [PMID: 22080553 PMCID: PMC3245087 DOI: 10.1093/nar/gkr825] [Citation(s) in RCA: 116] [Impact Index Per Article: 9.7] [Received: 08/15/2011] [Accepted: 09/18/2011] [Indexed: 11/22/2022] Open
Abstract
The importance of binding site plasticity in protein-ligand interactions is well-recognized, and so are the difficulties in predicting the nature and the degree of this plasticity by computational means. To assist in understanding flexible protein-ligand interactions, we constructed the Pocketome, an encyclopedia of about one thousand experimentally solved conformational ensembles of druggable binding sites in proteins, grouped by location and consistent chain/cofactor composition. The multiplicity of pockets within the ensembles adds an extra, fourth dimension to the Pocketome entry data. Within each ensemble, the pockets were carefully classified by the degree of their pairwise similarity and compatibility with different ligands. The core of the Pocketome is derived regularly and automatically from the current releases of the Protein Data Bank and the UniProt Knowledgebase; this core is complemented by entries built from manually provided seed ligand locations. The Pocketome website (www.pocketome.org) allows searching for the sites of interest, analysis of conformational clusters, important residues, binding compatibility matrices and interactive visualization of the ensembles using the ActiveICM web browser plugin. The Pocketome collection can be used to build multi-conformational docking and 3D activity models as well as to design cross-docking and virtual ligand screening benchmarks.
Affiliation(s)
- Irina Kufareva
- UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA, 92093, USA, Division of Molecular and Radiation Biophysics, Petersburg Nuclear Physics Institute, Russian Academy of Sciences, Gatchina, 188300 and Research and Education Center ‘Biophysics’, PNPI RAS and St. Petersburg State Polytechnical University, St. Petersburg, 194064, Russia
| | - Andrey V. Ilatovskiy
- UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA, 92093, USA, Division of Molecular and Radiation Biophysics, Petersburg Nuclear Physics Institute, Russian Academy of Sciences, Gatchina, 188300 and Research and Education Center ‘Biophysics’, PNPI RAS and St. Petersburg State Polytechnical University, St. Petersburg, 194064, Russia
| | - Ruben Abagyan
- UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, CA, 92093, USA, Division of Molecular and Radiation Biophysics, Petersburg Nuclear Physics Institute, Russian Academy of Sciences, Gatchina, 188300 and Research and Education Center ‘Biophysics’, PNPI RAS and St. Petersburg State Polytechnical University, St. Petersburg, 194064, Russia
| |
|
23
|
Galperin MY, Cochrane GR. The 2011 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res 2011; 39:D1-6. [PMID: 21177655 PMCID: PMC3013748 DOI: 10.1093/nar/gkq1243] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Indexed: 01/09/2023] Open
Abstract
The current 18th Database Issue of Nucleic Acids Research features descriptions of 96 new and 83 updated online databases covering various areas of molecular biology. It includes two editorials, one that discusses COMBREX, a new exciting project aimed at figuring out the functions of the ‘conserved hypothetical’ proteins, and one concerning BioDBcore, a proposed description of the ‘minimal information about a biological database’. Papers from the members of the International Nucleotide Sequence Database collaboration (INSDC) describe each of the participating databases, DDBJ, ENA and GenBank, principles of data exchange within the collaboration, and the recently established Sequence Read Archive. A testament to the longevity of databases, this issue includes updates on the RNA modification database, Definition of Secondary Structure of Proteins (DSSP) and Homology-derived Secondary Structure of Proteins (HSSP) databases, which have not been featured here in >12 years. There is also a block of papers describing recent progress in protein structure databases, such as Protein DataBank (PDB), PDB in Europe (PDBe), CATH, SUPERFAMILY and others, as well as databases on protein structure modeling, protein–protein interactions and the organization of inter-protein contact sites. Other highlights include updates of the popular gene expression databases, GEO and ArrayExpress, several cancer gene databases and a detailed description of the UK PubMed Central project. The Nucleic Acids Research online Database Collection, available at: http://www.oxfordjournals.org/nar/database/a/, now lists 1330 carefully selected molecular biology databases. The full content of the Database Issue is freely available online at the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
| | | |
|