1
Singh K, Malik YS. ANN based prediction of ligand binding sites outside deep cavities to facilitate drug designing. Curr Res Struct Biol 2024; 7:100144. [PMID: 38681239 PMCID: PMC11047793 DOI: 10.1016/j.crstbi.2024.100144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 04/12/2024] [Accepted: 04/12/2024] [Indexed: 05/01/2024]
Abstract
Ever-changing environmental conditions and pollution are prime reasons for the onset of several emerging and re-emerging diseases. This demands faster design of new drugs so that deadly diseases in animals and humans can be curbed with less waiting time. Drug molecules interact with the protein surface only at specific locations termed ligand binding sites (LBS). Therefore, knowledge of LBS is required for rational drug design. Existing geometrical LBS prediction methods search for cavities, based on the observation that 83% of LBS are found in deep cavities; however, these methods usually fail where LBS localize outside deep cavities. To overcome this challenge, the present work provides an artificial neural network (ANN) based method to predict LBS outside deep cavities in animal proteins, including human, to facilitate drug design. A feed-forward backpropagation neural network was trained using 38 structural, atomic, physicochemical, and evolutionary discriminant features of LBS and non-LBS residues localized in the roughest patch extracted from the protein surface. The performance of this ANN based prediction method was found to be 76% better for those proteins where the cavity subspace (extracted by MetaPocket 2.0, a consensus method) failed to predict LBS because of their localization outside deep cavities. Predicting LBS outside deep cavities will facilitate drug design for proteins where it is currently not possible owing to a lack of LBS information, since geometrical LBS prediction methods rely on extraction of deep cavities.
Affiliation(s)
- Kalpana Singh
- College of Animal Biotechnology, Guru Angad Dev Veterinary and Animal Sciences University, Ludhiana-141004, India
- Yashpal Singh Malik
- College of Animal Biotechnology, Guru Angad Dev Veterinary and Animal Sciences University, Ludhiana-141004, India
2
Singh K, Lahiri T. A new search subspace to compensate failure of cavity-based localization of ligand-binding sites. Comput Biol Chem 2017; 68:6-11. [DOI: 10.1016/j.compbiolchem.2017.01.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2015] [Revised: 04/27/2016] [Accepted: 01/30/2017] [Indexed: 10/20/2022]
3
Banerji A, Navare C. Fractal nature of protein surface roughness: a note on quantification of change of surface roughness in active sites, before and after binding. J Mol Recognit 2013; 26:201-14. [PMID: 23526774 DOI: 10.1002/jmr.2264] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 03/22/2012] [Revised: 01/07/2013] [Accepted: 01/11/2013] [Indexed: 11/09/2022]
Abstract
Year 2010 marked the 25th year since we came to know that roughness of a protein surface has fractal symmetry. Ever since the publication of Lewis and Rees' paper, hundreds of works from a spectrum of perspectives have established that fractal dimension (FD) can be considered as a reliable marker that describes roughness of protein surface objectively. In this article, we introduce readers to the fundamentals of fractals and present categorical biophysical and geometrical reasons as to why FD-based constructs can describe protein surface roughness more accurately. We then review the commonality (and the lack of it) between numerous approaches that have attempted to investigate protein surface with fractal measures, before exploring the patterns in the results that they have produced. Apart from presenting the genealogy of approaches and results, we present an analysis that quantifies the difference in surface roughness in stretches of protein surface containing the active site, before and after binding to ligands, to underline the utility of FD-based measures further. It has been found that surface stretches containing the active site, in general, undergo a significant increment in their roughness after binding. After presenting the entire repertoire of FD-based surface roughness studies, we talk about two yet-unexplored problems where application of FD-based techniques can help in deciphering underlying patterns of surface interactions. Finally, we list the limitations of FD-based constructs and put down several precautions that one must take while working with them.
Affiliation(s)
- Anirban Banerji
- Bioinformatics Centre, University of Pune, Pune, Maharashtra, India.
4
Ben-Shimon A, Eisenstein M. Computational mapping of anchoring spots on protein surfaces. J Mol Biol 2010; 402:259-77. [PMID: 20643147 DOI: 10.1016/j.jmb.2010.07.021] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.7] [Received: 02/21/2010] [Revised: 07/04/2010] [Accepted: 07/09/2010] [Indexed: 10/19/2022]
Abstract
Protein-protein and protein-peptide interactions are often controlled by few strong contacts that involve hot spot residues. Computational detection of such contacts, termed here anchoring spots, is important for understanding recognition processes and for predicting interactions; it is an essential step in designing interaction interfaces and therapeutic agents. We describe ANCHORSMAP, an algorithm for computational mapping of amino acid side chains on protein surfaces. The algorithm consists of two stages: A geometry based stage (LSMdet), in which sub-pockets adequate for binding single side chains are detected and amino acid probes are scattered near them, and an energy based stage in which optimal positions of the probes are determined through repeated energy minimization and clustering of nearby poses and their DeltaG are calculated. ANCHORSMAP employs a new function for DeltaG calculations, which is specifically designed for the context of protein-protein recognition by introducing a correction in the electrostatic energy term that compensates for the dielectric shielding exerted by a hypothetical protein bound to the probe. The algorithm successfully detects known anchoring sites and accurately positions the probes. The calculated DeltaG rank high the correct anchoring spots in maps produced for unbound proteins. We find that Arg, Trp, Glu and Tyr, which are favorite hot spot residues, are also more selective of their binding environment. The usefulness of anchoring spots mapping is demonstrated by detecting the binding surfaces in the protein-protein complex barnase/barstar and the protein-peptide complex kinase/PKI, and by identifying phenylalanine anchoring sites on the surface of the nuclear transporter NTF2, C-terminus anchors on PDZ domains and phenol anchors on thermolysin. Finally, we discuss the role of anchoring spots in molecular recognition processes.
Affiliation(s)
- Avraham Ben-Shimon
- Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
5
Pettit FK, Bare E, Tsai A, Bowie JU. HotPatch: a statistical approach to finding biologically relevant features on protein surfaces. J Mol Biol 2007; 369:863-79. [PMID: 17451744 PMCID: PMC2034327 DOI: 10.1016/j.jmb.2007.03.036] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Received: 12/29/2006] [Revised: 03/10/2007] [Accepted: 03/15/2007] [Indexed: 10/23/2022]
Abstract
We describe a fully automated algorithm for finding functional sites on protein structures. Our method finds surface patches of unusual physicochemical properties on protein structures, and estimates the patches' probability of overlapping functional sites. Other methods for predicting the locations of specific types of functional sites exist, but in previous analyses, it has been difficult to compare methods when they are applied to different types of sites. Thus, we introduce a new statistical framework that enables rigorous comparisons of the usefulness of different physicochemical properties for predicting virtually any kind of functional site. The program's statistical models were trained for 11 individual properties (electrostatics, concavity, hydrophobicity, etc.) and for 15 neural network combination properties, all optimized and tested on 15 diverse protein functions. To simulate what to expect if the program were run on proteins of unknown function, as might arise from structural genomics, we tested it on 618 proteins of diverse mixed functions. In the higher-scoring top half of all predictions, a functional residue could typically be found within the first 1.7 residues chosen at random. The program may or may not use partial information about the protein's function type as an input, depending on which statistical model the user chooses to employ. If function type is used as an additional constraint, prediction accuracy usually increases, and is particularly good for enzymes, DNA-interacting sites, and oligomeric interfaces. The program can be accessed online (at http://hotpatch.mbi.ucla.edu).
Affiliation(s)
- Frank K. Pettit
- UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, UCLA, Los Angeles, CA.
- Emiko Bare
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA.
- Albert Tsai
- Department of Biochemistry & Molecular Biology, Keck School of Medicine, University of Southern California, Los Angeles, CA.
- James U. Bowie
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA.
6
Nayal M, Honig B. On the nature of cavities on protein surfaces: Application to the identification of drug-binding sites. Proteins 2006; 63:892-906. [PMID: 16477622 DOI: 10.1002/prot.20897] [Citation(s) in RCA: 195] [Impact Index Per Article: 10.8] [Indexed: 12/11/2022]
Abstract
In this article we introduce a new method for the identification and the accurate characterization of protein surface cavities. The method is encoded in the program SCREEN (Surface Cavity REcognition and EvaluatioN). As a first test of the utility of our approach we used SCREEN to locate and analyze the surface cavities of a nonredundant set of 99 proteins cocrystallized with drugs. We find that this set of proteins has on average about 14 distinct cavities per protein. In all cases, a drug is bound at one (and sometimes more than one) of these cavities. Using cavity size alone as a criterion for predicting drug-binding sites yields a high balanced error rate of 15.7%, with only 71.7% coverage. Here we characterize each surface cavity by computing a comprehensive set of 408 physicochemical, structural, and geometric attributes. By applying modern machine learning techniques (Random Forests) we were able to develop a classifier that can identify drug-binding cavities with a balanced error rate of 7.2% and coverage of 88.9%. Only 18 of the 408 cavity attributes had a statistically significant role in the prediction. Of these 18 important attributes, almost all involved size and shape rather than physicochemical properties of the surface cavity. The implications of these results are discussed. A SCREEN Web server is available at http://interface.bioc.columbia.edu/screen.
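The balanced error rate quoted in this abstract is conventionally the mean of the false-negative rate and the false-positive rate. The sketch below assumes that standard definition; the abstract does not spell out its exact formula, and the function name and the counts in the example are hypothetical, not taken from the paper.

```python
def balanced_error_rate(tp: int, fn: int, fp: int, tn: int) -> float:
    """Mean of the error rates on the positive and negative classes.

    Assumes the conventional definition; the cited abstract does not
    state the formula it used.
    """
    false_negative_rate = fn / (tp + fn)  # binding cavities that were missed
    false_positive_rate = fp / (fp + tn)  # non-binding cavities flagged as binding
    return 0.5 * (false_negative_rate + false_positive_rate)


# Hypothetical counts for illustration only (not data from the paper).
ber = balanced_error_rate(tp=75, fn=25, fp=25, tn=75)
print(f"balanced error rate: {ber:.3f}")
```

Because the two error rates are averaged with equal weight, the measure is not dominated by the majority class, which matters when most cavities on a protein do not bind a drug.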
Affiliation(s)
- Murad Nayal
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, USA
7
Horng JC, Raleigh DP. phi-Values beyond the ribosomally encoded amino acids: kinetic and thermodynamic consequences of incorporating trifluoromethyl amino acids in a globular protein. J Am Chem Soc 2003; 125:9286-7. [PMID: 12889945 DOI: 10.1021/ja0353199] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
The consequences of the substitution of 4,4,4-trifluorovaline for valine on the folding kinetics and thermodynamics of a globular protein are presented. Variants of the N-terminal domain of L9, a small alpha-beta protein, were prepared in which V3 or V21 was replaced by trifluorovaline. CD and NMR demonstrate that the structure is not perturbed. Both are more stable, the V3 variant by 0.8 kcal mol⁻¹ and the V21 variant by 1.4 kcal mol⁻¹. The increase of stability is significantly larger than that observed in coiled-coils on a per trifluoromethyl group basis. Folding is two-state, and the variants both fold faster than the wild type. The Phi-values are 0.16 and 0.11, respectively.
Affiliation(s)
- Jia-Cherng Horng
- Department of Chemistry, State University of New York at Stony Brook, Stony Brook, New York 11794-3400, USA
8
Abstract
Pharmaceutical design is usually directed at developing small molecules that can specifically bind and alter the activity of a target protein. Here, we show that high-affinity binding of small molecules requires a rough patch on a protein surface. Drug design strategies should therefore be targeted to rough areas on a protein. Our results indicate that the roughness of small functional sites may reflect the complex local shapes needed to fit specific interactions into small areas.
Affiliation(s)
- F K Pettit
- Department of Chemistry and Biochemistry, and UCLA-DOE Laboratory of Structural Biology and Molecular Medicine, Los Angeles, CA, 90095, USA
9
Laskowski RA, Luscombe NM, Swindells MB, Thornton JM. Protein clefts in molecular recognition and function. Protein Sci 1996; 5:2438-52. [PMID: 8976552 PMCID: PMC2143314 DOI: 10.1002/pro.5560051206] [Citation(s) in RCA: 131] [Impact Index Per Article: 4.7] [Indexed: 02/03/2023]
Abstract
One of the primary factors determining how proteins interact with other molecules is the size of clefts in the protein's surface. In enzymes, for example, the active site is often characterized by a particularly large and deep cleft, while interactions between the molecules of a protein dimer tend to involve approximately planar surfaces. Here we present an analysis of how cleft volumes in proteins relate to their molecular interactions and functions. Three separate datasets are used, representing enzyme-ligand binding, protein-protein dimerization and antibody-antigen complexes. We find that, in single-chain enzymes, the ligand is bound in the largest cleft in over 83% of the proteins. Usually the largest cleft is considerably larger than the others, suggesting that size is a functional requirement. Thus, in many cases, the likely active sites of an enzyme can be identified using purely geometrical criteria alone. In other cases, where there is no predominantly large cleft, chemical interactions are required for pinpointing the correct location. In antibody-antigen interactions the antibody usually presents a large cleft for antigen binding. In contrast, protein-protein interactions in homodimers are characterized by approximately planar interfaces with several clefts involved. However, the largest cleft in each subunit still tends to be involved.
Affiliation(s)
- R A Laskowski
- Department of Biochemistry and Molecular Biology, University College London, England
10
Lara-Ochoa F, Almagro JC, Vargas-Madrazo E, Mendez I. Frequency analysis of amino acids in the recognition regions of T-cell receptors. Biosystems 1996; 39:77-85. [PMID: 8735389 DOI: 10.1016/0303-2647(95)01602-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
In immunoglobulins (Igs), key amino acids in the Complementarity Determining Regions (CDR) are responsible for maintaining specific conformations called canonical structures. In T-cell receptors (TCRs), protein members of the Ig superfamily, the corresponding residues for maintaining these canonical structures have not been found. In previous studies we have found in Igs that the frequency of use of amino acids in some positions of the CDRs follows an inverse power law distribution, while the frequency of amino acids in the rest of the positions of the CDRs follows an exponential law distribution. The positions that follow the inverse power law distribution are precisely those involved in maintaining the canonical structures, while those positions for which the distribution fits the exponential distribution are those that should be properly involved in the recognition mechanism. In this paper, when the same analysis is applied to the use frequency of amino acids on the CDRs of TCRs, it is found that some positions that have been previously identified as having a structural role are those fitting the inverse power law. That finding combined with the cooperative or long-range interaction properties of systems that follow the inverse power law leads us to propose that the lack of determined key residues in certain positions is compensated by "equivalent" residues in other positions within the CDRs in order to maintain the canonical structures. Other positions that follow the exponential distribution are those which can be involved in the recognition process. These results coincide with a computer-generated model of TCR/peptide/MHC interaction previously published by the authors.
Affiliation(s)
- F Lara-Ochoa
- Instituto de Química, Universidad Nacional Autónoma de México, D.F., Mexico
11
Liu RS, Liu CW, Li XY, Asato AE. Butyl conformational reorganization as a possible explanation for the longitudinal flexibility of the binding site of bacteriorhodopsin. The azulene and C-22 retinoid analogs. Photochem Photobiol 1991; 54:625-31. [PMID: 1796116 DOI: 10.1111/j.1751-1097.1991.tb02066.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
The UV-VIS absorption data of four bacteriorhodopsin (BR) analogs formed from azulene-retinals of varying polyene chain length show that the one-bond-shortened to one-bond-lengthened analogs possess comparable opsin shift values to that of BR. A two-bond-shortened analog exhibited a much smaller opsin shift. These data, combined with those reported for the C-22 retinal analog (Tokunaga et al., 1977, Biophys. J. 19, 191-198) were analyzed by molecular modelling and computer graphics in terms of a model where conformational flexibility of the appended butyl is the controlling factor in determining ease of pigment formation and protein/substrate interaction.
Affiliation(s)
- R S Liu
- Department of Chemistry, University of Hawaii, Honolulu 96822