1
Artificial intelligence in drug design: algorithms, applications, challenges and ethics. Future Drug Discovery 2021. [DOI: 10.4155/fdd-2020-0028] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022] Open
Abstract
The discovery paradigm of drugs is rapidly growing due to advances in machine learning (ML) and artificial intelligence (AI). This review covers myriad faces of AI and ML in drug design. There is a plethora of AI algorithms, the most common of which are summarized in this review. In addition, AI is fraught with challenges that are highlighted along with plausible solutions to them. Examples are provided to illustrate the use of AI and ML in drug discovery and in predicting drug properties such as binding affinities and interactions, solubility, toxicology, blood–brain barrier permeability and chemical properties. The review also includes examples depicting the implementation of AI and ML in tackling intractable diseases such as COVID-19, cancer and Alzheimer’s disease. Ethical considerations and future perspectives of AI are also covered in this review.
2
Cloutier TK, Sudrik C, Mody N, Sathish HA, Trout BL. Machine Learning Models of Antibody–Excipient Preferential Interactions for Use in Computational Formulation Design. Mol Pharm 2020; 17:3589-3599. [DOI: 10.1021/acs.molpharmaceut.0c00629] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Affiliation(s)
- Theresa K. Cloutier: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Chaitanya Sudrik: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Neil Mody: Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland 20878, United States
- Hasige A. Sathish: Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland 20878, United States
- Bernhardt L. Trout: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
3
Douguet D, Payan F. sensaas: Shape-based Alignment by Registration of Colored Point-based Surfaces. Mol Inform 2020; 39:e2000081. [PMID: 32573978 PMCID: PMC7507133 DOI: 10.1002/minf.202000081] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/09/2020] [Accepted: 06/04/2020] [Indexed: 12/11/2022]
Abstract
sensaas is a tool developed for aligning and comparing molecular shapes and sub-shapes. Alignment is obtained by registration of 3D point-based representations of the van der Waals surface. The method uses local properties of the shape to identify the correspondence relationships between two point clouds containing up to several thousand colored (labeled) points. Our rigid-body superimposition method follows a two-stage approach. An initial alignment is obtained by matching pose-invariant local 3D descriptors, called FPFH, of the input point clouds. This stage provides a global superimposition of the molecular surfaces, without any knowledge of their initial pose in 3D space. This alignment is then refined by optimizing the matching of colored points. In our study, each point is colored according to its closest atom, which itself belongs to a user-defined physico-chemical class. Finally, sensaas provides an alignment and evaluates the molecular similarity by using Tversky coefficients. To assess the efficiency of this approach, we tested its ability to reproduce the superimposition of X-ray structures of the benchmarking AstraZeneca (AZ) data set and compared its results with those generated by the two shape-alignment approaches shaep and shafts. We also illustrated submatching properties of our method with respect to a few substructures and bioisosteric fragments. The code is available upon request from the authors (demo version at https://chemoinfo.ipmc.cnrs.fr/SENSAAS).
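The abstract above scores alignments with Tversky coefficients. As a minimal sketch of the general Tversky index on two feature sets (not the sensaas implementation: the function name `tversky`, the set-based inputs, the default weights, and the convention for two empty sets are all assumptions made for illustration):

```python
def tversky(a, b, alpha=0.5, beta=0.5):
    """Tversky index |A∩B| / (|A∩B| + alpha*|A-B| + beta*|B-A|).

    `a` and `b` are collections of hashable features, for example
    identifiers of matched surface points (hypothetical input format).
    """
    a, b = set(a), set(b)
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    # Assumed convention: two empty feature sets score 0.0.
    return common / denom if denom else 0.0


# alpha = beta = 1 reduces to the Jaccard (Tanimoto) coefficient,
# alpha = beta = 0.5 reduces to the Dice coefficient.
jaccard_like = tversky({"a", "b", "c"}, {"b", "c", "d"}, alpha=1.0, beta=1.0)
dice_like = tversky({"a", "b", "c"}, {"b", "c", "d"})
```

Choosing unequal weights makes the score asymmetric, which is one common way to reward a small query that is fully contained in a larger target, as in sub-shape matching.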
Affiliation(s)
- Dominique Douguet: Université Côte d'Azur, Inserm, CNRS, IPMC, 660 route des lucioles, 06560 Valbonne, France
- Frédéric Payan: Université Côte d'Azur, CNRS, I3S, Les Algorithmes - Euclide B, 2000 route des lucioles, 06900 Sophia Antipolis, France
4
Changing pictures of molecular faces and depths of potential acting on an electron in molecule for intramolecular proton transfer reactions of formic acid and malonaldehyde. Comput Theor Chem 2017. [DOI: 10.1016/j.comptc.2017.05.035] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]
5
Shivashankar N, Patil S, Bhosle A, Chandra N, Natarajan V. MS3ALIGN: an efficient molecular surface aligner using the topology of surface curvature. BMC Bioinformatics 2016; 17:26. [PMID: 26753741 PMCID: PMC4710026 DOI: 10.1186/s12859-015-0874-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/15/2015] [Accepted: 12/15/2015] [Indexed: 11/17/2022] Open
Abstract
Background: Aligning similar molecular structures is an important step in the process of bio-molecular structure and function analysis. Molecular surfaces are simple representations of molecular structure that are easily constructed from various forms of molecular data such as 3D atomic coordinates (PDB) and Electron Microscopy (EM) data. Methods: We present a Multi-Scale Morse-Smale Molecular-Surface Alignment tool, MS3ALIGN, which aligns molecular surfaces based on significant protrusions on the molecular surface. The input is a pair of molecular surfaces represented as triangle meshes. A key advantage of MS3ALIGN is computational efficiency that is achieved because it processes only a few carefully chosen protrusions on the molecular surface. Furthermore, the alignments are partial in nature and therefore allow for inexact surfaces to be aligned. Results: The method is evaluated in four settings. First, we establish performance using known alignments with varying overlap and noise values. Second, we compare the method with SurfComp, an existing surface alignment method. We show that we are able to determine alignments reported by SurfComp, as well as report relevant alignments not found by SurfComp. Third, we validate the ability of MS3ALIGN to determine alignments in the case of structurally dissimilar binding sites. Fourth, we demonstrate the ability of MS3ALIGN to align iso-surfaces derived from cryo-electron microscopy scans. Conclusions: We have presented an algorithm that aligns molecular surfaces based on the topology of surface curvature. A webserver and standalone software implementation of the algorithm is available at http://vgl.serc.iisc.ernet.in/ms3align.
Affiliation(s)
- Nithin Shivashankar: Department of Computer Science and Automation, Indian Institute of Science, Bangalore, 560012, India
- Sonali Patil: Department of Computer Science and Automation, Indian Institute of Science, Bangalore, 560012, India
- Amrisha Bhosle: Department of Biochemistry, Indian Institute of Science, Bangalore, 560012, India
- Nagasuma Chandra: Department of Biochemistry, Indian Institute of Science, Bangalore, 560012, India
- Vijay Natarajan: Department of Computer Science and Automation, and Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, 560012, India
6
7
Saberi Fathi SM, Tuszynski JA. A simple method for finding a protein's ligand-binding pockets. BMC Structural Biology 2014; 14:18. [PMID: 25038637 PMCID: PMC4112621 DOI: 10.1186/1472-6807-14-18] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/09/2014] [Accepted: 07/11/2014] [Indexed: 12/03/2022]
Abstract
Background: This paper provides a simple and rapid method for a protein-clustering strategy. The basic idea implemented here is to use computational geometry methods to predict and characterize ligand-binding pockets of a given protein structure. In addition to geometrical characteristics of the protein structure, we consider some simple biochemical properties that help recognize the best candidates for pockets in a protein's active site. Results: Our results are shown to produce good agreement with known empirical results. Conclusions: The method presented in this paper is a low-cost rapid computational method that could be used to classify proteins and other biomolecules, and furthermore could be useful in reducing the cost and time of drug discovery.
Affiliation(s)
- Jack A Tuszynski: Department of Physics, University of Alberta, Edmonton, Alberta, Canada
8
Gong LD, Yang ZZ. Investigation of the molecular surface area and volume: Defined and calculated by the molecular face theory. J Comput Chem 2010; 31:2098-108. [PMID: 20222055 DOI: 10.1002/jcc.21496] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022]
Abstract
Based on the molecular face (MF) theory, the molecular face surface area (MFSA) and molecular face volume (MFV) are defined. For a variety of organic molecules and several inorganic molecules, the MFSA and MFV have been studied and calculated in terms of an algorithm of our own via the Matlab package. The MFV shows a very good linear relationship with the experimentally measured critical molar volume. It is also found that the MFSA and MFV have significant linear correlations with those of the commonly used hard-sphere model and the electron density isosurface.
Affiliation(s)
- Li-Dong Gong: School of Chemistry and Chemical Engineering, Liaoning Normal University, Dalian 116029, People's Republic of China
9
Reisen F, Weisel M, Kriegl JM, Schneider G. Self-organizing fuzzy graphs for structure-based comparison of protein pockets. J Proteome Res 2010; 9:6498-510. [PMID: 20883038 DOI: 10.1021/pr100719n] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 12/27/2022]
Abstract
Patterns of receptor-ligand interaction can be conserved in functionally equivalent proteins even in the absence of sequence homology. Therefore, structural comparison of ligand-binding pockets and their pharmacophoric features allows for the characterization of so-called "orphan" proteins with known three-dimensional structure but unknown function, and for the prediction of ligand promiscuity of binding pockets. We present an algorithm for rapid pocket comparison (PoLiMorph), in which protein pockets are represented by self-organizing graphs that fill the volume of the cavity. Vertices in these three-dimensional frameworks contain information about the local ligand-receptor interaction potential coded by fuzzy property labels. For framework matching, we developed a fast heuristic based on the maximum dispersion problem, as an alternative to techniques utilizing clique detection or geometric hashing algorithms. A sophisticated scoring function was applied that incorporates knowledge about property distributions and ligand-receptor interaction patterns. In an all-against-all virtual screening experiment with 207 pocket frameworks extracted from a subset of PDBbind, PoLiMorph correctly assigned 81% of 69 distinct structural classes and demonstrated sustained ability to group pockets accommodating the same ligand chemotype. We determined a score threshold that indicates "true" pocket similarity with high reliability, which not only supports structure-based drug design but also allows for sequence-independent studies of the proteome.
Affiliation(s)
- Felix Reisen: Computer-Assisted Drug Design, Eidgenössische Technische Hochschule, Zürich, Zürich, Switzerland
10
Pyrkov TV, Ozerov IV, Blitskaia ED, Efremov RG. [Molecular docking: role of intermolecular contacts in formation of complexes of proteins with nucleotides and peptides]. Russian Journal of Bioorganic Chemistry 2010; 36:482-92. [PMID: 20823916 DOI: 10.1134/s1068162010040023] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
Knowledge of the 3D structure of a protein-ligand complex is a major prerequisite for understanding the functioning mechanism of cellular proteins and membrane receptors. It is also of great help in rational drug design projects. In the present paper we briefly review the molecular docking approaches used to predict the possible orientation of a ligand in the protein binding site. Recent trends in improving the accuracy and efficiency of docking algorithms are demonstrated with results obtained in the Laboratory of Biomolecular Modeling. Particular attention is paid to the protein-ligand hydrophobic and stacking interactions responsible for molecular recognition of ligand fragments. Such interactions are not always adequately represented in the scoring criteria of docking applications, which leads to mismatches in the predicted 3D structures of complexes. That is why methods to account for these interactions are now an area of active research.
11
Potential for protein surface shape analysis using spherical harmonics and 3D Zernike descriptors. Cell Biochem Biophys 2009; 54:23-32. [PMID: 19521674 DOI: 10.1007/s12013-009-9051-x] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Received: 03/27/2009] [Accepted: 05/22/2009] [Indexed: 10/20/2022]
Abstract
With structure databases expanding at a rapid rate, the task at hand is to provide reliable clues to their molecular function and to be able to do so on a large scale. This, however, requires suitable encodings of the molecular structure which are amenable to fast screening. To this end, moment-based representations provide a compact and nonredundant description of molecular shape and other associated properties. In this article, we present an overview of some commonly used representations with specific focus on two schemes namely spherical harmonics and their extension, the 3D Zernike descriptors. Key features and differences of the two are reviewed and selected applications are highlighted. We further discuss recent advances covering aspects of shape and property-based comparison at both global and local levels and demonstrate their applicability through some of our studies.
12
Li B, Turuvekere S, Agrawal M, La D, Ramani K, Kihara D. Characterization of local geometry of protein surfaces with the visibility criterion. Proteins 2008; 71:670-83. [PMID: 17975834 DOI: 10.1002/prot.21732] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.4] [Indexed: 11/08/2022]
Abstract
Experimentally determined protein tertiary structures are rapidly accumulating in a database, partly due to the structural genomics projects. Included are proteins of unknown function, whose function has not been investigated by experiments and could not be predicted by conventional sequence-based search. Those uncharacterized protein structures highlight the urgent need for computational methods for annotating proteins from tertiary structures, which include function annotation methods through characterizing protein local surfaces. Toward structure-based protein annotation, we have developed the VisGrid algorithm, which uses the visibility criterion to characterize local geometric features of protein surfaces. Unlike existing methods, which are concerned only with identifying pockets that could be potential ligand-binding sites in proteins, VisGrid also aims to identify large protrusions, hollows, and flat regions, which can characterize geometric features of a protein structure. The visibility used in VisGrid is defined as the fraction of visible directions from a target position on a protein surface. A pocket or a hollow is recognized as a cluster of positions with a small visibility. A large protrusion in a protein structure is recognized as a pocket in the negative image of the structure. VisGrid correctly identified 95.0% of ligand-binding sites as one of the three largest pockets in 5616 benchmark proteins. To examine how natural flexibility of proteins affects pocket identification, VisGrid was tested on structures distorted by molecular dynamics simulation. Sensitivity decreased by approximately 20% for structures with a root mean square deviation of 2.0 Å to the original crystal structure, but specificity was not much affected. Because of its intuitiveness and simplicity, the visibility criterion will lay the foundation for characterization and function annotation of local shape of proteins.
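The abstract defines visibility as the fraction of visible directions from a target position. A minimal grid-based sketch of that idea follows. It is not the VisGrid implementation: the 26 neighbor directions, the ray length `max_steps`, the voxel-set representation `occupied`, and the function name `visibility` are all assumptions chosen for illustration.

```python
from itertools import product

# The 26 directions to neighboring voxels on a cubic grid (assumed sampling).
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def visibility(point, occupied, max_steps=10):
    """Fraction of directions along which no occupied voxel is hit.

    `point` is an (x, y, z) integer grid position and `occupied` is a set
    of voxels filled by protein atoms (hypothetical input format).
    """
    visible = 0
    for dx, dy, dz in DIRECTIONS:
        x, y, z = point
        blocked = False
        for _ in range(max_steps):
            x, y, z = x + dx, y + dy, z + dz
            if (x, y, z) in occupied:
                blocked = True
                break
        if not blocked:
            visible += 1
    return visible / len(DIRECTIONS)


# A position with one occupied neighbor loses exactly one of 26 directions.
example = visibility((0, 0, 0), {(1, 0, 0)})
```

Under this sketch, positions deep in a pocket get low values because most rays hit the protein, while positions on a protrusion get values near 1; clustering the low-visibility positions would then give pocket candidates.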
Affiliation(s)
- Bin Li: Department of Computer Science, College of Science, Purdue University, West Lafayette, Indiana 47907, USA
13
Visual Analysis of Biomolecular Surfaces. 2008. [DOI: 10.1007/978-3-540-72630-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
14
Pattern recognition based on color-coded quantum mechanical surfaces for molecular alignment. J Mol Model 2007; 14:49-57. [PMID: 18038163 DOI: 10.1007/s00894-007-0251-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 07/27/2007] [Accepted: 10/19/2007] [Indexed: 12/11/2022]
Abstract
A pattern recognition algorithm for the alignment of drug-like molecules has been implemented. The method is based on the calculation of quantum mechanically derived local properties defined on a molecular surface. This approach has been shown to be very useful in attempting to derive generalized, non-atom based representations of molecular structure. The visualization of these surfaces is described together with details of the methodology developed for their use in molecular overlay and similarity calculations. In addition, this paper introduces a further local property, the local curvature (C_L), which can be used together with the quantum mechanical properties to describe the local shape. The method is exemplified using some problems representing common tasks encountered in molecular similarity.
15
Abstract
We present a method, termed AutoLigand, for the prediction of ligand-binding sites in proteins of known structure. The method searches the space surrounding the protein and finds the contiguous envelope with the specified volume of atoms, which has the largest possible interaction energy with the protein. It uses a full atomic representation, with atom types for carbon, hydrogen, oxygen, nitrogen and sulfur (and others, if desired), and is designed to minimize the need for artificial geometry. Testing on a set of 187 diverse protein-ligand complexes has shown that the method is successful in predicting the location and approximate volume of the binding site in 73% of cases. Additional testing was performed on a set of 96 protein-ligand complexes with crystallographic structures of apo and holo forms, and AutoLigand was able to predict the binding site in 80% of the apo structures.
Affiliation(s)
- Rodney Harris: Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
16
Proschak E, Rupp M, Derksen S, Schneider G. Shapelets: Possibilities and limitations of shape-based virtual screening. J Comput Chem 2007; 29:108-14. [PMID: 17516427 DOI: 10.1002/jcc.20770] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Indexed: 11/07/2022]
Abstract
Complementarity of molecular surfaces is crucial for molecular recognition. A method for representation of molecular shape is presented. We decompose the molecular surface into commensurate patches with defined shape by fitting hyperbolical paraboloids onto a triangulated isosurface of the Gaussian model of a molecule. As a result of this decomposition we obtain a 3D graph representation of the molecular shape, which can be used for complete and partial shape matching and isosteric group searching. To point out the possibilities and limitations of shape-only models, we challenged our method by three scenarios in a virtual screening contest: rigid body alignment, consensus shape filtering, and target-specific screening.
Affiliation(s)
- Ewgenij Proschak: Johann Wolfgang Goethe-Universität, Beilstein Endowed Chair for Cheminformatics, Institut für Organische Chemie und Chemische Biologie, Siesmayerstr 70, D-60323, Frankfurt am Main, Germany
17
Nayal M, Honig B. On the nature of cavities on protein surfaces: Application to the identification of drug-binding sites. Proteins 2006; 63:892-906. [PMID: 16477622 DOI: 10.1002/prot.20897] [Citation(s) in RCA: 195] [Impact Index Per Article: 10.8] [Indexed: 12/11/2022]
Abstract
In this article we introduce a new method for the identification and the accurate characterization of protein surface cavities. The method is encoded in the program SCREEN (Surface Cavity REcognition and EvaluatioN). As a first test of the utility of our approach we used SCREEN to locate and analyze the surface cavities of a nonredundant set of 99 proteins cocrystallized with drugs. We find that this set of proteins has on average about 14 distinct cavities per protein. In all cases, a drug is bound at one (and sometimes more than one) of these cavities. Using cavity size alone as a criterion for predicting drug-binding sites yields a high balanced error rate of 15.7%, with only 71.7% coverage. Here we characterize each surface cavity by computing a comprehensive set of 408 physicochemical, structural, and geometric attributes. By applying modern machine learning techniques (Random Forests) we were able to develop a classifier that can identify drug-binding cavities with a balanced error rate of 7.2% and coverage of 88.9%. Only 18 of the 408 cavity attributes had a statistically significant role in the prediction. Of these 18 important attributes, almost all involved size and shape rather than physicochemical properties of the surface cavity. The implications of these results are discussed. A SCREEN Web server is available at http://interface.bioc.columbia.edu/screen.
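The abstract reports classifier quality as a balanced error rate. A minimal sketch of the usual definition, the mean of the error rates on the positive and negative classes, is given below. The abstract does not spell out its formula, so this standard definition, the function name `balanced_error_rate`, and the confusion-matrix counts in the example are assumptions.

```python
def balanced_error_rate(tp, fn, tn, fp):
    """Mean of the per-class error rates from confusion-matrix counts.

    tp/fn count drug-binding cavities classified correctly/incorrectly,
    tn/fp count the other cavities (illustrative labeling).
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes need at least one example")
    positive_error = fn / (tp + fn)
    negative_error = fp / (tn + fp)
    return 0.5 * (positive_error + negative_error)


# Made-up counts: 10 binding cavities (2 missed), 100 others (10 misflagged).
ber = balanced_error_rate(tp=8, fn=2, tn=90, fp=10)
```

Averaging per class matters here because drug-binding cavities are rare (about one in 14 per protein in the abstract), so plain accuracy would look good even for a classifier that never predicts a binding site.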
Affiliation(s)
- Murad Nayal: Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, USA
18
Filipek S. Organization of rhodopsin molecules in native membranes of rod cells–an old theoretical model compared to new experimental data. J Mol Model 2005; 11:385-91. [PMID: 15928919 DOI: 10.1007/s00894-005-0268-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 11/02/2004] [Accepted: 02/01/2005] [Indexed: 11/25/2022]
Abstract
It has been shown that rhodopsin forms an oligomer in the shape of long double rows of monomers. Because of the importance of rhodopsin as a template for all G protein-coupled receptors, its dimeric, tetrameric and higher-oligomeric structures also provide a useful pattern for similar structures in GPCRs. New experimental data published recently are discussed in the context of a proposed model of the rhodopsin oligomer 1N3M deposited in the protein data bank. The new rhodopsin structure at 2.2 Å resolution with all residues resolved as well as an electron cryomicroscopy structure from 2D crystals of rhodopsin are in agreement with the 1N3M model. Accommodation of movement of transmembrane helix VI, regarded as a major event during the activation of rhodopsin, in a steady structure of the oligomer is also discussed. [Figure: see text]. Superimposition of the 1U19 (red wire), 1GZM (purple wire) and 1N3M (blue wire) rhodopsin structures. Size of the wires is proportional to thermal factors of backbone Cα atoms, view parallel to the membrane.
Affiliation(s)
- Slawomir Filipek: International Institute of Molecular and Cell Biology, 4 Ks. Trojdena St, 02-109, Warsaw, Poland
19
Hofbauer C, Aszódi A. SH2 Binding Site Comparison: A New Application of the SURFCOMP Method. J Chem Inf Model 2005; 45:414-21. [PMID: 15807507 DOI: 10.1021/ci0497049] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
To avoid side effects, it is often desirable to increase the specificity of a drug candidate when targeting one member of a family of related proteins, whereby one exploits small differences between the structures of the binding sites. Identification of such differences can be carried out by analyzing the distributions of physicochemical properties mapped onto molecular surfaces. Here we demonstrate that SURFCOMP, our local surface similarity detection method, is able to detect differences between the binding sites of two closely related proteins. We analyzed the SH2 domains of Sap and Eat-2, two highly similar signal transduction molecules involved in inflammatory processes and found differences between their binding sites that can possibly lead to a better understanding of the different specificities of the two proteins.
Affiliation(s)
- Christian Hofbauer: In Silico Sciences Unit, Informatics and Knowledge Management, Novartis Institutes for BioMedical Research Vienna, Brunnerstrasse 59, A-1235 Vienna, Austria
20
Morris RJ, Najmanovich RJ, Kahraman A, Thornton JM. Real spherical harmonic expansion coefficients as 3D shape descriptors for protein binding pocket and ligand comparisons. Bioinformatics 2005; 21:2347-55. [PMID: 15728116 DOI: 10.1093/bioinformatics/bti337] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.9] [Indexed: 11/13/2022] Open
Abstract
Motivation: An increasing number of protein structures are being determined for which no biochemical characterization is available. The analysis of protein structure and function assignment is becoming an unexpected challenge and a major bottleneck towards the goal of well-annotated genomes. As shape plays a crucial role in biomolecular recognition and function, the examination and development of shape description and comparison techniques is likely to be of prime importance for understanding protein structure-function relationships. Results: A novel technique is presented for the comparison of protein binding pockets. The method uses the coefficients of a real spherical harmonics expansion to describe the shape of a protein's binding pocket. Shape similarity is computed as the L2 distance in coefficient space. Several thousand such comparisons per second can be carried out on a standard Linux PC. Other properties such as the electrostatic potential fit seamlessly into the same framework. The method can also be used directly for describing the shape of proteins and other molecules. Availability: A limited version of the software for the real spherical harmonics expansion of a set of points in PDB format is freely available upon request from the authors. Binding pocket comparisons and ligand prediction will be made available through the protein structure annotation pipeline Profunc (written by Roman Laskowski), which will be accessible from the EBI website shortly.
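The abstract computes shape similarity as the L2 distance in coefficient space. A minimal sketch of that comparison step is shown below, assuming each pocket has already been reduced to a flat list of real expansion coefficients in a common orientation and ordering; the function names and the example vectors are hypothetical, and the expansion itself is not shown.

```python
import math


def l2_distance(coeffs_a, coeffs_b):
    """Euclidean (L2) distance between two equal-length coefficient vectors."""
    if len(coeffs_a) != len(coeffs_b):
        raise ValueError("coefficient vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(coeffs_a, coeffs_b)))


def rank_by_similarity(query, library):
    """Return (name, distance) pairs sorted from most to least similar.

    `library` maps a pocket name to its coefficient vector
    (hypothetical data layout).
    """
    scored = [(name, l2_distance(query, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up coefficient vectors for illustration only.
hits = rank_by_similarity(
    [1.0, 0.0, 0.0],
    {"pocket_a": [1.0, 0.0, 0.0], "pocket_b": [0.0, 0.0, 0.0]},
)
```

Because each comparison is a single pass over a short vector, a screen against a large library stays cheap, which is consistent with the throughput the abstract reports.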
Affiliation(s)
- Richard J Morris: EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
21
Hofbauer C, Lohninger H, Aszódi A. SURFCOMP: A Novel Graph-Based Approach to Molecular Surface Comparison. J Chem Inf Comput Sci 2004; 44:837-47. [PMID: 15154748 DOI: 10.1021/ci0342371] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Analysis of the distributions of physicochemical properties mapped onto molecular surfaces can highlight important similarities or differences between compound classes, contributing to rational drug design efforts. Here we present an approach that uses maximal common subgraph comparison and harmonic shape image matching to detect locally similar regions between two molecular surfaces augmented with properties such as the electrostatic potential or lipophilicity. The complexity of the problem is reduced by a set of filters that implement various geometric and physicochemical heuristics. The approach was tested on dihydrofolate reductase and thermolysin inhibitors and was shown to recover the correct alignments of the compounds bound in the active sites.
Affiliation(s)
- Christian Hofbauer: Novartis Institutes for BioMedical Research, Brunnerstrasse 59, A-1235 Vienna, Austria
22
Tsuchiya Y, Kinoshita K, Nakamura H. Structure-based prediction of DNA-binding sites on proteins using the empirical preference of electrostatic potential and the shape of molecular surfaces. Proteins 2004; 55:885-94. [PMID: 15146487 DOI: 10.1002/prot.20111] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.7] [Indexed: 11/05/2022]
Abstract
Protein-DNA interactions play an essential role in the genetic activities of life. Many structures of protein-DNA complexes are already known, but the common rules on how and where proteins bind to DNA have not emerged. Many attempts have been made to predict protein-DNA interactions using structural information, but the success rate is still about 80%. We analyzed 63 protein-DNA complexes by focusing our attention on the shape of the molecular surface of the protein and DNA, along with the electrostatic potential on the surface, and constructed a new statistical evaluation function to make predictions of DNA interaction sites on protein molecular surfaces. The shape of the molecular surface was described by a combination of local and global average curvature, which are intended to describe the small convex and concave and the large-scale concave curvatures of the protein surface preferentially appearing at DNA-binding sites. Using these structural features, along with the electrostatic potential obtained by solving the Poisson-Boltzmann equation numerically, we have developed prediction schemes with 86% and 96% accuracy for DNA-binding and non-DNA-binding proteins, respectively.
Affiliation(s)
- Yuko Tsuchiya
- Institute for Protein Research, Osaka University, Osaka, Japan
|
23
|
Yang ZZ, Gong LD, Zhao DX, Zhang MB. Method and algorithm of obtaining the molecular intrinsic characteristic contours (MICCs) of organic molecules. J Comput Chem 2004; 26:35-47. [PMID: 15526323 DOI: 10.1002/jcc.20140] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
The molecular intrinsic characteristic contour (MICC) is defined as the set of all the classical turning points of electron movement in a molecule. Studies on the MICCs of some medium-sized organic molecules, such as dimethylether, acetone, and some homologues of alkanes, alkenes, and alkynes, as well as the electron density distributions on the MICCs, are shown for the first time. Results show that the MICC is an intrinsic approach to the shape and size of a molecule. Unlike the van der Waals hard-sphere model, the MICC is a smooth contour, and it has a clear physical meaning. Detailed investigations on the cross-sections of MICCs have provided important information about how atomic size changes in the process of forming molecules. Studies on electron density distribution on the MICC not only provide a new insight into molecular shape, but also show that the electron density distribution on the boundary surface relates closely to molecular properties and reactivities. For the homologues of alkanes, Rout(H), Dmin, and Dmax (the minimum and maximum of electron density on the MICC) all have very good linear relationships with the negative of the molecular ionization potential. This work may serve as a basis for exploring a new reactivity indicator of chemical reactions and for studying molecular shape properties of large organic and biological molecules.
Affiliation(s)
- Zhong-Zhi Yang
- Department of Chemistry, Liaoning Normal University, Dalian, 116029, People's Republic of China.
|
24
|
Keil M, Exner TE, Brickmann J. Pattern recognition strategies for molecular surfaces: III. Binding site prediction with a neural network. J Comput Chem 2004; 25:779-89. [PMID: 15011250 DOI: 10.1002/jcc.10361] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Indexed: 11/11/2022]
Abstract
An algorithm for the identification of possible binding sites of biomolecules, which are represented as regions of the molecular surface, is introduced. The algorithm is based on the segmentation of the molecular surface into overlapping patches as described in the first article of this series. The properties of these patches (calculated on the basis of physical and chemical properties) are used for the analysis of the molecular surfaces of 7821 proteins and protein complexes. Special attention is drawn to known protein binding sites. A binding site identification algorithm is realized on the basis of the calculated data using a neural network strategy. The neural network is able to classify surface patches as protein-protein, protein-DNA, protein-ligand, or nonbinding sites. To show the capability of the algorithm, results of the surface analysis and the predictions are presented and discussed with representative examples.
Affiliation(s)
- Matthias Keil
- Department of Physical Chemistry, Darmstadt University of Technology, 64287 Darmstadt, Germany
|
25
|
Exner TE, Keil M, Brickmann J. New fuzzy logic strategies for bio-molecular recognition. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2003; 14:421-431. [PMID: 14758985 DOI: 10.1080/10629360310001624006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
The concepts of molecular similarity and molecular complementarity, which play important roles in the broad field of molecular recognition, are chemical problems in which the eyeball technique used by a human observer is very successful but which are very hard to code into a computer algorithm. Based on the model of molecular surfaces, our new approach defines overlapping surface patches with similar molecular properties. These patches are used to represent local features of the molecule in a way that goes beyond atomistic resolution but can nevertheless be applied in partial similarity as well as complementarity analyses in a very general sense. It is shown that this molecular description can be used as the first step in a docking algorithm for complexes where the structures of both molecules are known, as well as for the identification of possible active sites without knowledge of the specific molecules binding to this site.
Affiliation(s)
- T E Exner
- Mathematical Chemistry Research Unit, Department of Chemistry, University of Saskatchewan, 110 Science Place, Saskatoon, SK, Canada S7N 5C9.
|
26
|
Kinoshita K, Nakamura H. Identification of protein biochemical functions by similarity search using the molecular surface database eF-site. Protein Sci 2003; 12:1589-95. [PMID: 12876308 PMCID: PMC2323945 DOI: 10.1110/ps.0368703] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.2] [Indexed: 10/26/2022]
Abstract
The identification of protein biochemical functions based on their three-dimensional structures is strongly required in the post-genome-sequencing era. We have developed a new method to identify and predict protein biochemical functions using the similarity information of molecular surface geometries and electrostatic potentials on the surfaces. Our prediction system consists of a similarity search method based on a clique search algorithm and the molecular surface database eF-site (electrostatic surface of functional-site in proteins). Using this system, functional sites similar to those of phosphoenolpyruvate carboxykinase were detected in several mononucleotide-binding proteins, which have different folds. We also applied our method to a hypothetical protein, MJ0226 from Methanococcus jannaschii, and detected the mononucleotide binding site from the similarity to other proteins having different folds.
Affiliation(s)
- Kengo Kinoshita
- Graduate School of Integrated Science, Yokohama City University, Yokohama 230-0045, Japan.
|
27
|
Abstract
The study of structural genomics and structural proteomics has determined the tertiary structures of many hypothetical proteins, whose molecular functions could not be understood using conventional methods. In order to infer the geometrical location of the functional site, the biochemical function and the biological function of the hypothetical protein, much effort has been made in protein informatics. The importance of heterogeneous databases and various descriptors of amino acid sequences, tertiary structures and pathways on the proteome scale has been emphasised.
Affiliation(s)
- Kengo Kinoshita
- Graduate School of Integrated Science, Yokohama City University, 1-7-29 Suehiro-cho, Turumi-ku, 230-0045, Yokohama, Japan.
|
28
|
Exner TE, Keil M, Brickmann J. Pattern recognition strategies for molecular surfaces. II. Surface complementarity. J Comput Chem 2002; 23:1188-97. [PMID: 12116388 DOI: 10.1002/jcc.10087] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Abstract
Fuzzy logic based algorithms for the quantitative treatment of complementarity of molecular surfaces are presented. Therein, the overlapping surface patches defined in article I of this series are used. The identification of complementary surface patches can be considered as a first step for the solution of molecular docking problems. Standard technologies can then be used for further optimization of the resulting complex structures. The algorithms are applied to 33 biomolecular complexes. After the optimization with a downhill simplex method, for all these complexes one structure was found, which is in very good agreement with the experimental results.
Affiliation(s)
- Thomas E Exner
- Department of Chemistry, Mathematical Chemistry Research Unit, University of Saskatchewan, 110 Science Place, Saskatoon, SK, Canada, S7N 5C9
|