51
Abstract
The shortening of the telomeric DNA sequences at the ends of chromosomes is thought to play a critical role in regulating the lifespan of human cells. Since all dividing cells are subject to the loss of telomeric sequences, cells with long proliferative lifespans need mechanisms to maintain telomere integrity. It appears that the activation of the enzyme telomerase is the major mechanism by which these cells maintain their telomeres. The proposal that a critical step in the process of the malignant transformation of cells is the upregulation of expression of telomerase has made this enzyme a potentially useful prognostic and diagnostic marker for cancer, as well as a new target for therapeutic intervention for the treatment of patients with cancer. It is now clear that simply inhibiting telomerase may not result in the anticancer effects that were originally hypothesized. While telomerase may not be the universal target for cancer therapy, we certainly believe that targeting the telomere maintenance mechanisms will be important in future research aimed toward a successful strategy for curing cancer.
Affiliation(s)
- D J Bearss
- The Arizona Cancer Center, The University of Arizona, Tucson 85724, USA
52
Taraviras SL, Ivanciuc O, Cabrol-Bass D. Identification of groupings of graph theoretical molecular descriptors using a hybrid cluster analysis approach. J Chem Inf Comput Sci 2000; 40:1128-46. [PMID: 11045805 DOI: 10.1021/ci990149y] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
Abstract
Many structural molecular descriptors of various forms have been proposed and tested over the years. Very often different descriptors represent, more or less, the same aspects of molecular structure and, thus, they have diminished discriminating power for identifying the different structural features that might contribute to the molecular property or activity of interest. Therefore, it is essential that noncorrelated descriptors be employed to ensure the widest and least inflated possible coverage of the chemical space. The most common approach for reducing the number of descriptors and employing noncorrelated (or orthogonal) descriptors involves principal component analysis (PCA) or other factor analytical techniques. In this work we present an approach for determining relationships (groupings) among 240 graph-theoretical descriptors, as a means of selecting nonredundant ones, based on the application of cluster analysis (CA). To remove the inherent biases and particularities of different CA algorithms, several clustering solutions obtained with these algorithms were "hybridized" to obtain a reliable overall solution describing how the interrelationships within the data are structured. The calculated correlation coefficients between descriptors were used as a reference for a discussion of the different CA methods employed, and the resulting clusters of descriptors were statistically analyzed to derive the intercorrelations between the different operators, weighting schemes and matrices used to compute these descriptors.
Affiliation(s)
- S L Taraviras
- Arômes, Synthèses et Interactions Lab, University of Nice-Sophia Antipolis, France
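The idea of grouping redundant descriptors by their correlation can be illustrated with a minimal sketch. This is not the hybrid cluster analysis of the paper: it assumes a single, simple rule (descriptors whose absolute Pearson correlation reaches a chosen threshold are merged into one group), and the descriptor names, values, and the 0.9 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a constant descriptor is treated as uncorrelated
    return cov / (norm_x * norm_y)


def group_descriptors(columns, threshold=0.9):
    """Merge descriptors whose |r| >= threshold (union-find, illustrative rule)."""
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical descriptor values for four molecules.
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0],
    "d2": [2.0, 4.1, 6.0, 8.2],  # nearly proportional to d1
    "d3": [4.0, 1.0, 3.0, 2.0],  # weakly related to both
}
groups = group_descriptors(descriptors)
print(groups)
```

One representative per group could then be retained as a nonredundant descriptor; how the representative is chosen is left open here.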
53
Abstract
A number of recent advances have been made in deriving function information from protein structure. A fold relationship to an already characterized protein will often allow general information about function to be deduced. More detailed information can be obtained using sequence relationships to already studied proteins. Methods of deducing function directly from structure, without the use of evolutionary relationships, are developing rapidly. All such methods may be used with models of protein structure, rather than with experimentally determined ones, but model accuracy imposes limitations. The rapid expansion of the structural genomics field has created a new urgency for improved methods of structure-based annotation of function.
Affiliation(s)
- J Moult
- Center for Advanced Research in Biotechnology, University of Maryland, Biotechnology Institute, Rockville, MD 20850, USA.
54
Wintner EA, Moallemi CC. Quantized surface complementarity diversity (QSCD): a model based on small molecule-target complementarity. J Med Chem 2000; 43:1993-2006. [PMID: 10821712 DOI: 10.1021/jm990504b] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
Abstract
A model of molecular diversity is presented. The model, termed "Quantized Surface Complementarity Diversity" (QSCD), defines molecular diversity by measuring molecular complementarity to a fully enumerated set of theoretical target surfaces. Molecular diversity space is defined as the molecular complement to this set of enumerated surfaces. Using a set of known test compounds, the model is shown to be biologically relevant, consistently scoring known actives as similar. At the resolution of the model, which examines molecules "quantized" into 4.24 Å cubic units and treats four points of specific energetic complementarity, the minimum number of compounds needed to fully cover molecular diversity space up to a volume of 1070 cubic Å is estimated to be on the order of 24 million molecules. Most importantly, QSCD allows for individual points in diversity space to be filled by direct modeling of molecular libraries into detailed 3D templates of shape and functionality.
Affiliation(s)
- E A Wintner
- NeoGenesis, Inc., 840 Memorial Drive, 4th Floor, Cambridge, Massachusetts 02139, USA.
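To make the resolution figures in this abstract concrete, the short calculation below relates the 4.24 Å cube edge to the 1070 cubic Å volume limit. It is only arithmetic on the two numbers quoted above; reading the limit as a whole number of cubes is an interpretation, not something the abstract states.

```python
# Values quoted in the abstract.
CUBE_EDGE_ANGSTROM = 4.24
MAX_VOLUME_CUBIC_ANGSTROM = 1070.0

# Volume of one quantization cube and the number of such cubes that fit
# within the stated volume limit.
cube_volume = CUBE_EDGE_ANGSTROM ** 3
cubes_at_max_volume = MAX_VOLUME_CUBIC_ANGSTROM / cube_volume

print(f"{cube_volume:.1f} cubic angstroms per cube")
print(f"{cubes_at_max_volume:.2f} cubes at the volume limit")
```

Under this reading, the volume limit corresponds to molecules occupying roughly 14 quantized cubes.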
55
Abstract
In addition to the familiar duplex DNA, certain DNA sequences can fold into secondary structures that are four-stranded; because they are made up of guanine (G) bases, such structures are called G-quadruplexes. Considerable circumstantial evidence suggests that these structures can exist in vivo in specific regions of the genome including the telomeric ends of chromosomes and oncogene regulatory regions. Recent studies have demonstrated that small molecules can facilitate the formation of, and stabilize, G-quadruplexes. The possible role of G-quadruplex-interactive compounds as pharmacologically important molecules is explored in this article.
Affiliation(s)
- H Han
- Arizona Cancer Center, Tucson, AZ 85724, USA.
56
Skolnick J, Fetrow JS, Kolinski A. Structural genomics and its importance for gene function analysis. Nat Biotechnol 2000; 18:283-7. [PMID: 10700142 DOI: 10.1038/73723] [Citation(s) in RCA: 161] [Impact Index Per Article: 6.7] [Indexed: 11/08/2022]
Abstract
Structural genomics projects aim to solve the experimental structures of all possible protein folds. Such projects entail a conceptual shift from traditional structural biology in which structural information is obtained on known proteins to one in which the structure of a protein is determined first and the function assigned only later. Whereas the goal of converting protein structure into function can be accomplished by traditional sequence motif-based approaches, recent studies have shown that assignment of a protein's biochemical function can also be achieved by scanning its structure for a match to the geometry and chemical identity of a known active site. Importantly, this approach can use low-resolution structures provided by contemporary structure prediction methods. When applied to genomes, structural information (either experimental or predicted) is likely to play an important role in high-throughput function assignment.
Affiliation(s)
- J Skolnick
- Laboratory of Computational Genomics, The Danforth Plant Science Center, 893 N. Warson Rd., St. Louis, MO 63141, USA.
57
Flexsim-X: a method for the detection of molecules with similar biological activity. J Chem Inf Comput Sci 2000; 40:246-53. [PMID: 10761125 DOI: 10.1021/ci990439e] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Abstract
We describe the development of the method Flexsim-X, which can be used to detect molecules with similar biological activity. This procedure is based on comparing virtual affinity fingerprints made up from docking scores of the molecules with respect to a reference set of binding sites. Using a test data set consisting of ligands from five different activity classes and randomly chosen compounds, the reference panel of binding sites was optimized in terms of size and composition. Systematic approaches as well as genetic algorithm based (GA) optimization procedures have been evaluated. Additionally, the effectiveness of the method is illustrated.
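Comparing virtual affinity fingerprints can be sketched as follows. The abstract does not say which similarity measure Flexsim-X uses, so Euclidean distance between score vectors is an assumption here, and the docking scores, the four-site panel, and the compound names are invented for illustration.

```python
from math import dist


def rank_by_fingerprint(query, library):
    """Return library names ordered from most to least similar to the query.

    Each fingerprint is a list of docking scores against the same reference
    panel of binding sites; a smaller Euclidean distance means more similar
    (the distance measure is an assumption, not taken from the paper).
    """
    return sorted(library, key=lambda name: dist(query, library[name]))


# Hypothetical docking scores against a four-site reference panel.
query = [-7.1, -3.2, -5.0, -8.4]
library = {
    "analogue": [-7.0, -3.5, -4.8, -8.1],
    "unrelated": [-2.0, -6.9, -7.5, -3.3],
    "partial": [-6.0, -4.0, -6.0, -6.5],
}
ranking = rank_by_fingerprint(query, library)
print(ranking)
```

Compounds near the top of the ranking would be the candidates predicted to share the query's activity class.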
58
Jain AN. Morphological similarity: a 3D molecular similarity method correlated with protein-ligand recognition. J Comput Aided Mol Des 2000; 14:199-213. [PMID: 10721506 DOI: 10.1023/a:1008100132405] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022]
Abstract
Recognition of small molecules by proteins depends on three-dimensional molecular surface complementarity. However, the dominant techniques for analyzing the similarity of small molecules are based on two-dimensional chemical structure, with such techniques often outperforming three-dimensional techniques in side-by-side comparisons of correlation to biological activity. This paper introduces a new molecular similarity method, termed morphological similarity (MS), that addresses the apparent paradox. Two sets of molecule pairs are identified from a set of ligands whose protein-bound states are known crystallographically. Pairs that bind the same protein sites form the first set, and pairs that bind different sites form the second. MS is shown to separate the two sets significantly better than a benchmark 2D similarity technique. Further, MS agrees with crystallographic observation of bound ligand states, independent of information about bound states. MS is efficient to compute and can be practically applied to large libraries of compounds.
Affiliation(s)
- A N Jain
- UCSF Cancer Center, San Francisco, CA 94143-0128, USA.
59
Abstract
Two major advances have been made in the computational perception and utilization of pharmacophores in compound libraries, both real and virtual. Firstly, a hierarchical set of filtering calculations has emerged that can be used to efficiently partition a library into a trial set of pharmacophores. This sequential filtering permits large libraries to be efficiently processed, as well as compounds judged as 'hits' to be analyzed in great detail. Secondly, new and extended methods of QSAR (quantitative structure-activity relationship) analysis have evolved to translate pharmacophore information into QSAR models that, in turn, can be used as virtual high-throughput screens for activity profiling of a library.
Affiliation(s)
- A J Hopfinger
- (M/C-781), Laboratory of Molecular Modeling and Design, The University of Illinois at Chicago, College of Pharmacy, Chicago, IL 60612-7231, USA.
60
Abstract
We continue our study of the common features present in drug molecules by looking in detail at drug side chains. Using shape description methods, we divide a database of commercially available drugs into a list of common drug side chains. On the basis of the atom pair shape descriptor (taking into account atom type, hybridization, and bond order), there are 1,246 different side chains among the 5,090 compounds analyzed. The average number of side chains per molecule is 4, and the average number of heavy atoms per side chain is 2. If we ignore the carbonyl side chain, then there are approximately 15,000 occurrences of side chains. Of these 15,000 approximately 11,000 are from the "top 20" group of side chains. This suggests that the diversity that side chains provide to drug molecules is quite low. We discuss ways that this work could be used to provide guidance for molecular design efforts.
Affiliation(s)
- G W Bemis
- Vertex Pharmaceuticals, 130 Waverly Street, Cambridge, Massachusetts 02139-4242, USA
61
Schneider G, Neidhart W, Giller T, Schmid G. "Scaffold-hopping" by topological pharmacophore search: a contribution to virtual screening [original German title: „Grundgerüstwechsel” (Scaffold-Hopping) durch topologische Pharmakophorsuche: ein Beitrag zum virtuellen Screening]. Angew Chem Int Ed Engl 1999. [DOI: 10.1002/(sici)1521-3757(19991004)111:19<3068::aid-ange3068>3.0.co;2-0] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.0] [Indexed: 11/06/2022]
62
Ghuloum AM, Sage CR, Jain AN. Molecular hashkeys: a novel method for molecular characterization and its application for predicting important pharmaceutical properties of molecules. J Med Chem 1999; 42:1739-48. [PMID: 10346926 DOI: 10.1021/jm980527a] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
We define a novel numerical molecular representation, called the molecular hashkey, that captures sufficient information about a molecule to predict pharmaceutically interesting properties directly from three-dimensional molecular structure. The molecular hashkey represents molecular surface properties as a linear array of pairwise surface-based comparisons of the target molecule against a common 'basis-set' of molecules. Hashkey-measured molecular similarity correlates well with direct methods of measuring molecular surface similarity. Using a simple machine-learning technique with the molecular hashkeys, we show that it is possible to accurately predict the octanol-water partition coefficient, log P. Using more sophisticated learning techniques, we show that an accurate model of intestinal absorption for a set of drugs can be constructed using the same hashkeys used in the aforementioned experiments. Once a set of molecular hashkeys is calculated, its use in the training and testing of property-based models is very fast. Further, the required amount of data for model construction is very small. Neural network-based hashkey models trained on data sets as small as 30 molecules yield statistically significant prediction of molecular properties. The lack of a requirement for large data sets lends itself well to the prediction of pharmaceutically relevant molecular parameters for which data generation is expensive and slow. Molecular hashkeys coupled with machine-learning techniques can yield models that predict key pharmacological aspects of biologically important molecules and should therefore be important in the design of effective therapeutics.
Affiliation(s)
- A M Ghuloum
- MetaXen, 280 East Grand Avenue, South San Francisco, California 94080, USA
63
Mount J, Ruppert J, Welch W, Jain AN. IcePick: a flexible surface-based system for molecular diversity. J Med Chem 1999; 42:60-6. [PMID: 9888833 DOI: 10.1021/jm970775r] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
IcePick is a system for computationally selecting diverse sets of molecules. It computes the dissimilarity of the surface-accessible features of two molecules, taking into account conformational flexibility. Then, the intrinsic diversity of an entire set of molecules is calculated from a spanning tree over the pairwise dissimilarities. IcePick's dissimilarity measure is compared against traditional 2D topological approaches, and the spanning tree diversity measure is compared against commonly used variance techniques. The method has proven easy to implement and is fast enough to be used in selection of reactants for numerous production-sized combinatorial libraries.
Affiliation(s)
- J Mount
- Axys Pharmaceuticals, 180 Kimball Way, South San Francisco, California 94080, USA
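A spanning-tree diversity measure of the kind mentioned in this abstract can be sketched as below. Two assumptions are made that the abstract does not confirm: the tree is a minimum spanning tree, and diversity is its total edge weight. Plain Euclidean distance between made-up 2D feature points stands in for IcePick's surface-based, flexibility-aware dissimilarity.

```python
from math import dist, inf


def spanning_tree_diversity(points):
    """Total edge weight of a minimum spanning tree (Prim's algorithm).

    `points` are hypothetical feature vectors; Euclidean distance is a
    stand-in for a real pairwise molecular dissimilarity.
    """
    n = len(points)
    if n < 2:
        return 0.0
    best = [inf] * n      # cheapest known edge linking each point to the tree
    best[0] = 0.0         # start the tree at the first point
    in_tree = [False] * n
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        i = min((j for j in range(n) if not in_tree[j]), key=best.__getitem__)
        in_tree[i] = True
        total += best[i]
        # Update attachment costs for the points still outside the tree.
        for j in range(n):
            if not in_tree[j]:
                best[j] = min(best[j], dist(points[i], points[j]))
    return total


clustered = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
spread = [(0.0, 0.0), (0.0, 3.0), (4.0, 0.0)]
print(spanning_tree_diversity(clustered), spanning_tree_diversity(spread))
```

The more widely spread set receives the larger score, which is the behaviour one would expect from a set-level diversity measure.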
64
Abstract
If no structural information about a particular target protein is available, methods of rational drug design try to superimpose putative ligands with a given reference, e.g., an endogenous ligand. The goal of such structural alignments is, on the one hand, to approximate the binding geometry and, on the other hand, to provide a relative ranking of the ligands with respect to their similarity. An accurate superposition is the prerequisite of subsequent exploitation of ligand data by either 3D QSAR analyses, pharmacophore hypotheses, or receptor modeling. We present the automatic method FLEXS for structurally superimposing pairs of ligands, approximating their putative binding site geometry. One of the ligands is treated as flexible, while the other one, used as a reference, is kept rigid. FLEXS is an incremental construction procedure. The molecules to be superimposed are partitioned into fragments. Starting with placements of a selected anchor fragment, computed by two alternative approaches, the remaining fragments are added iteratively. At each step, flexibility is considered by allowing the respective added fragment to adopt a discrete set of conformations. The mean computing time per test case is about 1:30 min on a common-day workstation. FLEXS is fast enough to be used as a tool for virtual ligand screening. A database of typical drug molecules has been screened for potential fibrinogen receptor antagonists. FLEXS is capable of retrieving all ligands assigned to platelet aggregation properties among the first 20 hits. Furthermore, the program suggests additional interesting candidates, likely to be active at the same receptor. FLEXS proves to be superior to commonly used retrieval techniques based on 2D fingerprint similarities. The accuracy of computed superpositions determines the relevance of subsequently performed ligand analyses. 
To validate the quality of FLEXS alignments, we attempted to reproduce a set of 284 mutual superpositions derived from experimental data on 76 protein-ligand complexes of 14 proteins. The ligands considered cover the whole range of drug-size molecules from 18 to 158 atoms (PDB codes: 3ptb, 2er7). The performance of the algorithm critically depends on the sizes of the molecules to be superimposed. The limitations are clearly demonstrated with large peptidic inhibitors in the HIV and the endothiapepsin data sets. Problems also occur in the presence of multiple binding modes (e.g., elastase and human rhinovirus). The most convincing results are achieved with small- and medium-sized molecules (e.g., the ligands of trypsin, thrombin, and dihydrofolate reductase). In more than half of the entire test set, we achieve rms deviations between computed and observed alignments of below 1.5 Å. This underlines the reliability of FLEXS-generated alignments.
Affiliation(s)
- C Lemmen
- Institute for Algorithms and Scientific Computing (SCAI), German National Research Center for Information Technology (GMD), Schloss Birlinghoven, 53754 Sankt Augustin, Germany.
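The rms deviation used above as the quality criterion can be computed as in the following sketch. It assumes that both ligand poses are already expressed in the same coordinate frame with atoms listed in the same order; the abstract does not describe the exact evaluation protocol, and the coordinates below are invented.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of 3D coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(coords_a))


# Hypothetical computed and observed ligand poses (coordinates in angstroms).
computed = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
observed = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

deviation = rmsd(computed, observed)
print(f"{deviation:.2f} angstroms, within 1.5: {deviation < 1.5}")
```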
65
Kauvar LM, Villar HO, Sportsman JR, Higgins DL, Schmidt DE. Protein affinity map of chemical space. J Chromatogr B Biomed Sci Appl 1998; 715:93-102. [PMID: 9792501 DOI: 10.1016/s0378-4347(98)00045-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 11/15/2022]
Abstract
Affinity fingerprinting is a quantitative method for mapping chemical space based on binding preferences of compounds for a reference panel of proteins. An effective reference panel of <20 proteins can be empirically selected which shows differential interaction with nearly all compounds. By using this map to iteratively sample the chemical space, identification of active ligands from a library of 30,000 candidate compounds has been accomplished for a wide spectrum of specific protein targets. In each case, <200 compounds were directly assayed against the target. Further, analysis of the fingerprint database suggests a strategy for effective selection of affinity chromatography ligands and scaffolds for combinatorial chemistry. With such a system, the large numbers of potential therapeutic targets emerging from genome research can be categorized according to ligand binding properties, complementing sequence based classification.
Affiliation(s)
- L M Kauvar
- Terrapin Technologies, Inc., San Francisco, CA 94080, USA
66
Rarey M, Dixon JS. Feature trees: a new molecular similarity measure based on tree matching. J Comput Aided Mol Des 1998; 12:471-90. [PMID: 9834908 DOI: 10.1023/a:1008068904628] [Citation(s) in RCA: 207] [Impact Index Per Article: 8.0] [Indexed: 11/12/2022]
Abstract
In this paper we present a new method for evaluating molecular similarity between small organic compounds. Instead of a linear representation like fingerprints, a more complex description, a feature tree, is calculated for a molecule. A feature tree represents hydrophobic fragments and functional groups of the molecule and the way these groups are linked together. Each node in the tree is labeled with a set of features representing chemical properties of the part of the molecule corresponding to the node. The comparison of feature trees is based on matching subtrees of two feature trees onto each other. Two algorithms for tackling the matching problem are described throughout this paper. On a dataset of about 1000 molecules, we demonstrate the ability of our approach to identify molecules belonging to the same class of inhibitors. With a second dataset of 58 molecules with known binding modes taken from the Brookhaven Protein Data Bank, we show that the matchings produced by our algorithms are compatible with the relative orientation of the molecules in the active site in 61% of the test cases. The average computation time for a pair comparison is about 50 ms on a current workstation.
Affiliation(s)
- M Rarey
- German National Research Center for Information Technology (GMD), Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany.
67
Salo JP, Yliniemelä A, Taskinen J. Parameter Refinement for Molecular Docking. J Chem Inf Comput Sci 1998. [DOI: 10.1021/ci9801825] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Affiliation(s)
- Jukka-Pekka Salo
- Division of Pharmaceutical Chemistry, Department of Pharmacy, P.O. Box 56 (Viikinkaari 5), FIN-00014 University of Helsinki, Finland
- Ari Yliniemelä
- Division of Pharmaceutical Chemistry, Department of Pharmacy, P.O. Box 56 (Viikinkaari 5), FIN-00014 University of Helsinki, Finland
- Jyrki Taskinen
- Division of Pharmaceutical Chemistry, Department of Pharmacy, P.O. Box 56 (Viikinkaari 5), FIN-00014 University of Helsinki, Finland
68
Abstract
Rapid expansion in the number of plausible drug targets arising from genomics research has created new pressures for increased efficiency in discovery of high specificity candidate drug compounds. Improved understanding of conserved features among protein structures provides a promising route to achieving this goal. Indirect evidence implies that important similarities are now ripe for elucidation by emerging experimental approaches.
Affiliation(s)
- LM Kauvar
- Terrapin Technologies, Inc., 750 Gateway Blvd, South San Francisco, CA 94080, USA
69
Abstract
Molecular diversity, combinatorial chemistry and automated synthesis are helping usher in a new age in medicinal chemistry. The tools and practices of computational chemistry and molecular modeling are rising to the challenges and opportunities presented by the current trends in drug discovery and design. Recent advances include a number of new and meaningful measures of molecular diversity and the use of genetic algorithms to help design diverse libraries.
Affiliation(s)
- M G Bures
- Abbott Laboratories, Abbott Park, IL 60064-3500, USA.
70
Menard PR, Lewis RA, Mason JS. Rational Screening Set Design and Compound Selection: Cascaded Clustering. J Chem Inf Comput Sci 1998. [DOI: 10.1021/ci980003j] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Affiliation(s)
- Paul R. Menard
- Computer-Assisted Drug Design, New Lead Generation, Rhône-Poulenc Rorer, Collegeville, Pennsylvania 19426 and Dagenham, RM10 7XS, UK
- Richard A. Lewis
- Computer-Assisted Drug Design, New Lead Generation, Rhône-Poulenc Rorer, Collegeville, Pennsylvania 19426 and Dagenham, RM10 7XS, UK
- Jonathan S. Mason
- Computer-Assisted Drug Design, New Lead Generation, Rhône-Poulenc Rorer, Collegeville, Pennsylvania 19426 and Dagenham, RM10 7XS, UK