1
Yi J, Lee S, Lim S, Cho C, Piao Y, Yeo M, Kim D, Kim S, Lee S. Exploring chemical space for lead identification by propagating on chemical similarity network. Comput Struct Biotechnol J 2023; 21:4187-4195. [PMID: 37680266 PMCID: PMC10480321 DOI: 10.1016/j.csbj.2023.08.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 08/08/2023] [Accepted: 08/20/2023] [Indexed: 09/09/2023] Open
Abstract
Motivation: Lead identification is a fundamental step in prioritizing candidate compounds for the downstream drug discovery process. Machine learning (ML) and deep learning (DL) approaches are widely used to identify lead compounds using both chemical property and experimental information. However, ML and DL methods rarely consider compound similarity information directly, since they build models on abstract representations of molecules. Alternatively, data mining approaches are used to explore chemical space for drug candidates by screening out undesirable compounds. A major challenge for data mining approaches is to search a large chemical space efficiently for desirable lead compounds with a low false positive rate. Results: In this work, we developed a network propagation (NP) based data mining method for lead identification that searches an ensemble of chemical similarity networks. We compiled 14 fingerprint-based similarity networks. Given a target protein of interest, we use a deep learning-based drug-target interaction model to narrow down compound candidates, and then use network propagation to prioritize drug candidates that are highly correlated with a drug activity score such as IC50. In an extensive experiment with BindingDB, we showed that our approach successfully discovered intentionally unlabeled compounds for given targets. To further demonstrate the predictive power of our approach, we identified 24 candidate leads for CLK1. Two out of five synthesizable candidates were experimentally validated in binding assays. In conclusion, our framework can be very useful for lead identification from very large compound databases such as ZINC.
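The abstract does not say which propagation scheme is used, so the following is only an illustrative sketch of the general idea, assuming a random walk with restart on a single weighted similarity network. The function name `propagate`, the toy network, the edge weights, and the restart probability are all hypothetical; the paper's ensemble of 14 networks and its drug-target interaction filter are not modelled here.

```python
def propagate(adjacency, seeds, restart=0.5, iterations=100, tol=1e-9):
    """Random walk with restart on a weighted, undirected similarity network.

    adjacency: dict mapping node -> {neighbour: similarity weight}
    seeds:     set of nodes with known activity (where the walk restarts)
    Returns a dict mapping node -> propagated score.
    """
    nodes = sorted(adjacency)
    # Total edge weight leaving each node, used to normalise transitions.
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}
    # Initial distribution: all mass spread uniformly over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iterations):
        nxt = {}
        for n in nodes:
            # Score flowing into n from each neighbour m, in proportion
            # to the share of m's total edge weight that points at n.
            incoming = sum(
                p[m] * w / out_weight[m]
                for m, w in adjacency[n].items()
                if out_weight[m] > 0
            )
            nxt[n] = (1 - restart) * incoming + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:  # stop once the scores have stabilised
            break
    return p


# Hypothetical toy network: A is a known active, B-D are unlabeled compounds.
network = {
    "A": {"B": 0.8},
    "B": {"A": 0.8, "C": 0.6},
    "C": {"B": 0.6, "D": 0.2},
    "D": {"C": 0.2},
}
scores = propagate(network, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy chain the unlabeled compounds are ranked by how strongly they are connected, through similarity edges, to the known active, which is the intuition behind prioritizing candidates by propagation.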
Affiliation(s)
- Jungseob Yi
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
- Sangseon Lee
  - Institute of Computer Technology, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
- Sangsoo Lim
  - School of AI Software Convergence, Dongguk University, Pildong-ro 1-gil, Jung-gu, Seoul, South Korea
- Changyun Cho
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
- Yinhua Piao
  - Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
- Marie Yeo
  - PHARMGENSCIENCE CO., LTD., 216, Dongjak-daero, Seocho-gu, Seoul, 06554, South Korea
- Dongkyu Kim
  - PHARMGENSCIENCE CO., LTD., 216, Dongjak-daero, Seocho-gu, Seoul, 06554, South Korea
- Sun Kim
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
  - Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
  - AIGENDRUG CO., LTD., Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
- Sunho Lee
  - AIGENDRUG CO., LTD., Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea
2
Luo L, Yang J, Wang C, Wu J, Li Y, Zhang X, Li H, Zhang H, Zhou Y, Lu A, Chen S. Natural products for infectious microbes and diseases: an overview of sources, compounds, and chemical diversities. Science China Life Sciences 2022; 65:1123-1145. [PMID: 34705221 PMCID: PMC8548270 DOI: 10.1007/s11427-020-1959-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/21/2021] [Accepted: 07/27/2021] [Indexed: 12/13/2022]
Abstract
As coronavirus disease 2019 (COVID-19) threatens human health globally, infectious disorders have become one of the most challenging problems for the medical community. Natural products (NP) have been a prolific source of antimicrobial agents with widely divergent structures and a vast range of biological activities. A dataset comprising 618 articles, including 646 NP-based compounds from 672 species of natural sources with biological activities against 21 infectious pathogens from five categories, was assembled through manual selection of published articles. These data were used to identify 268 NP-based compounds classified into ten groups, which were used for network pharmacology analysis to capture the most promising lead compounds such as agelasine D, dicumarol, dihydroartemisinin and pyridomycin. The distribution of maximum Tanimoto scores indicated that compounds which inhibited parasites exhibited low diversity, whereas the chemistries inhibiting bacteria, fungi, and viruses showed more structural diversity. A total of 331 species of medicinal plants with compounds exhibiting antimicrobial activities were selected to classify the family sources. The family Asteraceae possesses various compounds against C. neoformans, the family Anacardiaceae has compounds against Salmonella typhi, the family Cucurbitaceae against the human immunodeficiency virus (HIV), and the family Ancistrocladaceae against Plasmodium. This review summarizes currently available data on NP-based antimicrobials against refractory infections to provide information for further discovery of drugs and synthetic strategies for anti-infectious agents.
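To make the "maximum Tanimoto score" idea concrete, here is a minimal sketch that assumes each compound is represented by the set of on-bit indices of a binary fingerprint. The abstract does not state which fingerprints or tooling were used, so the helper names (`tanimoto`, `max_tanimoto`) and the toy fingerprints are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of fingerprint on-bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_tanimoto(fingerprints):
    """For each compound, the highest similarity to any other compound."""
    result = {}
    for name, fp in fingerprints.items():
        others = [
            tanimoto(fp, other_fp)
            for other_name, other_fp in fingerprints.items()
            if other_name != name
        ]
        result[name] = max(others) if others else 0.0
    return result


# Hypothetical toy fingerprints (on-bit indices), not real compounds.
toy = {
    "x": {1, 2, 3, 4},
    "y": {1, 2, 3, 5},
    "z": {7, 8, 9},
}
nearest = max_tanimoto(toy)
```

A set of compounds whose maximum scores cluster near 1 contains many close analogues (low diversity), while scores spread toward 0 indicate structurally distinct chemistries.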
Affiliation(s)
- Lu Luo
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Jun Yang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Cheng Wang
  - Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100006, China
- Jie Wu
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Yafang Li
  - Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China
- Xu Zhang
  - weMED Health, Houston, 77054, USA
- Hui Li
  - Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Hui Zhang
  - Akupunktur Akademiet, Aabyhoej, Aarhus, 8230, Denmark
- Yumei Zhou
  - The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Guangzhou, 518033, China
- Aiping Lu
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, 999077, China
- Shilin Chen
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China.
3
Rodríguez-Pérez R, Miljković F, Bajorath J. Machine Learning in Chemoinformatics and Medicinal Chemistry. Annu Rev Biomed Data Sci 2022; 5:43-65. [PMID: 35440144 DOI: 10.1146/annurev-biodatasci-122120-124216] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
In chemoinformatics and medicinal chemistry, machine learning has evolved into an important approach. In recent years, increasing computational resources and new deep learning algorithms have put machine learning onto a new level, addressing previously unmet challenges in pharmaceutical research. In silico approaches for compound activity predictions, de novo design, and reaction modeling have been further advanced by new algorithmic developments and the emergence of big data in the field. Herein, novel applications of machine learning and deep learning in chemoinformatics and medicinal chemistry are reviewed. Opportunities and challenges for new methods and applications are discussed, placing emphasis on proper baseline comparisons, robust validation methodologies, and new applicability domains.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany. Current affiliation: Novartis Institutes for Biomedical Research, Novartis Campus, Basel, Switzerland
- Filip Miljković
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany. Current affiliation: Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D AstraZeneca, Gothenburg, Sweden
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
4
Agamah FE, Mazandu GK, Hassan R, Bope CD, Thomford NE, Ghansah A, Chimusa ER. Computational/in silico methods in drug target and lead prediction. Brief Bioinform 2020; 21:1663-1675. [PMID: 31711157 PMCID: PMC7673338 DOI: 10.1093/bib/bbz103] [Citation(s) in RCA: 109] [Impact Index Per Article: 27.3] [Received: 05/21/2019] [Revised: 07/17/2019] [Accepted: 07/18/2019] [Indexed: 01/10/2023] Open
Abstract
Drug-like compounds are often denied approval and use owing to unexpected clinical side effects and cross-reactivity observed during clinical trials. These unexpected outcomes, which significantly increase the attrition rate, center on the selected drug targets. These targets may be disease candidate proteins or genes, biological pathways, disease-associated microRNAs, disease-related biomarkers, abnormal molecular phenotypes, crucial nodes of biological networks or molecular functions. Such outcomes are generally linked to several factors, including incomplete knowledge of the drug targets and unpredicted pharmacokinetic expressions upon target interaction or off-target effects. A method for identifying targets, especially for polygenic diseases, is essential, and target identification constitutes a major bottleneck in drug development, with the fundamental stage being the identification and validation of drug targets of interest for further downstream processes. Thus, various computational methods have been developed to complement experimental approaches in drug discovery. Here, we present an overview of various computational methods and tools applied in predicting or validating drug targets and drug-like molecules. We provide an overview of their advantages and compare these methods to identify effective methods that are likely to lead to optimal results. We also explore major sources of drug failure, considering the challenges and opportunities involved. This review might guide researchers in selecting the most efficient approach or technique during the computational drug discovery process.
Affiliation(s)
- Francis E Agamah
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- Gaston K Mazandu
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
  - African Institute for Mathematical Sciences, Muizenberg, Cape Town 7945, South Africa
- Radia Hassan
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- Christian D Bope
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
  - Faculty of Sciences, University of Kinshasa, Kinshasa, Democratic Republic of Congo
- Nicholas E Thomford
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
  - School of Medical Sciences, University of Cape Coast, PMB, Cape Coast, Ghana
- Anita Ghansah
  - Noguchi Memorial Institute for Medical Research, College of Health Sciences, University of Ghana, PO Box LG 581, Legon, Ghana
- Emile R Chimusa
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
5
Hussain W, Rasool N, Khan YD. Insights into Machine Learning-based Approaches for Virtual Screening in Drug Discovery: Existing Strategies and Streamlining Through FP-CADD. Curr Drug Discov Technol 2020; 18:463-472. [PMID: 32767944 DOI: 10.2174/1570163817666200806165934] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/03/2020] [Revised: 07/01/2020] [Accepted: 07/03/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Machine learning is an active area of research in computer science, driven by the availability of big data collections of all sorts, which has prompted interest in the development of novel tools for data mining. Machine learning methods have wide applications in computer-aided drug discovery. Many machine learning approaches are used in drug design, where they further aid biological modelling in drug discovery. There are two main categories, Ligand-Based Virtual Screening (LBVS) and Structure-Based Virtual Screening (SBVS); however, machine learning approaches fall mostly in the category of LBVS. OBJECTIVES This study describes the major machine learning approaches used in LBVS. Moreover, we introduce a protocol named FP-CADD, the four protocols of computer-aided drug discovery, which sets out a four-step rule of thumb for drug discovery. Various important aspects of FP-CADD, along with a SWOT analysis, are also discussed in this article. CONCLUSION Through this study, we have observed that among LBVS algorithms, Support Vector Machines (SVM) and Random Forest (RF) are the most widely used owing to their high accuracy and efficiency. These virtual screening approaches have the potential to revolutionize the field of drug design. We also believe that the process flow presented in this study, named FP-CADD, can streamline the whole process of computer-aided drug discovery. By adopting this rule, studies related to drug discovery can be made homogeneous, and this protocol can also be considered as an evaluation criterion in the peer-review process of research articles.
Affiliation(s)
- Yaser Daanial Khan
  - Department of Computer Science, University of Management and Technology, Lahore, Pakistan
6
Lavecchia A. Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 2014; 20:318-31. [PMID: 25448759 DOI: 10.1016/j.drudis.2014.10.012] [Citation(s) in RCA: 358] [Impact Index Per Article: 35.8] [Received: 07/18/2014] [Revised: 09/27/2014] [Accepted: 10/24/2014] [Indexed: 12/19/2022]
Abstract
During the past decade, virtual screening (VS) has evolved from traditional similarity searching, which utilizes single reference compounds, into an advanced application domain for data mining and machine-learning approaches, which require large and representative training-set compounds to learn robust decision rules. The explosive growth in the amount of public domain-available chemical and biological data has generated huge effort to design, analyze, and apply novel learning methodologies. Here, I focus on machine-learning techniques within the context of ligand-based VS (LBVS). In addition, I analyze several relevant VS studies from recent publications, providing a detailed view of the current state-of-the-art in this field and highlighting not only the problematic issues, but also the successes and opportunities for further advances.
Affiliation(s)
- Antonio Lavecchia
  - Department of Pharmacy, Drug Discovery Laboratory, University of Napoli 'Federico II', via D. Montesano 49, I-80131 Napoli, Italy.
7
Antimicrobial peptides design by evolutionary multiobjective optimization. PLoS Comput Biol 2013; 9:e1003212. [PMID: 24039565 PMCID: PMC3764005 DOI: 10.1371/journal.pcbi.1003212] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Received: 03/28/2013] [Accepted: 07/23/2013] [Indexed: 02/03/2023] Open
Abstract
Antimicrobial peptides (AMPs) are an abundant and wide class of molecules produced by many tissues and cell types in a variety of mammals, plant and animal species. Linear alpha-helical antimicrobial peptides are among the most widespread membrane-disruptive AMPs in nature, representing a particularly successful structural arrangement in innate defense. Recently, AMPs have received increasing attention as potential therapeutic agents, owing to their broad activity spectrum and their reduced tendency to induce resistance. The introduction of non-natural amino acids will be a key requisite in order to counter host resistance and increase the compound's life. In this work, the possibility to design novel AMP sequences with non-natural amino acids was achieved through a flexible computational approach, based on chemophysical profiles of peptide sequences. Quantitative structure-activity relationship (QSAR) descriptors were employed to code each peptide and train two statistical models in order to account for structural and functional properties of alpha-helical amphipathic AMPs. These models were then used as fitness functions for a multi-objective evolutionary algorithm, together with a set of constraints for the design of a series of candidate AMPs. Two ab-initio natural peptides were synthesized and experimentally validated for antimicrobial activity, together with a series of control peptides. Furthermore, a well-known Cecropin-Melittin alpha-helical antimicrobial hybrid (CM18) was optimized by shortening its amino acid sequence while maintaining its activity, and a peptide with non-natural amino acids was designed and tested, demonstrating the higher activity achievable with artificial residues.
In recent years, the increasing and rapid spread of pathogenic microorganisms resistant to conventional antibiotics, especially in hospital settings, has spurred research into the identification of novel molecules endowed with antimicrobial activities and new mechanisms of action. Antimicrobial peptides (AMPs) have received increasing attention as potential therapeutic agents because of their wide spectrum of activity and low rate of inducing bacterial resistance. Currently, research is focused on the design and optimization of novel AMPs to improve their antimicrobial activity, minimize cytotoxicity and reduce proteolytic degradation, also in biological fluids. To this end, the introduction of non-natural amino acids will be a key requisite in order to counter host resistance and increase the compound's life. However, extending the amino acid alphabet to non-natural elements makes a systematic approach to AMP design unfeasible. A rational in-silico approach can drastically reduce the number of compounds to be tested and consequently the production costs and the time required for evaluation of activity and toxicity. In this article, in-silico design of AMPs with non-natural amino acids was performed and a series of candidates were tested in order to demonstrate the potential of this approach.
8
State-of-the-art and dissemination of computational tools for drug-design purposes: a survey among Italian academics and industrial institutions. Future Med Chem 2013; 5:907-27. [PMID: 23682568 DOI: 10.4155/fmc.13.59] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022] Open
Abstract
During the first edition of the Computationally Driven Drug Discovery meeting, held in November 2011 at Dompé Pharma (L'Aquila, Italy), a questionnaire regarding the diffusion and the use of computational tools for drug-design purposes in both academia and industry was distributed among all participants. This is a follow-up of a previously reported investigation carried out among a few companies in 2007. The new questionnaire implemented five sections dedicated to: research group identification and classification; 18 different computational techniques; software information; hardware data; and economical business considerations. In this article, together with a detailed history of the different computational methods, a statistical analysis of the survey results that enabled the identification of the prevalent computational techniques adopted in drug-design projects is reported and a profile of the computational medicinal chemist currently working in academia and pharmaceutical companies in Italy is highlighted.
9
Downs GM, Barnard JM. Chemical patent information systems. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.41] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/07/2022]
10
Fjell CD, Jenssen H, Cheung WA, Hancock REW, Cherkasov A. Optimization of antibacterial peptides by genetic algorithms and cheminformatics. Chem Biol Drug Des 2010; 77:48-56. [PMID: 20942839 DOI: 10.1111/j.1747-0285.2010.01044.x] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 02/04/2023]
Abstract
Pathogens resistant to available drug therapies are a pressing global health problem. Short, cationic peptides represent a novel class of agents that have lower rates of drug resistance than derivatives of current antibiotics. Previously, we created a software system utilizing artificial neural networks that were trained on quantitative structure-activity relationship descriptors calculated for a total of 1400 synthetic peptides for which antibacterial activity was determined. Using the trained system, we correctly identified additional active peptides with 94% accuracy; active peptides were 47 of the top-rated 50 peptides chosen from an in silico library of nearly 100,000 sequences. Here, we report a method of generating candidate peptide sequences using the heuristic evolutionary programming method of genetic algorithms (GA), which provided a large (19-fold) improvement in identification of novel antibacterial peptides. Approximately 0.50% of peptides evaluated during the GA method were classified as highly active, while only 0.026% of the nearly 100,000 sequences we previously screened were classified as highly active. A selection of these peptides was tested in vitro and their activities are reported here. While GA significantly improves the possibility of identifying candidate peptides, we encountered important pitfalls of this method that should be considered when using GA.
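The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the abstract; the variable names are illustrative.

```python
# Fraction of peptides classified as highly active, as quoted in the abstract.
ga_hit_rate = 0.50 / 100        # peptides evaluated during the GA run
library_hit_rate = 0.026 / 100  # previously screened ~100,000-sequence library

# Fold improvement of the GA over the earlier library screen.
fold_improvement = ga_hit_rate / library_hit_rate

# Accuracy among the top-rated 50 peptides (47 were active).
top50_accuracy = 47 / 50

print(f"fold improvement: {fold_improvement:.1f}")
print(f"top-50 accuracy: {top50_accuracy:.0%}")
```

The ratio comes out at roughly 19.2, consistent with the "19-fold" improvement stated, and 47 of 50 corresponds to the 94% accuracy reported.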
Affiliation(s)
- Christopher D Fjell
  - Faculty of Medicine, Division of Infectious Diseases, Department of Medicine, University of British Columbia, 2733 Heather Street, Vancouver, BC, Canada
11
Takahashi T, Fukui N, Arakawa M, Funatsu K, Ema Y. An Automatic Modeling System of the Calculation Process of a CVD Film Deposition Simulator. Journal of Chemical Engineering of Japan 2010. [DOI: 10.1252/jcej.10we003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Affiliation(s)
- Takahiro Takahashi
  - Department of Electrical and Electronic Engineering, Faculty of Engineering, Shizuoka University
- Noriyuki Fukui
  - Department of Electrical and Electronic Engineering, Faculty of Engineering, Shizuoka University
- Masamoto Arakawa
  - Department of Chemical System Engineering, School of Engineering, The University of Tokyo
- Kimito Funatsu
  - Department of Chemical System Engineering, School of Engineering, The University of Tokyo
- Yoshinori Ema
  - Department of Electrical and Electronic Engineering, Faculty of Engineering, Shizuoka University
12
Barnard JM, Wright PM. Towards in-house searching of Markush structures from patents. World Patent Information 2009. [DOI: 10.1016/j.wpi.2008.09.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/27/2022]
13
Valero S, Argente E, Botti V, Serra J, Serna P, Moliner M, Corma A. DoE framework for catalyst development based on soft computing techniques. Comput Chem Eng 2009. [DOI: 10.1016/j.compchemeng.2008.08.012] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
14
Askjaer S, Langgård M. Combining Pharmacophore Fingerprints and PLS-Discriminant Analysis for Virtual Screening and SAR Elucidation. J Chem Inf Model 2008; 48:476-88. [DOI: 10.1021/ci700356w] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Affiliation(s)
- Sune Askjaer
  - Department of Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark, and Department of Computational Chemistry, H. Lundbeck A/S, Ottiliavej 9, DK-2500 Copenhagen, Valby, Denmark
- Morten Langgård
  - Department of Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark, and Department of Computational Chemistry, H. Lundbeck A/S, Ottiliavej 9, DK-2500 Copenhagen, Valby, Denmark
15
Winkler DA. Network models in drug discovery and regenerative medicine. Biotechnology Annual Review 2008; 14:143-70. [PMID: 18606362 DOI: 10.1016/s1387-2656(08)00005-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/02/2023]
Abstract
Network motifs and modelling paradigms are attracting increasing attention as modelling tools in drug design and development, and in regenerative medicine. There is a gradual but inexorable convergence between these hitherto disparate disciplines. This review summarizes some very recent work in these areas, leading to an understanding of the complementary roles networks play and factors driving this convergence: network paradigms can be excellent ways of modelling and understanding drug molecules and their action, an understanding of the robustness and vulnerabilities of biological targets may improve the efficacy of drug design and discovery, drug design has an increasingly large role to play in directing stem cell properties, stem cell regulatory networks can be modelled in useful ways using network models at a reasonable level of scale, and the network tools of drug design are also very useful for the design of biomaterials used in regenerative medicine.
Affiliation(s)
- David A Winkler
  - CSIRO Molecular and Health Technologies, Clayton 3168, Australia.
16
Hilpert K, Fjell CD, Cherkasov A. Short linear cationic antimicrobial peptides: screening, optimizing, and prediction. Methods Mol Biol 2008; 494:127-159. [PMID: 18726572 DOI: 10.1007/978-1-59745-419-3_8] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 05/26/2023]
Abstract
The problem of pathogenic antibiotic-resistant bacteria such as Staphylococcus aureus and Pseudomonas aeruginosa is worsening, demonstrating the urgent need for new therapeutics that are effective against multidrug-resistant bacteria. One potential class of substances is cationic antimicrobial peptides. More than 1000 naturally occurring peptides have been described so far. These peptides are short (less than 50 amino acids long), cationic, amphiphilic, demonstrate different three-dimensional structures, and appear to have different modes of action. A new screening assay was developed to characterize and optimize short antimicrobial peptides. This assay is based on peptides synthesized on cellulose, combined with a bacterium into which a luminescence gene cassette was introduced. With the help of this method, tens of thousands of peptides can be screened per year. Information gained by this high-throughput screening can be used in quantitative structure-activity relationship (QSAR) analysis. QSAR analysis attempts to correlate chemical structure with measurements of biological activity using statistical methods. QSAR modeling of antimicrobial peptides to date has been based on predicting differences between peptides that are highly similar. The studies have largely addressed differences in lactoferricin and protegrin derivatives or similar de novo peptides. The mathematical models used to relate the QSAR descriptors to biological activity have been linear models such as principal component analysis or multivariate linear regression. However, with the development of high-throughput peptide synthesis and an antibacterial activity assay, the numbers of peptides and sequence diversity able to be studied have increased dramatically. Also, "inductive" QSAR descriptors have been recently developed to accurately distinguish active from inactive drug-like activity in small compounds. "Inductive" QSAR in combination with more complex mathematical modeling algorithms such as artificial neural networks (ANNs) may yield powerful new methods for in silico identification of novel antimicrobial peptides.
Affiliation(s)
- Kai Hilpert
  - Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, British Columbia, Canada
17
Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications. Studies in Computational Intelligence 2008. [DOI: 10.1007/978-3-540-78293-3_22] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Indexed: 01/09/2023]
18
Ghafourian T, Cronin M. The Effect of Variable Selection on the Non-linear Modelling of Oestrogen Receptor Binding. QSAR & Combinatorial Science 2006. [DOI: 10.1002/qsar.200510153] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 01/13/2023]
19
Davies JW, Glick M, Jenkins JL. Streamlining lead discovery by aligning in silico and high-throughput screening. Curr Opin Chem Biol 2006; 10:343-51. [PMID: 16822701 DOI: 10.1016/j.cbpa.2006.06.022] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Received: 05/20/2006] [Accepted: 06/21/2006] [Indexed: 12/01/2022]
Abstract
Lead discovery in the pharmaceutical environment is largely an industrial-scale process in which it is typical to screen 1-5 million compounds in a matter of weeks using High Throughput Screening (HTS). This process is a very costly endeavor. Typically, an HTS campaign of 1 million compounds will cost anywhere from $500,000 to $1,000,000. There is consequently a great deal of pressure to maximize the return on investment by finding faster and more effective ways to screen. A panacea that has emerged over the past few years to help address this issue is in silico screening. In silico screening is now incorporated in all areas of lead discovery, from target identification and library design to hit analysis and compound profiling. However, as lead discovery has evolved over the past few years, so has the role of in silico screening.
Affiliation(s)
- John W Davies
- Lead Discovery Center, Novartis Institutes for Biomedical Research Inc, 250 Massachusetts Avenue, Cambridge, MA 02139, USA.
|
20
|
Abstract
Novel starting points for drug discovery projects are generally found either by screening large collections of compounds or smaller more-focused libraries. Ideally, hundreds or even thousands of actives are initially found, and these need to be reduced to a handful of promising lead series. In several sequential steps, many actives are dropped and only some are followed up. Computational chemistry tools are used in this context to predict properties, cluster hits, design focused libraries and search for close analogues to explore the potential of hit series. At the end of hit-to-lead, the project must commit to one, or preferably a few, lead series that will be refined during lead optimization and hopefully produce a drug candidate. Striving for the best possible decision is crucial because choosing the wrong series is a costly one-way street.
Affiliation(s)
- Volker Schnecke
- Computational Lead Discovery, Department of Medicinal Chemistry, AstraZeneca R&D Mölndal, S-43183 Mölndal, Sweden.
|
21
|
|
22
|
Heterogeneous combinatorial catalysis applied to oil refining, petrochemistry and fine chemistry. Catal Today 2005. [DOI: 10.1016/j.cattod.2005.07.117] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 02/07/2023]
|
23
|
Svetnik V, Wang T, Tong C, Liaw A, Sheridan RP, Song Q. Boosting: An Ensemble Learning Tool for Compound Classification and QSAR Modeling. J Chem Inf Model 2005; 45:786-99. [PMID: 15921468 DOI: 10.1021/ci0500379] [Citation(s) in RCA: 123] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
Abstract
A classification and regression tool, J. H. Friedman's Stochastic Gradient Boosting (SGB), is applied to predicting a compound's quantitative or categorical biological activity based on a quantitative description of the compound's molecular structure. Stochastic Gradient Boosting is a procedure for building a sequence of models, for instance regression trees (as in this paper), whose outputs are combined to form a predicted quantity, either an estimate of the biological activity, or a class label to which a molecule belongs. In particular, the SGB procedure builds a model in a stage-wise manner by fitting each tree to the gradient of a loss function: e.g., squared error for regression and binomial log-likelihood for classification. The values of the gradient are computed for each sample in the training set, but only a random sample of these gradients is used at each stage. (Friedman showed that the well-known boosting algorithm, AdaBoost of Freund and Schapire, could be considered as a particular case of SGB.) The SGB method is used to analyze 10 cheminformatics data sets, most of which are publicly available. The results show that SGB's performance is comparable to that of Random Forest, another ensemble learning method, and are generally competitive with or superior to those of other QSAR methods. The use of SGB's variable importance with partial dependence plots for model interpretation is also illustrated.
Affiliation(s)
- Vladimir Svetnik
- Biometrics Research and Molecular Systems, Merck Research Laboratories, P.O. Box 2000, Rahway, New Jersey 07065, USA.
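The stage-wise procedure described in the Svetnik et al. abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single numeric descriptor, squared-error loss (whose negative gradient is simply the residual), and one-split regression stumps in place of full regression trees. The function names (`fit_stump`, `sgb_fit`, `sgb_predict`) and the toy data are hypothetical and chosen only for illustration.

```python
import random


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to (x, residual) pairs by least squares."""
    pairs = sorted(zip(xs, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All x values are identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def sgb_fit(xs, ys, n_stages=200, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting for squared error (illustrative only)."""
    rng = random.Random(seed)
    init = sum(ys) / len(ys)          # stage-0 model: the mean response
    preds = [init] * len(ys)
    stumps = []
    k = max(2, int(subsample * len(xs)))
    for _ in range(n_stages):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is y - F.
        gradients = [y - p for y, p in zip(ys, preds)]
        # Only a random subsample of the gradients is used at each stage.
        idx = rng.sample(range(len(xs)), k)
        stump = fit_stump([xs[i] for i in idx], [gradients[i] for i in idx])
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return init, learning_rate, stumps


def sgb_predict(model, x):
    init, learning_rate, stumps = model
    return init + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data (an assumption, not from the paper): a step in "activity" at x = 2.
xs = [i / 10 for i in range(40)]
ys = [1.0 if x < 2.0 else 3.0 for x in xs]
model = sgb_fit(xs, ys)
print(sgb_predict(model, 0.5), sgb_predict(model, 3.5))
```

For classification, the abstract names binomial log-likelihood as the loss; the same loop would then fit each stage to the gradient of that loss instead of the plain residual.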
|
24
|
Abstract
Cheminformatic analysis of drug-related compound databases has enabled the identification of the physicochemical properties that have the greatest influence on determining the drug-like characteristics of a compound. This enables definition of the parameters and profiles used in constructing a high-quality combinatorial library. Awareness of the multi-objective nature of combinatorial library construction has also given rise to techniques designed to enhance the likelihood of including the best compounds in a given library.
Affiliation(s)
- James F Blake
- Array BioPharma Inc., 3200 Walnut Street, Boulder, Colorado 80301, USA.
|