1
Espinoza‐Castañeda JI, Medina‐Franco JL. MAYA (Multiple ActivitY Analyzer): An Open Access Tool to Explore Structure-Multiple Activity Relationships in the Chemical Universe. Mol Inform 2025; 44:e202400306. [PMID: 39932235 PMCID: PMC11812492 DOI: 10.1002/minf.202400306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Revised: 12/25/2024] [Accepted: 01/27/2025] [Indexed: 02/14/2025]
Abstract
Herein, we introduce MAYA (Multiple Activity Analyzer), a tool designed to automatically construct a chemical multiverse, that is, multiple visualizations of the chemical space of a compound data set described by structural descriptors of different nature, such as Molecular ACCess System (MACCS) keys, extended connectivity fingerprints with different radii, molecular descriptors with pharmaceutical relevance, and bioactivity descriptors. These representations are integrated with various data visualization techniques for automated analysis focused on structure-multiple activity/property relationships, enabling the analysis of a variety of problems in user-friendly open-source software. The source code of MAYA is freely available on GitHub at https://github.com/IsrC11/MAYA.git.
Affiliation(s)
- J. Israel Espinoza-Castañeda
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- José L. Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
2
Orlov AA, Akhmetshin TN, Horvath D, Marcou G, Varnek A. From High Dimensions to Human Insight: Exploring Dimensionality Reduction for Chemical Space Visualization. Mol Inform 2025; 44:e202400265. [PMID: 39633514 PMCID: PMC11733715 DOI: 10.1002/minf.202400265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Revised: 11/08/2024] [Accepted: 11/09/2024] [Indexed: 12/07/2024]
Abstract
Dimensionality reduction is an important exploratory data analysis method that allows high-dimensional data to be represented in a human-interpretable lower-dimensional space. It is extensively applied in the analysis of chemical libraries, where chemical structure data, represented as high-dimensional feature vectors, are transformed into 2D or 3D chemical space maps. In this paper, commonly used dimensionality reduction techniques, namely Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and Generative Topographic Mapping (GTM), are evaluated in terms of neighborhood preservation and visualization capability on sets of small molecules from the ChEMBL database.
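One simple way to make "neighborhood preservation" concrete is to measure, for every point, what fraction of its k nearest neighbors in the original space are still among its k nearest neighbors in the low-dimensional map. The sketch below is illustrative only: the function names, the Euclidean metric, and the toy coordinates are assumptions and do not reproduce the evaluation protocol of the paper.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank all other points by distance to point i.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def neighborhood_preservation(high_dim, low_dim, k):
    """Mean fraction of k nearest neighbors shared between the two spaces (0 to 1)."""
    high = knn_indices(high_dim, k)
    low = knn_indices(low_dim, k)
    shared = [len(h & l) / k for h, l in zip(high, low)]
    return sum(shared) / len(shared)


# Hypothetical toy data: four points in 3D and a 2D "map" that simply drops the z axis.
high_dim = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.0, 9.0), (5.0, 5.0, 5.0)]
low_dim = [(x, y) for x, y, _ in high_dim]
score = neighborhood_preservation(high_dim, low_dim, k=1)
```

A score of 1 means every neighborhood is kept intact; lower values indicate that the map has moved points away from their original neighbors. In practice the high-dimensional input would be descriptor or fingerprint vectors and the low-dimensional input the 2D coordinates produced by the chosen method.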
Affiliation(s)
- Alexey A. Orlov
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Tagir N. Akhmetshin
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Dragos Horvath
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Gilles Marcou
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Alexandre Varnek
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
3
Sosnin S. MolCompass: multi-tool for the navigation in chemical space and visual validation of QSAR/QSPR models. J Cheminform 2024; 16:98. [PMID: 39129016 PMCID: PMC11318166 DOI: 10.1186/s13321-024-00888-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/21/2024] [Indexed: 08/13/2024]
Abstract
The exponential growth of data is challenging for humans because their ability to analyze data is limited. Especially in chemistry, there is a demand for tools that can visualize molecular datasets in a convenient graphical way. We propose a new, ready-to-use, multi-tool, and open-source framework for visualizing and navigating chemical space. This framework adheres to the low-code/no-code (LCNC) paradigm, providing a KNIME node, a web-based tool, and a Python package, making it accessible to a broad cheminformatics community. The core technique of the MolCompass framework employs a pre-trained parametric t-SNE model. We demonstrate how this framework can be adapted for the visualization of chemical space and visual validation of binary classification QSAR/QSPR models, revealing their weaknesses and identifying model cliffs. All parts of the framework are publicly available on GitHub, providing accessibility to the broad scientific community. Scientific contribution: We provide an open-source, ready-to-use set of tools for the visualization of chemical space. These tools can be insightful for chemists to analyze compound datasets and for the visual validation of QSAR/QSPR models.
Affiliation(s)
- Sergey Sosnin
  - Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090, Vienna, Austria
4
Zahoránszky-Kőhalmi G, Wan KK, Godfrey AG. Hilbert-curve assisted structure embedding method. J Cheminform 2024; 16:87. [PMID: 39075547 PMCID: PMC11285582 DOI: 10.1186/s13321-024-00850-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 04/30/2024] [Indexed: 07/31/2024]
Abstract
MOTIVATION: Chemical space embedding methods are widely utilized in various research settings for dimensional reduction, clustering and effective visualization. The maps generated by the embedding process can provide valuable insight to medicinal chemists in terms of the relationships between structural, physicochemical and biological properties of compounds. However, these maps are known to be difficult to interpret, and the "landscape" on the map is prone to "rearrangement" when embedding different sets of compounds. RESULTS: In this study we present the Hilbert-Curve Assisted Space Embedding (HCASE) method, which was designed to create maps by organizing structures according to a logic familiar to medicinal chemists. First, a chemical space is created with the help of a set of "reference scaffolds". These scaffolds are sorted according to the medicinal chemistry inspired Scaffold-Key algorithm found in prior art. Next, the ordered scaffolds are mapped to a line which is folded into a higher dimensional (here: 2D) space. The intricately folded line is referred to as a pseudo-Hilbert-Curve. The embedding of a compound happens by locating its most similar reference scaffold in the pseudo-Hilbert-Curve and assuming the respective position. Through a series of experiments, we demonstrate the properties of the maps generated by the HCASE method. Subjects of embeddings were compounds of the DrugBank and CANVASS libraries, and the chemical spaces were defined by scaffolds extracted from the ChEMBL database. SCIENTIFIC CONTRIBUTION: The novelty of the HCASE method lies in generating robust and intuitive chemical space embeddings that are reflective of a medicinal chemist's reasoning, and the precedential use of a space-filling (Hilbert) curve in the process. AVAILABILITY: https://github.com/ncats/hcase.
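The folding step described in the abstract can be pictured with the standard index-to-coordinate conversion for a Hilbert curve. The sketch below is a simplified illustration, not the HCASE implementation: it assumes that each reference scaffold's rank in the sorted list is used directly as the curve index, that scaffolds and compounds are represented as hypothetical feature sets, and that similarity is measured with the Tanimoto coefficient. The function names are likewise made up for this example.

```python
def hilbert_d2xy(order, d):
    """Convert index d along a Hilbert curve to (x, y) on a 2**order by 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so consecutive indices stay adjacent.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def embed(compound, ordered_scaffolds, order):
    """Place a compound at the curve position of its most similar reference scaffold.

    Assumes len(ordered_scaffolds) <= 4**order and that the list is already sorted.
    """
    best = max(
        range(len(ordered_scaffolds)),
        key=lambda i: tanimoto(compound, ordered_scaffolds[i]),
    )
    return hilbert_d2xy(order, best)


# Hypothetical reference scaffolds, already sorted, described by made-up feature labels.
scaffolds = [{"a"}, {"b"}, {"c"}, {"d"}]
position = embed({"c", "x"}, scaffolds, order=1)
```

Because neighboring indices on the curve are always neighboring grid cells, scaffolds that are close in the sorted order end up close on the map, which is the property the method relies on.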
Affiliation(s)
- Gergely Zahoránszky-Kőhalmi
  - National Center for Advancing Translational Sciences (NCATS/NIH), 9800 Medical Center Dr., Rockville, MD, 20850, USA
- Kanny K Wan
  - National Center for Advancing Translational Sciences (NCATS/NIH), 9800 Medical Center Dr., Rockville, MD, 20850, USA
- Alexander G Godfrey
  - National Center for Advancing Translational Sciences (NCATS/NIH), 9800 Medical Center Dr., Rockville, MD, 20850, USA
5
Samanipour S, Barron LP, van Herwerden D, Praetorius A, Thomas KV, O’Brien JW. Exploring the Chemical Space of the Exposome: How Far Have We Gone? JACS Au 2024; 4:2412-2425. [PMID: 39055136 PMCID: PMC11267556 DOI: 10.1021/jacsau.4c00220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 05/29/2024] [Accepted: 05/31/2024] [Indexed: 07/27/2024]
Abstract
Around two-thirds of chronic human disease cannot be explained by genetics alone. The Lancet Commission on Pollution and Health estimates that 16% of global premature deaths are linked to pollution. Additionally, it is now thought that humankind has surpassed the safe planetary operating space for introducing human-made chemicals into the Earth System. Direct and indirect exposure to a myriad of chemicals, known and unknown, poses a significant threat to biodiversity and human health, from vaccine efficacy to the rise of antimicrobial resistance as well as autoimmune diseases and mental health disorders. The exposome chemical space remains largely uncharted due to the sheer number of possible chemical structures, estimated at over 10^60 unique forms. Conventional methods have cataloged only a fraction of the exposome, overlooking transformation products and often yielding uncertain results. In this Perspective, we have reviewed the latest efforts in mapping the exposome chemical space and its subspaces. We also provide our view on how the integration of data-driven approaches might be able to bridge the identified gaps.
Affiliation(s)
- Saer Samanipour
  - Van’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, The Netherlands
  - UvA Data Science Center, University of Amsterdam, Amsterdam 1090 GD, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Cornwall Street, Woolloongabba, Queensland 4102, Australia
- Leon Patrick Barron
  - Van’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, The Netherlands
  - MRC Centre for Environment and Health, Environmental Research Group, School of Public Health, Faculty of Medicine, Imperial College London, London W12 0BZ, United Kingdom
- Denice van Herwerden
  - Van’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, The Netherlands
- Antonia Praetorius
  - Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, Amsterdam 1090 GD, The Netherlands
- Kevin V. Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Cornwall Street, Woolloongabba, Queensland 4102, Australia
- Jake William O’Brien
  - Van’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Cornwall Street, Woolloongabba, Queensland 4102, Australia
6
Shu J, Wang Y, Guo W, Liu T, Cai S, Shi T, Hu W. Carbenoid-involved reactions integrated with scaffold-based screening generates a Nav1.7 inhibitor. Commun Chem 2024; 7:135. [PMID: 38866907 PMCID: PMC11169417 DOI: 10.1038/s42004-024-01213-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 05/30/2024] [Indexed: 06/14/2024]
Abstract
The discovery of selective Nav1.7 inhibitors is a promising approach for developing anti-nociceptive drugs. In this study, we present a novel oxindole-based readily accessible library (OREAL), which is characterized by ready accessibility, unique chemical space, ideal drug-like properties, and structural diversity. We used a scaffold-based approach to screen the OREAL and discovered compound C4 as a potent Nav1.7 inhibitor. The bioactivity characterization of C4 reveals that it is a selective Nav1.7 inhibitor and effectively reverses paclitaxel-induced neuropathic pain (PINP) in rodent models. A preliminary toxicology study shows that C4 is negative for hERG. The consistent results of molecular docking and molecular simulations further support the soundness of the in silico screening and provide insight into the binding mode of C4. Our discovery of C4 paves the way for pushing Nav1.7-based anti-nociceptive drugs forward to the clinic.
Affiliation(s)
- Jirong Shu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Yuwei Wang
  - Shenzhen University Health Science Center, Shenzhen, 518060, China
- Weijie Guo
  - Shenzhen University Health Science Center, Shenzhen, 518060, China
- Tao Liu
  - Shenzhen University Health Science Center, Shenzhen, 518060, China
- Song Cai
  - Shenzhen University Health Science Center, Shenzhen, 518060, China
- Taoda Shi
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Wenhao Hu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
7
Ryzhkov FV, Ryzhkova YE, Elinson MN. Python tools for structural tasks in chemistry. Mol Divers 2024:10.1007/s11030-024-10889-7. [PMID: 38744790 DOI: 10.1007/s11030-024-10889-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 04/27/2024] [Indexed: 05/16/2024]
Abstract
In recent decades, the use of computational approaches and artificial intelligence in the scientific environment has become more widespread. In this regard, the popular and versatile programming language Python has attracted considerable attention from scientists in the field of chemistry. It is used to solve a variety of chemical and structural problems, including calculating descriptors, molecular fingerprints, graph construction, and computing chemical reaction networks. Python offers high-quality visualization tools for analyzing chemical spaces and compound libraries. This review is a list of tools for the above tasks, including scripts, libraries, ready-made programs, and web interfaces. Inevitably this manuscript does not claim to be an all-encompassing handbook including all the existing Python-based structural chemistry codes. The review serves as a starting point for scientists wishing to apply automatization or optimization to routine chemistry problems.
Affiliation(s)
- Fedor V Ryzhkov
  - N. D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
- Yuliya E Ryzhkova
  - N. D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
- Michail N Elinson
  - N. D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
8
Schafer M, Brich N, Byska J, Marques SM, Bednar D, Thiel P, Kozlikova B, Krone M. InVADo: Interactive Visual Analysis of Molecular Docking Data. IEEE Transactions on Visualization and Computer Graphics 2024; 30:1984-1997. [PMID: 38019636 DOI: 10.1109/tvcg.2023.3337642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023]
Abstract
Molecular docking is a key technique in various fields like structural biology, medicinal chemistry, and biotechnology. It is widely used for virtual screening during drug discovery, computer-assisted drug design, and protein engineering. A general molecular docking process consists of the target and ligand selection, their preparation, and the docking process itself, followed by the evaluation of the results. However, the most commonly used docking software provides no or very basic evaluation possibilities. Scripting and external molecular viewers are often used, which are not designed for an efficient analysis of docking results. Therefore, we developed InVADo, a comprehensive interactive visual analysis tool for large docking data. It consists of multiple linked 2D and 3D views. It filters and spatially clusters the data, and enriches it with post-docking analysis results of protein-ligand interactions and functional groups, to enable well-founded decision-making. In an exemplary case study, domain experts confirmed that InVADo facilitates and accelerates the analysis workflow. They rated it as a convenient, comprehensive, and feature-rich tool, especially useful for virtual screening.
9
He X, Yang Z, Wang L, Sun Y, Cao H, Liang Y. NeuTox: A weighted ensemble model for screening potential neuronal cytotoxicity of chemicals based on various types of molecular representations. Journal of Hazardous Materials 2024; 465:133443. [PMID: 38198870 DOI: 10.1016/j.jhazmat.2024.133443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/12/2024]
Abstract
Chemical-induced neurotoxicity has been widely brought into focus in the risk assessment of chemical safety. However, the traditional in vivo animal models used to evaluate neurotoxicity are time-consuming and expensive, and cannot completely represent the pathophysiology of neurotoxicity in humans. Cytotoxicity to the human neuroblastoma cell line SH-SY5Y is commonly used as an alternative to animal testing for the assessment of neurotoxicity, yet it is still not appropriate for high-throughput screening of potential neuronal cytotoxicity of chemicals. In this study, we constructed an ensemble prediction model, termed NeuTox, by combining multiple machine learning algorithms with molecular representations based on the weighted score of Particle Swarm Optimization. For the test set, NeuTox shows excellent performance with an accuracy of 0.9064, which is superior to that of the top-performing individual models. The subsequent experimental verifications reveal that 5,5'-isopropylidenedi-2-biphenylol and 4,4'-cyclo-hexylidenebisphenol exhibited stronger SH-SY5Y-based cytotoxicity compared to bisphenol A, suggesting that NeuTox has good generalization ability in the first-tier assessment of neuronal cytotoxicity of BPA analogs. For ease of use, NeuTox is presented as an online web server that can be freely accessed via http://www.iehneutox-predictor.cn/NeuToxPredict/Predict.
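The general idea of a weighted ensemble can be sketched as a weighted average of the probabilities returned by the individual models. This is a generic illustration, not the NeuTox implementation: the function names, the three example probabilities, the fixed weights, and the 0.5 threshold are all assumptions. In the paper the weights come from Particle Swarm Optimization, whereas here they are simply given.

```python
def weighted_ensemble(probabilities, weights):
    """Weighted average of per-model probabilities for the positive (cytotoxic) class."""
    if len(probabilities) != len(weights):
        raise ValueError("one weight is required per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def classify(probabilities, weights, threshold=0.5):
    """Label a compound as positive when the ensemble score reaches the threshold."""
    return weighted_ensemble(probabilities, weights) >= threshold


# Hypothetical outputs of three individual models for one compound, with made-up weights.
model_probabilities = [0.9, 0.4, 0.6]
model_weights = [0.5, 0.2, 0.3]
ensemble_score = weighted_ensemble(model_probabilities, model_weights)
is_positive = classify(model_probabilities, model_weights)
```

Giving a larger weight to a stronger model lets it dominate the vote without silencing the others, which is why a weighted ensemble can outperform each of its members.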
Affiliation(s)
- Xuejun He
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zeguo Yang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Ling Wang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yuzhen Sun
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Huiming Cao
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yong Liang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
10
Nicolle A, Deng S, Ihme M, Kuzhagaliyeva N, Ibrahim EA, Farooq A. Mixtures Recomposition by Neural Nets: A Multidisciplinary Overview. J Chem Inf Model 2024; 64:597-620. [PMID: 38284618 DOI: 10.1021/acs.jcim.3c01633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]
Abstract
Artificial Neural Networks (ANNs) are transforming how we understand chemical mixtures, providing an expressive view of the chemical space and multiscale processes. Their hybridization with physical knowledge can bridge the gap between predictivity and understanding of the underlying processes. This overview explores recent progress in ANNs, particularly their potential in the 'recomposition' of chemical mixtures. Graph-based representations reveal patterns among mixture components, and deep learning models excel in capturing complexity and symmetries when compared to traditional Quantitative Structure-Property Relationship models. Key components, such as Hamiltonian networks and convolution operations, play a central role in representing multiscale mixtures. The integration of ANNs with Chemical Reaction Networks and Physics-Informed Neural Networks for inverse chemical kinetic problems is also examined. The combination of sensors with ANNs shows promise in optical and biomimetic applications. A common ground is identified in the context of statistical physics, where ANN-based methods iteratively adapt their models by blending their initial states with training data. The concept of mixture recomposition unveils a reciprocal inspiration between ANNs and reactive mixtures, highlighting learning behaviors influenced by the training environment.
Affiliation(s)
- Andre Nicolle
  - Aramco Fuel Research Center, Rueil-Malmaison 92852, France
- Sili Deng
  - Massachusetts Institute of Technology, Cambridge 02139, Massachusetts, United States
- Matthias Ihme
  - Stanford University, Stanford 94305, California, United States
- Emad Al Ibrahim
  - King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
- Aamir Farooq
  - King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
11
Rodríguez-Villar K, Cortés-Benítez F, Palacios-Espinosa JF, Pérez-Villanueva J. Similarity searching for anticandidal agents employing a repurposing approach. Mol Inform 2024; 43:e202300206. [PMID: 38095132 DOI: 10.1002/minf.202300206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/01/2023] [Accepted: 12/11/2023] [Indexed: 01/05/2024]
Abstract
Fungal infections caused by Candida are still a public health concern. Particularly, the resistance to traditional chemotherapeutic agents is a major issue that requires efforts to develop new therapies. One of the most interesting approaches to finding new active compounds is drug repurposing aided by computational methods. In this work, two databases containing anticandidal agents and drugs were studied employing cheminformatics and compared by similarity methods. The results showed 36 drugs with high similarities to some candicidals. From these drugs, trimetozin, osalmid and metochalcone were evaluated against C. albicans (18804), C. glabrata (90030), and miconazole-resistant strain C. glabrata (32554). Osalmid and metochalcone were the best, with activity in the micromolar range. These findings represent an opportunity to continue with the research on the potential antifungal application of osalmid and metochalcone as well as the design of structurally related derivatives.
Affiliation(s)
- Karen Rodríguez-Villar
  - Departamento de Sistemas Biológicos, División de Ciencias Biológicas y de la Salud, Universidad Autónoma Metropolitana-Xochimilco (UAM-X), Calzada del Hueso 1100, Col. Villa Quietud, Delegación Coyoacán, Ciudad de México, 04960, Mexico
- Francisco Cortés-Benítez
  - Departamento de Sistemas Biológicos, División de Ciencias Biológicas y de la Salud, Universidad Autónoma Metropolitana-Xochimilco (UAM-X), Calzada del Hueso 1100, Col. Villa Quietud, Delegación Coyoacán, Ciudad de México, 04960, Mexico
- Juan Francisco Palacios-Espinosa
  - Departamento de Sistemas Biológicos, División de Ciencias Biológicas y de la Salud, Universidad Autónoma Metropolitana-Xochimilco (UAM-X), Calzada del Hueso 1100, Col. Villa Quietud, Delegación Coyoacán, Ciudad de México, 04960, Mexico
- Jaime Pérez-Villanueva
  - Departamento de Sistemas Biológicos, División de Ciencias Biológicas y de la Salud, Universidad Autónoma Metropolitana-Xochimilco (UAM-X), Calzada del Hueso 1100, Col. Villa Quietud, Delegación Coyoacán, Ciudad de México, 04960, Mexico
12
Tandi M, Tripathi N, Gaur A, Gopal B, Sundriyal S. Curation and cheminformatics analysis of a Ugi-reaction derived library (URDL) of synthetically tractable small molecules for virtual screening application. Mol Divers 2024; 28:37-50. [PMID: 36574164 DOI: 10.1007/s11030-022-10588-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 12/17/2022] [Indexed: 12/28/2022]
Abstract
Virtual screening (VS) is an important approach in drug discovery and relies on the availability of a virtual library of synthetically tractable molecules. The Ugi reaction (UR) is an important multi-component reaction (MCR) that reliably produces a peptidomimetic scaffold. Recent literature shows that a tactically assembled Ugi adduct can be subjected to further chemical modifications to yield a variety of rings and scaffolds, thus renewing the interest in this old reaction. Given the reliability and efficiency of the UR, we collated a UR-derived library (URDL) of small molecules (total = 5773) for VS. The synthesis of the majority of URDL molecules may be carried out in 1-2 pots in a time- and cost-effective manner. A detailed analysis of the average properties and chemical space of URDL was also carried out using the open-source DataWarrior program. The comparison with FDA-approved oral drugs and inhibitors of protein-protein interactions (iPPIs) suggests that URDL molecules are 'clean', drug-like, and occupy a structurally distinct space from the other two categories. The average physicochemical properties of compounds in the URDL lie closer to those of iPPI molecules than to those of oral drugs, suggesting that the URDL resource can be applied to discover novel iPPI molecules. The URDL molecules consist of diverse ring systems, many of which have not yet been exploited for drug design. Thus, URDL represents a small virtual library of drug-like molecules with unexplored chemical space designed for VS. The structures of all molecules of URDL, oral drugs, and iPPI compounds are being made freely accessible as supplementary information for broader application.
Affiliation(s)
- Mukesh Tandi
  - Department of Pharmacy, Birla Institute of Technology and Science Pilani, Pilani Campus, Rajasthan, 333031, India
- Nancy Tripathi
  - Department of Pharmacy, Birla Institute of Technology and Science Pilani, Pilani Campus, Rajasthan, 333031, India
- Animesh Gaur
  - Department of Pharmacy, Birla Institute of Technology and Science Pilani, Pilani Campus, Rajasthan, 333031, India
- Sandeep Sundriyal
  - Department of Pharmacy, Birla Institute of Technology and Science Pilani, Pilani Campus, Rajasthan, 333031, India
13
Olmedo DA, Durant-Archibold AA, López-Pérez JL, Medina-Franco JL. Design and Diversity Analysis of Chemical Libraries in Drug Discovery. Comb Chem High Throughput Screen 2024; 27:502-515. [PMID: 37409545 DOI: 10.2174/1386207326666230705150110] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/05/2023] [Revised: 05/30/2023] [Accepted: 05/30/2023] [Indexed: 07/07/2023]
Abstract
Chemical libraries and compound data sets are among the main inputs to start the drug discovery process at universities, research institutes, and the pharmaceutical industry. The approach used in the design of compound libraries, the chemical information they possess, and the representation of structures play a fundamental role in studies in chemoinformatics, food informatics, in silico pharmacokinetics, computational toxicology, bioinformatics, and molecular modeling that generate computational hits for the subsequent optimization of drug candidates. The prospects for growth in drug discovery and development processes in chemical, biotechnological, and pharmaceutical companies began to improve a few years ago with the integration of computational tools and artificial intelligence methodologies. This integration is anticipated to increase the number of drugs approved by regulatory agencies in the near future.
Affiliation(s)
- Dionisio A Olmedo
  - Centro de Investigaciones Farmacognósticas de la Flora Panameña (CIFLORPAN), Facultad de Farmacia, Universidad de Panamá, Ciudad de Panamá, Apartado, 0824-00178, Panamá
  - Sistema Nacional de Investigación (SNI), Secretaria Nacional de Ciencia, Tecnología e Innovación (SENACYT), Ciudad del Saber, Clayton, Panamá
- Armando A Durant-Archibold
  - Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Apartado, 0843-01103, Panamá
  - Departamento de Bioquímica, Facultad de Ciencias Naturales, Exactas y Tecnología, Universidad de Panamá, Ciudad de Panamá, Panamá
- José Luis López-Pérez
  - CESIFAR, Departamento de Farmacología, Facultad de Medicina, Universidad de Panamá, Ciudad de Panamá, Panamá
  - Departamento de Ciencias Farmacéuticas, Facultad de Farmacia, Universidad de Salamanca, Avda. Campo Charro s/n, 37071 Salamanca, España
- José Luis Medina-Franco
  - DIFACQUIM Grupo de Investigación, Departamento de Farmacia, Escuela de Química, Universidad Nacional Autónoma de México, Ciudad de México, Apartado, 04510, México
14
Gaytán-Hernández D, Chávez-Hernández AL, López-López E, Miranda-Salas J, Saldívar-González FI, Medina-Franco JL. Art driven by visual representations of chemical space. J Cheminform 2023; 15:100. [PMID: 37865794 PMCID: PMC10590523 DOI: 10.1186/s13321-023-00770-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 10/13/2023] [Indexed: 10/23/2023]
Abstract
Science and art have been connected for centuries. With the development of new computational methods, new scientific disciplines have emerged, such as computational chemistry, and related fields, such as cheminformatics. Chemoinformatics is grounded on the chemical space concept: a multi-descriptor space in which chemical structures are described. In several practical applications, visual representations of the chemical space of compound datasets are low-dimensional plots helpful in identifying patterns. However, the authors propose that the plots can also be used as artistic expressions. This manuscript introduces an approach to merging art with chemoinformatics through visual and artistic representations of chemical space. As case studies, we portray the chemical space of food chemicals and other compounds to generate visually appealing graphs with twofold benefits: sharing chemical knowledge and developing pieces of art driven by chemoinformatics. The art driven by chemical space visualization will help increase the application of chemistry and art and contribute to general education and dissemination of chemoinformatics and chemistry through artistic expressions. All the code and data sets to reproduce the visual representation of the chemical space presented in the manuscript are freely available at https://github.com/DIFACQUIM/Art-Driven-by-Visual-Representations-of-Chemical-Space- . Scientific contribution: Chemical space as a concept to create digital art and as a tool to train and introduce students to cheminformatics.
Affiliation(s)
- Daniela Gaytán-Hernández
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- Ana L Chávez-Hernández
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- Edgar López-López
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
  - Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, 07000, Mexico City, Mexico
- Jazmín Miranda-Salas
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- Fernanda I Saldívar-González
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- José L Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico

15
Kerstjens A, De Winter H. A molecule perturbation software library and its application to study the effects of molecular design constraints. J Cheminform 2023; 15:89. [PMID: 37752561 PMCID: PMC10523775 DOI: 10.1186/s13321-023-00761-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/14/2023] [Accepted: 09/15/2023] [Indexed: 09/28/2023]
Abstract
Computational molecular design can yield chemically unreasonable compounds when performed carelessly. A popular strategy to mitigate this risk is mimicking reference chemistry. This is commonly achieved by restricting the way in which molecules are constructed or modified. While it is well established that such an approach helps in designing chemically appealing molecules, concerns about these restrictions impacting chemical space exploration negatively linger. In this work we present a software library for constrained graph-based molecule manipulation and showcase its functionality by developing a molecule generator. Said generator designs molecules mimicking reference chemical features of differing granularity. We find that restricting molecular construction lightly, beyond the usual positive effects on drug-likeness and synthesizability of designed molecules, provides guidance to optimization algorithms navigating chemical space. Nonetheless, restricting molecular construction excessively can indeed hinder effective chemical space exploration.
Affiliation(s)
- Alan Kerstjens
  - Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Antwerp, Universiteitslaan 1, 2610, Wilrijk, Belgium
- Hans De Winter
  - Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Antwerp, Universiteitslaan 1, 2610, Wilrijk, Belgium

16
López-Pérez K, López-López E, Medina-Franco JL, Miranda-Quintana RA. Sampling and Mapping Chemical Space with Extended Similarity Indices. Molecules 2023; 28:6333. [PMID: 37687162 PMCID: PMC10489020 DOI: 10.3390/molecules28176333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 08/24/2023] [Accepted: 08/26/2023] [Indexed: 09/10/2023]
Abstract
Visualization of the chemical space is useful in many aspects of chemistry, including compound library design, diversity analysis, and exploring structure-property relationships, to name a few. Examples of notable research areas where the visualization of chemical space has strong applications are drug discovery and natural product research. However, the sheer volume of even comparatively small sub-sections of chemical space implies that we need to use approximations at the time of navigating through chemical space. ChemMaps is a visualization methodology that approximates the distribution of compounds in large datasets based on the selection of satellite compounds that yield a similar mapping of the whole dataset when principal component analysis on a similarity matrix is performed. Here, we show how the recently proposed extended similarity indices can help find regions that are relevant to sample satellites and reduce the amount of high-dimensional data needed to describe a library's chemical space.
Affiliation(s)
- Kenneth López-Pérez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611, USA
- Edgar López-López
  - DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City 04510, Mexico
  - Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, Mexico City 07000, Mexico
- José L. Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City 04510, Mexico

17
Medina‐Franco JL, Chávez‐Hernández AL, López‐López E, Saldívar‐González FI. Chemical Multiverse: An Expanded View of Chemical Space. Mol Inform 2022; 41:e2200116. [PMID: 35916110 PMCID: PMC9787733 DOI: 10.1002/minf.202200116] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/19/2022] [Accepted: 08/01/2022] [Indexed: 12/30/2022]
Abstract
Technological advances and practical applications of the chemical space concept in drug discovery, natural product research, and other research areas have attracted the scientific community's attention. The large and ultra-large chemical spaces are associated with the significant increase in the number of compounds that can potentially be made and exist, and with the increasing number of emerging experimental and calculated descriptors that encode the molecular structure and/or property aspects of the molecules. Due to the importance and continued evolution of compound libraries, herein we discuss definitions proposed in the literature for chemical space and emphasize the convenience, also discussed in the literature, of using complementary descriptors to obtain a comprehensive view of the chemical space of compound data sets. In this regard, we introduce the term chemical multiverse to refer to the comprehensive analysis of compound data sets through several chemical spaces, each defined by a different set of chemical representations. The chemical multiverse is contrasted with a related idea: consensus chemical space.
Affiliation(s)
- José L. Medina‐Franco
  - DIFACQUIM research group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City 04510, Mexico
- Ana L. Chávez‐Hernández
  - DIFACQUIM research group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City 04510, Mexico
- Edgar López‐López
  - Department of Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Mexico City 07360, Mexico
- Fernanda I. Saldívar‐González
  - DIFACQUIM research group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City 04510, Mexico

18
Liu Z, Du J, Lin Z, Li Z, Liu B, Cui Z, Fang J, Xie L. DenovoProfiling: A webserver for de novo generated molecule library profiling. Comput Struct Biotechnol J 2022; 20:4082-4097. [PMID: 36016718 PMCID: PMC9379519 DOI: 10.1016/j.csbj.2022.07.045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 07/25/2022] [Accepted: 07/25/2022] [Indexed: 01/10/2023]
Abstract
Various deep learning-based architectures for molecular generation have been proposed for de novo drug design. The flourishing of de novo molecular generation methods and applications has created a great demand for the visualization and functional profiling of de novo generated molecules. An increasing number of publicly available chemogenomic databases lays a good foundation and creates good opportunities for comprehensive profiling of the de novo library. In this paper, we present DenovoProfiling, a webserver dedicated to de novo library visualization and functional profiling. Currently, DenovoProfiling contains six modules: (1) an identification & visualization module for visualizing chemical structures and identifying reported structures, (2) a chemical space module for chemical space exploration using similarity maps, principal components analysis (PCA), drug-like properties distribution, and scaffold-based clustering, (3) an ADMET prediction module for predicting the ADMET properties of the de novo molecules, (4) a molecular alignment module for three-dimensional molecular shape analysis, (5) a drugs mapping module for identifying structurally similar drugs, and (6) a target & pathway module for identifying the reported targets and corresponding functional pathways. DenovoProfiling could provide structural identification, chemical space exploration, drug mapping, and target & pathway information. The comprehensive annotated information could give users a clear picture of their de novo library and could guide the further selection of candidates for chemical synthesis and biological confirmation. DenovoProfiling is freely available at http://denovoprofiling.xielab.net.
Key Words
- DDR1, Discoidin domain receptor 1
- De novo drug design
- De novo molecule library
- Deep learning
- FBDD, Fragment-based drug design
- FDR, False discovery rate
- GAN, Generative adversarial networks
- HTS, High throughput screening
- LSTM, Long short-term memory
- Library profiling
- PCA, Principal components analysis
- RNN, Recurrent neural networks
- SCA, Scaffold-based classification approach
- VAE, Variational autoencoders
Affiliation(s)
- Zhihong Liu
  - School of Public Health, Xinxiang Medical University, Xinxiang, China
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Jiewen Du
  - Beijing Jingpai Technology Co., Ltd., 1500-1, Hailong Building Z-Park, Beijing 100090, China
- Ziying Lin
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Ze Li
  - School of Public Health, Xinxiang Medical University, Xinxiang, China
- Bingdong Liu
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Zongbin Cui
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Jiansong Fang
  - Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, China
  - Corresponding authors at: School of Public Health, Xinxiang Medical University, Xinxiang, China (L. Xie). Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, China (J. Fang).
- Liwei Xie
  - School of Public Health, Xinxiang Medical University, Xinxiang, China
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
  - Zhujiang Hospital, Southern Medical University, Guangzhou, China
  - Corresponding authors at: School of Public Health, Xinxiang Medical University, Xinxiang, China (L. Xie). Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, China (J. Fang).

19
|
Saldívar-González FI, Medina-Franco JL. Approaches for enhancing the analysis of chemical space for drug discovery. Expert Opin Drug Discov 2022; 17:789-798. [PMID: 35640229 DOI: 10.1080/17460441.2022.2084608] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/22/2022]
Abstract
INTRODUCTION: Chemical space is a powerful, general, and practical conceptual framework in drug discovery and other areas in chemistry that addresses the diversity of molecules and it has various applications. Moreover, chemical space is a cornerstone of chemoinformatics as a scientific discipline. In response to the increase in the set of chemical compounds in databases, generators of chemical structures, and tools to calculate molecular descriptors, novel approaches to generate visual representations of chemical space in low dimensions are emerging and evolving. Such approaches include a wide range of commercial and free applications, software, and open-source methods.

AREAS COVERED: The current state of chemical space in drug design and discovery is reviewed. The topics discussed herein include advances for efficient navigation in chemical space, the use of this concept in assessing the diversity of different data sets, exploring structure-property/activity relationships for one or multiple endpoints, and compound library design. Recent advances in methodologies for generating visual representations of chemical space have been highlighted, thereby emphasizing open-source methods.

EXPERT OPINION: Quantitative and qualitative generation and analysis of chemical space require novel approaches for handling the increasing number of molecules and their information available in chemical databases (including emerging ultra-large libraries). In addition, it is of utmost importance to note that chemical space is a conceptual framework that goes beyond visual representation in low dimensions. However, the graphical representation of chemical space has several practical applications in drug discovery and beyond.
Affiliation(s)
- Fernanda I Saldívar-González
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- José L Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico

20
Harada Y, Hatakeyama M, Maeda S, Gao Q, Koizumi K, Sakamoto Y, Ono Y, Nakamura S. Molecular Design Learned from the Natural Product Porphyra-334: Molecular Generation via Chemical Variational Autoencoder versus Database Mining via Similarity Search, A Comparative Study. ACS Omega 2022; 7:8581-8590. [PMID: 35309498 PMCID: PMC8928499 DOI: 10.1021/acsomega.1c06453] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/16/2021] [Accepted: 02/18/2022] [Indexed: 06/14/2023]
Abstract
A comparative study is presented. The method via chemical variational autoencoder (VAE) and the method via similarity search are compared, focusing on their generation ability for new functional molecular design. Focusing on the natural porphyra-334 as a model molecule, we generated three groups: molecules of mycosporine-like amino acids (MAAs) as seeds (G_SEEDS), molecules generated via chemical VAE (G_VAE) and molecules gathered via similarity search (G_SIM). The number of molecules that satisfy the condition for the light absorption ability of porphyra-334 in G_SEEDS, G_VAE, and G_SIM are 52, 138, and 6, respectively. The method via chemical VAE shows a promising potential for future molecular design. By using quantum chemistry wave function properties for chemical VAE, we find new molecules that are comparable to porphyra-334, including some with unexpected geometries. At the end, we show a group of molecules found with this method.
Affiliation(s)
- Yuki Harada
  - Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Makoto Hatakeyama
  - Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
  - Sanyo-Onoda City University, 1-1-1 Daigakudori, Sanyo-Onoda, Yamaguchi 756-0884, Japan
- Shuichi Maeda
  - Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Qi Gao
  - Mitsubishi Chemical Corporation Science & Innovation Center, 1000 Kamoshida-cho, Yokohama, Kanagawa 227-8502, Japan
- Kenichi Koizumi
  - Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Yuki Sakamoto
  - Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Yuuki Ono
  - Mitsubishi Chemical Corporation Science & Innovation Center, 1000 Kamoshida-cho, Yokohama, Kanagawa 227-8502, Japan
- Shinichiro Nakamura
  - Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan

21
Saldívar-González FI, Aldas-Bulos VD, Medina-Franco JL, Plisson F. Natural product drug discovery in the artificial intelligence era. Chem Sci 2022; 13:1526-1546. [PMID: 35282622 PMCID: PMC8827052 DOI: 10.1039/d1sc04471k] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 08/13/2021] [Accepted: 12/10/2021] [Indexed: 12/19/2022]
Abstract
Natural products (NPs) are primarily recognized as privileged structures to interact with protein drug targets. Their unique characteristics and structural diversity continue to marvel scientists for developing NP-inspired medicines, even though the pharmaceutical industry has largely given up. High-performance computer hardware, extensive storage, accessible software and affordable online education have democratized the use of artificial intelligence (AI) in many sectors and research areas. The last decades have introduced natural language processing and machine learning algorithms, two subfields of AI, to tackle NP drug discovery challenges and open up opportunities. In this article, we review and discuss the rational applications of AI approaches developed to assist in discovering bioactive NPs and capturing the molecular "patterns" of these privileged structures for combinatorial design or target selectivity.
Affiliation(s)
- F I Saldívar-González
  - DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
- V D Aldas-Bulos
  - Unidad de Genómica Avanzada, Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Guanajuato, Mexico
- J L Medina-Franco
  - DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
- F Plisson
  - CONACYT - Unidad de Genómica Avanzada, Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Guanajuato, Mexico

22
Goryashchenko AS, Uvarova VI, Osolodkin DI, Ishmukhametov AA. Discovery of small molecule antivirals targeting tick-borne encephalitis virus. Annual Reports in Medicinal Chemistry 2022. [DOI: 10.1016/bs.armc.2022.08.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]

23
Andronov M, Fedorov MV, Sosnin S. Exploring Chemical Reaction Space with Reaction Difference Fingerprints and Parametric t-SNE. ACS Omega 2021; 6:30743-30751. [PMID: 34805702 PMCID: PMC8600617 DOI: 10.1021/acsomega.1c04778] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/31/2021] [Accepted: 10/18/2021] [Indexed: 06/13/2023]
Abstract
Humans prefer visual representations for the analysis of large databases. In this work, we suggest a method for the visualization of the chemical reaction space. Our technique uses the t-SNE approach that is parameterized using a deep neural network (parametric t-SNE). We demonstrated that the parametric t-SNE combined with reaction difference fingerprints could provide a tool for the projection of chemical reactions on a low-dimensional manifold for easy exploration of reaction space. We showed that the global reaction landscape projected on a 2D plane corresponds well with the already known reaction types. The application of a pretrained parametric t-SNE model to new reactions allows chemists to study these reactions in a global reaction space. We validated the feasibility of this approach for two commercial drugs, darunavir and montelukast. We believe that our method can help to explore reaction space and will inspire chemists to find new reactions and synthetic ways.
Affiliation(s)
- Mikhail Andronov
  - Faculty of Fundamental Physical and Chemical Engineering, Lomonosov Moscow State University, Leninskie gory, 1, Moscow 119991, Russian Federation
- Maxim V. Fedorov
  - Sirius University of Science and Technology, Olimpiysky Ave. b.1, Sochi 354000, Russian Federation
  - Syntelly LLC, Bolshoy Boulevard 30, bld. 1, Moscow 121205, Russian Federation
  - Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30, bld. 1, Moscow 121205, Russian Federation
- Sergey Sosnin
  - Syntelly LLC, Bolshoy Boulevard 30, bld. 1, Moscow 121205, Russian Federation
  - Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30, bld. 1, Moscow 121205, Russian Federation

24
Santiago Á, Guzmán-Ocampo DC, Aguayo-Ortiz R, Dominguez L. Characterizing the Chemical Space of γ-Secretase Inhibitors and Modulators. ACS Chem Neurosci 2021; 12:2765-2775. [PMID: 34291906 DOI: 10.1021/acschemneuro.1c00313] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
γ-Secretase (GS) is one of the most attractive molecular targets for the treatment of Alzheimer's disease (AD). Its key role in the final step of amyloid-β peptides generation and its relationship in the cascade of events for disease development have caught the attention of many pharmaceutical groups. Over the past years, different inhibitors and modulators have been evaluated as promising therapeutics against AD. However, despite the great chemical diversity of the reported compounds, a global classification and visual representation of the chemical space for GS inhibitors and modulators remain unavailable. In the present work, we carried out a two-dimensional (2D) chemical space analysis from different classes and subclasses of GS inhibitors and modulators based on their structural similarity. Along with the novel structural information available for GS complexes, our analysis opens the possibility to identify compounds with high molecular similarity, critical to finding new chemical structures through the optimization of existing compounds and relating them with a potential binding site.
Affiliation(s)
- Ángel Santiago
  - Departamento de Fisicoquímica, Facultad de Química, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Dulce C. Guzmán-Ocampo
  - Departamento de Fisicoquímica, Facultad de Química, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Rodrigo Aguayo-Ortiz
  - Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Laura Dominguez
  - Departamento de Fisicoquímica, Facultad de Química, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico

25
Medina-Franco JL, Sánchez-Cruz N, López-López E, Díaz-Eufracio BI. Progress on open chemoinformatic tools for expanding and exploring the chemical space. J Comput Aided Mol Des 2021; 36:341-354. [PMID: 34143323 PMCID: PMC8211976 DOI: 10.1007/s10822-021-00399-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/10/2021] [Accepted: 06/14/2021] [Indexed: 01/10/2023]
Abstract
The concept of chemical space is a cornerstone in chemoinformatics, and it has broad conceptual and practical applicability in many areas of chemistry, including drug design and discovery. One of the most considerable impacts is in the study of structure-property relationships where the property can be a biological activity or any other characteristic of interest to a particular chemistry discipline. The chemical space is highly dependent on the molecular representation that is also a cornerstone concept in computational chemistry. Herein, we discuss the recent progress on chemoinformatic tools developed to expand and characterize the chemical space of compound data sets using different types of molecular representations, generate visual representations of such spaces, and explore structure-property relationships in the context of chemical spaces. We emphasize the development of methods and freely available tools focusing on drug discovery applications. We also comment on the general advantages and shortcomings of using freely available and easy-to-use tools and discuss the value of using such open resources for research, education, and scientific dissemination.
Affiliation(s)
- José L Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
- Norberto Sánchez-Cruz
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
- Edgar López-López
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
  - Departamento de Química y Programa de Posgrado en Farmacología, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Apartado 14-740, 07000, Mexico City, Mexico
- Bárbara I Díaz-Eufracio
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico

26
Vasyuchenko EP, Orekhov PS, Armeev GA, Bozdaganyan ME. CPE-DB: An Open Database of Chemical Penetration Enhancers. Pharmaceutics 2021; 13:66. [PMID: 33430205 PMCID: PMC7825720 DOI: 10.3390/pharmaceutics13010066] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/23/2020] [Revised: 12/22/2020] [Accepted: 12/28/2020] [Indexed: 11/21/2022]
Abstract
The cutaneous delivery route currently accounts for almost 10% of all administered drugs and it is becoming more common. Chemical penetration enhancers (CPEs) increase the transport of drugs across skin layers by different mechanisms that depend on the chemical nature of the penetration enhancers. In our work, we created a chemical penetration enhancer database (CPE-DB) that is, to the best of our knowledge, the first CPE database. We collected information about known enhancers and their derivatives in a single database, and classified and characterized their molecular diversity in terms of scaffold content, key chemical moieties, molecular descriptors, etc. CPE-DB can be used for virtual screening and similarity search to identify new potent and safe enhancers, building quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models, and other machine-learning (ML) applications for the prediction of biological activity.
Affiliation(s)
- Ekaterina P. Vasyuchenko
  - School of Biology, Lomonosov Moscow State University, 119234 Moscow, Russia
- Philipp S. Orekhov
  - School of Biology, Lomonosov Moscow State University, 119234 Moscow, Russia
  - Institute of Personalized Medicine, Sechenov University, 119991 Moscow, Russia
  - Research Center of Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- Grigoriy A. Armeev
  - School of Biology, Lomonosov Moscow State University, 119234 Moscow, Russia
- Marine E. Bozdaganyan
  - School of Biology, Lomonosov Moscow State University, 119234 Moscow, Russia
  - N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
  - Department of ChemBioTech, Polytechnic University, B. Semyonovskaya 38, 107023 Moscow, Russia

27
Medina-Franco JL, Saldívar-González FI. Cheminformatics to Characterize Pharmacologically Active Natural Products. Biomolecules 2020; 10:E1566. [PMID: 33213003 PMCID: PMC7698493 DOI: 10.3390/biom10111566] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 10/12/2020] [Revised: 11/11/2020] [Accepted: 11/14/2020] [Indexed: 12/19/2022]
Abstract
Natural products have a significant role in drug discovery. Natural products have distinctive chemical structures that have contributed to identifying and developing drugs for different therapeutic areas. Moreover, natural products are significant sources of inspiration or starting points to develop new therapeutic agents. Natural products such as peptides and macrocycles, and other compounds with unique features represent attractive sources to address complex diseases. Computational approaches that use chemoinformatics and molecular modeling methods contribute to speed up natural product-based drug discovery. Several research groups have recently used computational methodologies to organize data, interpret results, generate and test hypotheses, filter large chemical databases before the experimental screening, and design experiments. This review discusses a broad range of chemoinformatics applications to support natural product-based drug discovery. We emphasize profiling natural product data sets in terms of diversity; complexity; acid/base; absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties; and fragment analysis. Novel techniques for the visual representation of the chemical space are also discussed.
Affiliation(s)
- José L. Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico;
|
28
|
Aguilera-Mendoza L, Marrero-Ponce Y, García-Jacas CR, Chavez E, Beltran JA, Guillen-Ramirez HA, Brizuela CA. Automatic construction of molecular similarity networks for visual graph mining in chemical space of bioactive peptides: an unsupervised learning approach. Sci Rep 2020; 10:18074. [PMID: 33093586 PMCID: PMC7583304 DOI: 10.1038/s41598-020-75029-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 04/05/2020] [Accepted: 09/23/2020] [Indexed: 12/15/2022] Open
Abstract
The increasing interest in bioactive peptides with therapeutic potentials has been reflected in a large variety of biological databases published over the last years. However, the knowledge discovery process from these heterogeneous data sources is a nontrivial task, becoming the essence of our research endeavor. Therefore, we devise a unified data model based on molecular similarity networks for representing a chemical reference space of bioactive peptides, having an implicit knowledge that is currently not explicitly accessed in existing biological databases. Indeed, our main contribution is a novel workflow for the automatic construction of such similarity networks, enabling visual graph mining techniques to uncover new insights from the "ocean" of known bioactive peptides. The workflow presented here relies on the following sequential steps: (i) calculation of molecular descriptors by applying statistical and aggregation operators on amino acid property vectors; (ii) a two-stage unsupervised feature selection method to identify an optimized subset of descriptors using the concepts of entropy and mutual information; (iii) generation of sparse networks where nodes represent bioactive peptides, and edges between two nodes denote their pairwise similarity/distance relationships in the defined descriptor space; and (iv) exploratory analysis using visual inspection in combination with clustering and network science techniques. For practical purposes, the proposed workflow has been implemented in our visual analytics software tool ( http://mobiosd-hub.com/starpep/ ), to assist researchers in extracting useful information from an integrated collection of 45120 bioactive peptides, which is one of the largest and most diverse data in its field. Finally, we illustrate the applicability of the proposed workflow for discovering central nodes in molecular similarity networks that may represent a biologically relevant chemical space known to date.
Affiliation(s)
- Longendri Aguilera-Mendoza
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Av. Interoceánica Km 12 1/2 y Av. Florencia, 17-1200-841, Quito, Ecuador.
- Grupo GINUMED, Corporacion Universitaria Rafael Nuñez. Facultad de Salud, Programa de Medicina, Cartagena, Colombia.
- Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain.
| | - César R García-Jacas
- Cátedras Conacyt - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, Mexico
| | - Edgar Chavez
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico
| | - Jesus A Beltran
- Department of Informatics, University of California, Irvine, Irvine, CA, USA
| | - Hugo A Guillen-Ramirez
- Department of BioMedical Research (DBMR), University of Bern, Bern, 3008, Switzerland
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland
| | - Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico.
|
29
|
Hudson IL, Leemaqz SY, Abell AD. Machine Learning and Scoring Functions (SFs) for Molecular Drug Discovery: Prediction and Characterisation of Druggable Drugs and Targets. MACHINE LEARNING IN CHEMISTRY 2020:251-279. [DOI: 10.1039/9781839160233-00251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2025]
Abstract
Predicting druggability and prioritising disease-modifying targets is critical in drug discovery. In this chapter, we describe the testing of a druggability rule based on 9 molecular parameters, which uses cutpoints for each molecular parameter and targets based on mixture clustering discriminant analysis. We demonstrate that principal component constructs and score functions of violations can be used to identify the hidden pattern of druggable molecules and disease targets. Random Forest and Artificial Neural Network rules to classify the high-score target from the low-score molecular violators, based both on molecular parameters and the principal component constructs, have confirmed the value of logD's inclusion in the scoring function. Our scoring functions of counts of violations and novel principal component analytic molecular and target-based constructs partitioned chemospace well, identifying both good and poor druggable molecules and targets. Viable molecules and targets were located in both the beyond Rule of 5 and expanded Rule of 5 regions. Random Forest and Artificial Neural Networks showed different variable importance profiles, with Artificial Neural Networks models performing better than Random Forests. The most important molecular descriptors that influence classification, by the Random Forest methods, were MW, NATOM, logD, and PSA. The optimal Artificial Neural Networks target models indicated that PSA and logD were more important than the traditional parameter MW. Overall, our score 4 partitions using logD were optimal at classification as shown in all Random Forests and Artificial Neural Networks analyses.
Affiliation(s)
- I. L. Hudson
- Mathematical Sciences, College of Science, Engineering and Health, Royal Melbourne Institute of Technology (RMIT) Melbourne Victoria Australia
| | - S. Y. Leemaqz
- Robinson Research Institute, Adelaide Medical School, University of Adelaide Adelaide South Australia
| | - A. D. Abell
- Department of Chemistry, Adelaide Node Director Centre for Nanoscale BioPhotonics (CNBP), University of Adelaide Adelaide South Australia
|
30
|
Computational-aided design of a library of lactams through a diversity-oriented synthesis strategy. Bioorg Med Chem 2020; 28:115539. [PMID: 32503698 DOI: 10.1016/j.bmc.2020.115539] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/15/2020] [Revised: 04/10/2020] [Accepted: 04/29/2020] [Indexed: 02/07/2023]
Abstract
Small molecule libraries for virtual screening are becoming a well-established tool for the identification of new hit compounds. As for experimental assays, the library quality, defined in terms of structural complexity and diversity, is crucial to increase the chance of a successful outcome in the screening campaign. In this context, Diversity-Oriented Synthesis has proven to be very effective, as the compounds generated are structurally complex and differ not only for the appendages, but also for the molecular scaffold. In this work, we automated the design of a library of lactams by applying a Diversity-Oriented Synthesis strategy called Build/Couple/Pair. We evaluated the novelty and diversity of these compounds by comparing them with lactam moieties contained in approved drugs, natural products, and bioactive compounds from ChEMBL. Finally, depending on their scaffold we classified them into β-, γ-, δ-, ε-, and isolated, fused, bridged and spirolactam groups and we assessed their drug-like and lead-like properties, thus providing the value of this novel in silico designed library for medicinal chemistry applications.
|
31
|
Capecchi A, Zhang A, Reymond JL. Populating Chemical Space with Peptides Using a Genetic Algorithm. J Chem Inf Model 2020; 60:121-132. [PMID: 31868369 DOI: 10.1021/acs.jcim.9b01014] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/13/2022]
Abstract
In drug discovery, one uses chemical space as a concept to organize molecules according to their structures and properties. One often would like to generate new possible molecules at a specific location in the chemical space marked by a molecule of interest. Herein, we report the peptide design genetic algorithm (PDGA, code available at https://github.com/reymond-group/PeptideDesignGA ), a computational tool capable of producing peptide sequences of various topologies (linear, cyclic/polycyclic, or dendritic) in proximity of any molecule of interest in a chemical space defined by macromolecule extended atom-pair fingerprint (MXFP), an atom-pair fingerprint describing molecular shape and pharmacophores. We show that the PDGA generates high-similarity analogues of bioactive peptides with diverse peptide chain topologies and of nonpeptide target molecules. We illustrate the chemical space accessible by the PDGA with an interactive 3D map of the MXFP property space available at http://faerun.gdb.tools/ . The PDGA should be generally useful to generate peptides at any location in the chemical space.
Affiliation(s)
- Alice Capecchi
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Alain Zhang
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
32
|
Medina-Franco JL, Naveja JJ, López-López E. Reaching for the bright StARs in chemical space. Drug Discov Today 2019; 24:2162-2169. [PMID: 31557448 DOI: 10.1016/j.drudis.2019.09.013] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 08/10/2019] [Revised: 09/10/2019] [Accepted: 09/17/2019] [Indexed: 02/07/2023]
Abstract
Visualization of activity data in chemical space is common in drug discovery. Navigating the space in a systematic manner is not trivial, given its size and huge coverage. To this end, methods for data visualization have been developed charting biological activity into chemical space. Herein, we review the progress in different visualization approaches to explore the chemical space aiming at reaching insightful structure-activity relationships (SARs) in the chemical space. We discuss recent methods including consensus diversity plots, ChemMaps, and constellation plots. Several of the methods we review can be extended to analyze other properties of interest in medicinal chemistry, such as structure-toxicity relationships, and can be adapted to postprocess results of virtual screening (VS) of large compound libraries.
Affiliation(s)
- José L Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico.
| | - J Jesús Naveja
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico; PECEM, School of Medicine, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
| | - Edgar López-López
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
|
33
|
Naveja JJ, Medina-Franco JL. Finding Constellations in Chemical Space Through Core Analysis. Front Chem 2019; 7:510. [PMID: 31380353 PMCID: PMC6646408 DOI: 10.3389/fchem.2019.00510] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 05/14/2019] [Accepted: 07/03/2019] [Indexed: 12/15/2022] Open
Abstract
Herein we introduce the constellation plots as a general approach that merges different and complementary molecular representations to enhance the information contained in a visual representation and analysis of chemical space. The method is based on a combination of a sub-structure based representation and classification of compounds with a "classical" coordinate-based representation of chemical space. A distinctive outcome of the method is that organizing the compounds in analog series leads to the formation of groups of molecules, aka "constellations" in chemical space. The novel approach is general and can be used to rapidly identify, for instance, insightful and "bright" Structure-Activity Relationships (StARs) in chemical space that are easy to interpret. This kind of analysis is expected to be especially useful for lead identification in large datasets of unannotated molecules, such as those obtained through high-throughput screening. We demonstrate the application of the method using two datasets of focused inhibitors designed against DNMTs and AKT1.
Affiliation(s)
- J. Jesús Naveja
- PECEM, School of Medicine, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City, Mexico
| | - José L. Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
34
|
Saldívar-González FI, Pilón-Jiménez BA, Medina-Franco JL. Chemical space of naturally occurring compounds. PHYSICAL SCIENCES REVIEWS 2019. [DOI: 10.1515/psr-2018-0103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
The chemical space of naturally occurring compounds is vast and diverse. Other than biologics, naturally occurring small molecules include a large variety of compounds covering natural products from different sources such as plant, marine, and fungi, to name a few, and several food chemicals. The systematic exploration of the chemical space of naturally occurring compounds has significant implications in many areas of research including but not limited to drug discovery, nutrition, bio- and chemical diversity analysis. The exploration of the coverage and diversity of the chemical space of compound databases can be carried out in different ways. The approach will largely depend on the criteria to define the chemical space that is commonly selected based on the goals of the study. This chapter discusses major compound databases of natural products and cheminformatics strategies that have been used to characterize the chemical space of natural products. Recent exemplary studies of the chemical space of natural products from different sources and their relationships with other compounds are also discussed. We also present novel chemical descriptors and data mining approaches that are emerging to characterize the chemical space of naturally occurring compounds.
|
35
|
Kunkel C, Schober C, Oberhofer H, Reuter K. Knowledge discovery through chemical space networks: the case of organic electronics. J Mol Model 2019; 25:87. [DOI: 10.1007/s00894-019-3950-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/13/2018] [Accepted: 01/29/2019] [Indexed: 12/14/2022]
|
36
|
López-López E, Naveja JJ, Medina-Franco JL. DataWarrior: an evaluation of the open-source drug discovery tool. Expert Opin Drug Discov 2019; 14:335-341. [PMID: 30806519 DOI: 10.1080/17460441.2019.1581170] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 10/27/2022]
Abstract
Introduction: DataWarrior is open and interactive software for data analysis and visualization that integrates well-established and novel chemoinformatics algorithms in a single environment. Since its public release in 2014, DataWarrior has been used by research groups in universities, government, and industry. Areas covered: Herein, the authors discuss, in a critical manner, the tools and distinct technical features of DataWarrior and analyze the areas of opportunity. Authors also present the most common applications as well as emerging uses in research areas beyond drug discovery with an emphasis on multidisciplinary projects. Expert opinion: In the era of big data and data-driven science, DataWarrior stands out as a technology that combines prediction of physicochemical properties of pharmaceutical interest, cheminformatics calculations, multivariate data analysis, and interactive visualization with dynamic plots. The well-established chemoinformatics tools implemented in DataWarrior, as well as the innovative algorithms, make the technology useful and attractive as revealed by the increasing number of documented applications.
Affiliation(s)
- Edgar López-López
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico; Medicinal Chemistry Laboratory, University of Veracruz, Veracruz, Mexico
| | - J Jesús Naveja
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico; PECEM, Faculty of Medicine, National Autonomous University of Mexico, Mexico City, Mexico
| | - José L Medina-Franco
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
|
37
|
Orlov AA, Khvatov EV, Koruchekov AA, Nikitina AA, Zolotareva AD, Eletskaya AA, Kozlovskaya LI, Palyulin VA, Horvath D, Osolodkin DI, Varnek A. Getting to Know the Neighbours with GTM: The Case of Antiviral Compounds. Mol Inform 2019; 38:e1800166. [DOI: 10.1002/minf.201800166] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/12/2018] [Accepted: 02/02/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Alexey A. Orlov
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | | | - Alexander A. Koruchekov
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Anastasia A. Nikitina
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Anastasia D. Zolotareva
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | - Anastasia A. Eletskaya
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Liubov I. Kozlovskaya
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | | | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, Strasbourg 67081, France
| | - Dmitry I. Osolodkin
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, Strasbourg 67081, France
|
38
|
BIOFACQUIM: A Mexican Compound Database of Natural Products. Biomolecules 2019; 9:biom9010031. [PMID: 30658522 PMCID: PMC6358837 DOI: 10.3390/biom9010031] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 11/29/2018] [Revised: 12/28/2018] [Accepted: 01/15/2019] [Indexed: 12/22/2022] Open
Abstract
Compound databases of natural products have a major impact on drug discovery projects and other areas of research. The number of databases in the public domain with compounds with natural origins is increasing. Several countries, Brazil, France, Panama and, recently, Vietnam, have initiatives in place to construct and maintain compound databases that are representative of their diversity. In this proof-of-concept study, we discuss the first version of BIOFACQUIM, a novel compound database with natural products isolated and characterized in Mexico. We discuss its construction, curation, and a complete chemoinformatic characterization of the content and coverage in chemical space. The profile of physicochemical properties, scaffold content, and diversity, as well as structural diversity based on molecular fingerprints is reported. BIOFACQUIM is available for free.
|
39
|
Karlov DS, Sosnin S, Tetko IV, Fedorov MV. Chemical space exploration guided by deep neural networks. RSC Adv 2019; 9:5151-5157. [PMID: 35514634 PMCID: PMC9060647 DOI: 10.1039/c8ra10182e] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 12/11/2018] [Accepted: 01/29/2019] [Indexed: 11/21/2022] Open
Abstract
A parametric t-SNE approach based on deep feed-forward neural networks was applied to the chemical space visualization problem. It is able to retain more information than certain dimensionality reduction techniques used for this purpose (principal component analysis (PCA), multidimensional scaling (MDS)). The applicability of this method to some chemical space navigation tasks (activity cliffs and activity landscapes identification) is discussed. We created a simple web tool to illustrate our work (http://space.syntelly.com).
Affiliation(s)
- Dmitry S. Karlov
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
| | - Sergey Sosnin
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- Syntelly LLC
| | - Igor V. Tetko
- Helmholtz Zentrum München – Research Center for Environmental Health (GmbH), Institute of Structural Biology, Germany
- BIGCHEM GmbH, Germany
| | - Maxim V. Fedorov
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- Syntelly LLC
|
40
|
Nikitina AA, Orlov AA, Kozlovskaya LI, Palyulin VA, Osolodkin DI. Enhanced taxonomy annotation of antiviral activity data from ChEMBL. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019; 2019:5308407. [PMID: 30753475 PMCID: PMC6367519 DOI: 10.1093/database/bay139] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/21/2018] [Accepted: 12/09/2018] [Indexed: 11/14/2022]
Abstract
The discovery of antiviral drugs is a rapidly developing area of medicinal chemistry research. The emergence of resistant variants and outbreaks of poorly studied viral diseases make this area constantly developing. The amount of antiviral activity data available in ChEMBL consistently grows, but virus taxonomy annotation of these data is not sufficient for thorough studies of antiviral chemical space. We developed a procedure for semi-automatic extraction of antiviral activity data from ChEMBL and mapped them to the virus taxonomy developed by the International Committee for Taxonomy of Viruses (ICTV). The procedure is based on the lists of virus-related values of ChEMBL annotation fields and a dictionary of virus names and acronyms mapped to ICTV taxa. Application of this data extraction procedure allows retrieving from ChEMBL 1.6 times more assays linked to 2.5 times more compounds and data points than ChEMBL web interface allows. Mapping of these data to ICTV taxa allows analyzing all the compounds tested against each viral species. Activity values and structures of the compounds were standardized, and the antiviral activity profile was created for each standard structure. Data set compiled using this algorithm was called ViralChEMBL. As case studies, we compared descriptor and scaffold distributions for the full ChEMBL and its 'viral' and 'non-viral' subsets, identified the most studied compounds and created a self-organizing map for ViralChEMBL. Our approach to data annotation appeared to be a very efficient tool for the study of antiviral chemical space.
Affiliation(s)
- Anastasia A Nikitina
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia; Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
| | - Alexey A Orlov
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia; Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
| | - Liubov I Kozlovskaya
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia; Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Moscow, Russia
| | | | - Dmitry I Osolodkin
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia; Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia; Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Moscow, Russia
|
41
|
Saldívar-González FI, Lenci E, Trabocchi A, Medina-Franco JL. Exploring the chemical space and the bioactivity profile of lactams: a chemoinformatic study. RSC Adv 2019; 9:27105-27116. [PMID: 35528563 PMCID: PMC9070607 DOI: 10.1039/c9ra04841c] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 06/26/2019] [Accepted: 08/17/2019] [Indexed: 01/04/2023] Open
Abstract
Lactams are a class of compounds important for drug design, due to their great variety of potential therapeutic applications, spanning cancer, diabetes, and infectious diseases. So far, the biological profile and chemical diversity of lactams have not been characterized in a systematic and detailed manner. In this work, we report the chemoinformatic analysis of beta-, gamma-, delta- and epsilon-lactams present in databases of approved drugs, natural products, and bioactive compounds from the large public database ChEMBL. We identified the main biological targets in which the lactams have been evaluated according to their chemical classification. We also identified the most frequent scaffolds and those that can be prioritized in chemical synthesis, since they are scaffolds with potential biological activity but with few reported analogs. Results of the biological and chemoinformatic analysis of lactams indicate that spiro- and bridged-lactams belong to classes with the lowest number of compounds and unique scaffolds, and some showing activity against specific targets. Information obtained from this analysis allows focusing the design of new chemical structures in less explored spaces and with increased possibilities of success.
Affiliation(s)
| | - Elena Lenci
- Department of Chemistry “Ugo Schiff”, University of Florence, 50019 Sesto Fiorentino, Italy
| | - Andrea Trabocchi
- Department of Chemistry “Ugo Schiff”, University of Florence, 50019 Sesto Fiorentino, Italy
- Interdepartmental Center for Preclinical Development of Molecular Imaging (CISPIM)
| | - José L. Medina-Franco
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
|
42
|
Saldívar-González FI, Valli M, Andricopulo AD, da Silva Bolzani V, Medina-Franco JL. Chemical Space and Diversity of the NuBBE Database: A Chemoinformatic Characterization. J Chem Inf Model 2018; 59:74-85. [DOI: 10.1021/acs.jcim.8b00619] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Indexed: 12/26/2022]
Affiliation(s)
- Fernanda I. Saldívar-González
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
| | - Marilia Valli
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060 Araraquara, Sao Paulo, Brazil
| | - Adriano D. Andricopulo
- Laboratório de Química Medicinal e Computacional (LQMC), Centro de Pesquisa e Inovação em Biodiversidade e Fármacos, Institute of Physics of Sao Carlos, University of Sao Paulo - USP, 13563-120 Sao Carlos, Sao Paulo, Brazil
| | - Vanderlan da Silva Bolzani
- Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, Sao Paulo State University - UNESP, 14800-060 Araraquara, Sao Paulo, Brazil
| | - José L. Medina-Franco
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
|
43
|
Wolfender JL, Nuzillard JM, van der Hooft JJJ, Renault JH, Bertrand S. Accelerating Metabolite Identification in Natural Product Research: Toward an Ideal Combination of Liquid Chromatography–High-Resolution Tandem Mass Spectrometry and NMR Profiling, in Silico Databases, and Chemometrics. Anal Chem 2018; 91:704-742. [DOI: 10.1021/acs.analchem.8b05112] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Indexed: 02/07/2023]
Affiliation(s)
- Jean-Luc Wolfender
- School of Pharmaceutical Sciences, EPGL, University of Geneva, University of Lausanne, CMU, 1 Rue Michel Servet, 1211 Geneva 4, Switzerland
| | - Jean-Marc Nuzillard
- Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, Université de Reims Champagne Ardenne, 51687 Reims Cedex 2, France
| | | | - Jean-Hugues Renault
- Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, Université de Reims Champagne Ardenne, 51687 Reims Cedex 2, France
| | - Samuel Bertrand
- Groupe Mer, Molécules, Santé-EA 2160, UFR des Sciences Pharmaceutiques et Biologiques, Université de Nantes, 44035 Nantes, France
- ThalassOMICS Metabolomics Facility, Plateforme Corsaire, Biogenouest, 44035 Nantes, France
|
44
|
Opassi G, Gesù A, Massarotti A. The hitchhiker’s guide to the chemical-biological galaxy. Drug Discov Today 2018; 23:565-574. [DOI: 10.1016/j.drudis.2018.01.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/09/2017] [Revised: 11/25/2017] [Accepted: 01/04/2018] [Indexed: 12/21/2022]
|
45
|
Naveja JJ, Medina-Franco JL. Insights from pharmacological similarity of epigenetic targets in epipolypharmacology. Drug Discov Today 2018; 23:141-150. [DOI: 10.1016/j.drudis.2017.10.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/03/2017] [Revised: 09/05/2017] [Accepted: 10/05/2017] [Indexed: 01/10/2023]
|
46
|
Naveja JJ, Oviedo-Osornio CI, Trujillo-Minero NN, Medina-Franco JL. Chemoinformatics: a perspective from an academic setting in Latin America. Mol Divers 2017; 22:247-258. [PMID: 29204824 DOI: 10.1007/s11030-017-9802-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/05/2017] [Accepted: 11/26/2017] [Indexed: 12/13/2022]
Abstract
This perspective discusses the current progress of a chemoinformatics group in a major university in Latin America. Three major aspects are discussed in a critical manner: research, education, and collaboration with industry and other public research networks. It is also presented an overview of the progress in applied research and development of research concepts. Efforts to teach chemoinformatics at the undergraduate and graduate levels are discussed. It is addressed how the partnership with industry and other not-for-profit research institutions not only brings additional sources of funding but, more importantly, increases the impact of the multidisciplinary work and offers the students to be exposed to other research environments. We also discuss the main perspectives and challenges that remain to be addressed in these settings.
Affiliation(s)
- J Jesús Naveja
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico; PECEM, Facultad de Medicina, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
| | - C Iluhí Oviedo-Osornio
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
| | - Nicole N Trujillo-Minero
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
| | - José L Medina-Franco
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
| |
|
47
|
Visini R, Arús-Pous J, Awale M, Reymond JL. Virtual Exploration of the Ring Systems Chemical Universe. J Chem Inf Model 2017; 57:2707-2718. [PMID: 29019686 DOI: 10.1021/acs.jcim.7b00457] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 12/30/2022]
Abstract
Here, we explore the chemical space of all virtually possible organic molecules focusing on ring systems, which represent the cyclic cores of organic molecules obtained by removing all acyclic bonds and converting all remaining atoms to carbon. This approach circumvents the combinatorial explosion encountered when enumerating the molecules themselves. We report the chemical universe database GDB4c containing 916 130 ring systems up to four saturated or aromatic rings and maximum ring size of 14 atoms and GDB4c3D containing the corresponding 6 555 929 stereoisomers. Almost all (98.6%) of these ring systems are unknown and represent chiral 3D-shaped macrocycles containing small rings and quaternary centers reminiscent of polycyclic natural products. We envision that GDB4c can serve to select new ring systems from which to design analogs of such natural products. The database is available for download at www.gdb.unibe.ch together with interactive visualization and search tools as a resource for molecular design.
Affiliation(s)
- Ricardo Visini
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Josep Arús-Pous
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| |
|
48
|
Olmedo DA, González-Medina M, Gupta MP, Medina-Franco JL. Cheminformatic characterization of natural products from Panama. Mol Divers 2017; 21:779-789. [PMID: 28831697 DOI: 10.1007/s11030-017-9781-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 05/13/2017] [Accepted: 08/07/2017] [Indexed: 12/26/2022]
Abstract
In this work, we discuss the characterization and diversity analysis of 354 natural products (NPs) from Panama, systematically analyzed for the first time. The in-house database was compared to NPs from Brazil, compounds from Traditional Chinese Medicine, natural and semisynthetic collections used in high-throughput screening, and compounds from ChEMBL. An analysis of the "global diversity" was conducted using molecular properties of pharmaceutical interest, three molecular fingerprints of different design, molecular scaffolds, and molecular complexity. The global diversity was visualized using consensus diversity plots that revealed that the secondary metabolites in the Panamanian flora have a large scaffold diversity as compared to other composite databases and also have several unique scaffolds. The large scaffold diversity is in agreement with the broad range of biological activities that this collection of NPs from Panama has shown. This study also provided further quantitative evidence of the large structural complexity of NPs. The results obtained in this study support that NPs from Panama are promising candidates to identify selective molecules and are suitable sources of compounds for virtual screening campaigns.
Affiliation(s)
- Dionisio A Olmedo
- CIFLORPAN, Center for Pharmacognostic Research on Panamanian Flora, College of Pharmacy, University of Panama, Campus Universitario Octavio Méndez Pereira, Avenida Octavio Méndez Pereira, P.O. Box 0824-00172, Panama City, Republic of Panama
| | - Mariana González-Medina
- Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
| | - Mahabir P Gupta
- CIFLORPAN, Center for Pharmacognostic Research on Panamanian Flora, College of Pharmacy, University of Panama, Campus Universitario Octavio Méndez Pereira, Avenida Octavio Méndez Pereira, P.O. Box 0824-00172, Panama City, Republic of Panama
| | - José L Medina-Franco
- Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
| |
|
49
|
Abstract
The generation of conformations for small molecules is a problem of continuing interest in cheminformatics and computational drug discovery. This review will present an overview of methods used to sample conformational space, focusing on those methods designed for organic molecules commonly of interest in drug discovery. Different approaches to both the sampling of conformational space and the scoring of conformational stability will be compared and contrasted, with an emphasis on those methods suitable for conformer sampling of large numbers of drug-like molecules. Particular attention will be devoted to the appropriate utilization of information from experimental solid-state structures in validating and evaluating the performance of these tools. The review will conclude with some areas worthy of further investigation.
Affiliation(s)
- Paul C D Hawkins
- OpenEye Scientific, 9 Bisbee Court, Suite D, Santa Fe, New Mexico 87508, United States
| |
|
50
|
Naveja JJ, Medina-Franco JL. ChemMaps: Towards an approach for visualizing the chemical space based on adaptive satellite compounds. F1000Res 2017; 6. [PMID: 28794856 PMCID: PMC5538041 DOI: 10.12688/f1000research.12095.2] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Accepted: 08/03/2017] [Indexed: 01/22/2023]
Abstract
We present a novel approach called ChemMaps for visualizing chemical space based on the similarity matrix of compound datasets generated with molecular fingerprints’ similarity. The method uses a ‘satellites’ approach, where satellites are, in principle, molecules whose similarity to the rest of the molecules in the database provides sufficient information for generating a visualization of the chemical space. Such an approach could help make chemical space visualizations more efficient. We hereby describe a proof-of-principle application of the method to various databases that have different diversity measures. Unsurprisingly, we found the method works better with databases that have low 2D diversity. 3D diversity played a secondary role, although it seems to be more relevant as 2D diversity increases. For less diverse datasets, taking as few as 25% satellites seems to be sufficient for a fair depiction of the chemical space. We propose to iteratively increase the satellites number by a factor of 5% relative to the whole database, and stop when the new and the prior chemical space correlate highly. This Research Note represents a first exploratory step, prior to the full application of this method for several datasets.
Affiliation(s)
- J Jesús Naveja
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City, 04510, Mexico; PECEM, Faculty of Medicine, Universidad Nacional Autónoma de México, Mexico City, 04510, Mexico
| | - José L Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City, 04510, Mexico
| |
|