1
Djoumbou-Feunang Y, Wilmot J, Kinney J, Chanda P, Yu P, Sader A, Sharifi M, Smith S, Ou J, Hu J, Shipp E, Tomandl D, Kumpatla SP. Cheminformatics and artificial intelligence for accelerating agrochemical discovery. Front Chem 2023; 11:1292027. [PMID: 38093816; PMCID: PMC10716421; DOI: 10.3389/fchem.2023.1292027] [Received: 09/10/2023] [Accepted: 11/09/2023] [Indexed: 10/17/2024]
Abstract
The global cost-benefit analysis of pesticide use during the last 30 years has been characterized by a significant increase during the period from 1990 to 2007 followed by a decline. This observation can be attributed to several factors including, but not limited to, pest resistance, lack of novelty with respect to modes of action or classes of chemistry, and regulatory action. Due to current and projected increases of the global population, it is evident that the demand for food, and consequently, the usage of pesticides to improve yields will increase. Addressing these challenges and needs while promoting new crop protection agents through an increasingly stringent regulatory landscape requires the development and integration of infrastructures for innovative, cost- and time-effective discovery and development of novel and sustainable molecules. Significant advances in artificial intelligence (AI) and cheminformatics over the last two decades have improved the decision-making power of research scientists in the discovery of bioactive molecules. AI- and cheminformatics-driven molecule discovery offers the opportunity of moving experiments from the greenhouse to a virtual environment where thousands to billions of molecules can be investigated at a rapid pace, providing unbiased hypotheses for lead generation and optimization, and effective suggestions for compound synthesis and testing. To date, this is illustrated to a far lesser extent in the publicly available agrochemical research literature compared to drug discovery. In this review, we provide an overview of the crop protection discovery pipeline and how traditional, cheminformatics, and AI technologies can help to address the needs and challenges of agrochemical discovery towards rapidly developing novel and more sustainable products.
Affiliation(s)
- Jeremy Wilmot
  - Corteva Agriscience, Crop Protection Discovery and Development, Indianapolis, IN, United States
- John Kinney
  - Corteva Agriscience, Farming Solutions and Digital, Indianapolis, IN, United States
- Pritam Chanda
  - Corteva Agriscience, Farming Solutions and Digital, Indianapolis, IN, United States
- Pulan Yu
  - Corteva Agriscience, Crop Protection Discovery and Development, Indianapolis, IN, United States
- Avery Sader
  - Corteva Agriscience, Crop Protection Discovery and Development, Indianapolis, IN, United States
- Max Sharifi
  - Corteva Agriscience, Regulatory and Stewardship, Indianapolis, IN, United States
- Scott Smith
  - Corteva Agriscience, Farming Solutions and Digital, Indianapolis, IN, United States
- Junjun Ou
  - Corteva Agriscience, Crop Protection Discovery and Development, Indianapolis, IN, United States
- Jie Hu
  - Corteva Agriscience, Farming Solutions and Digital, Indianapolis, IN, United States
- Elizabeth Shipp
  - Corteva Agriscience UK Limited, Regulation Innovation Center, Abingdon, United Kingdom
2
Tingle B, Tang KG, Castanon M, Gutierrez JJ, Khurelbaatar M, Dandarchuluun C, Moroz YS, Irwin JJ. ZINC-22: A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery. J Chem Inf Model 2023; 63:1166-1176. [PMID: 36790087; PMCID: PMC9976280; DOI: 10.1021/acs.jcim.2c01253] [Received: 10/07/2022] [Indexed: 02/16/2023]
Abstract
Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sublinearly in the number of molecules. The new library also uses data organization methods, enabling rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, cLogP values, and solvation energies, all crucial for molecular docking, which had become slow with older database organizations in previous versions of ZINC. As the libraries have continued to grow, we have been interested in finding whether molecular diversity has suffered, for instance, because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far, and chemical diversity continues to grow with database size, with a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow toward and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.
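The scaling statement in this abstract (one log unit more Bemis-Murcko scaffolds for every two log units of database size) corresponds to a power law with exponent 1/2, i.e. scaffold count growing roughly as the square root of library size. The following is a minimal sketch of that arithmetic only; the function names and the example numbers are hypothetical and are not taken from the paper or from any ZINC tooling.

```python
def scaffold_growth_factor(size_ratio: float, exponent: float = 0.5) -> float:
    """Factor by which the scaffold count grows when the library grows by size_ratio.

    The default exponent of 0.5 encodes "one log unit of scaffolds per two
    log units of database size"; it is an assumption read off the abstract.
    """
    if size_ratio <= 0:
        raise ValueError("size_ratio must be positive")
    return size_ratio ** exponent


def projected_scaffolds(known_scaffolds: float, known_size: float,
                        target_size: float, exponent: float = 0.5) -> float:
    """Extrapolate a scaffold count to a different library size (illustrative only)."""
    return known_scaffolds * scaffold_growth_factor(target_size / known_size, exponent)


if __name__ == "__main__":
    # A 100-fold (two-log) larger library implies about a 10-fold (one-log) scaffold increase.
    print(scaffold_growth_factor(100))
    # Hypothetical starting point: 1,000 scaffolds in a library of 1e6 molecules.
    print(projected_scaffolds(1_000, 1e6, 1e8))
```

Such an extrapolation is only as good as the assumed exponent, and the abstract itself notes that the trend is an observation "thus far".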
Affiliation(s)
- Benjamin I. Tingle
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Khanh G. Tang
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Mar Castanon
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- John J. Gutierrez
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Munkhzul Khurelbaatar
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Chinzorig Dandarchuluun
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Yurii S. Moroz
  - Taras Shevchenko National University of Kyïv, 60 Volodymyrska Street, Kyïv 01601, Ukraine
  - Chemspace LLC, 85 Chervonotkatska Street, Kyïv 02094, Ukraine
- John J. Irwin
  - Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
3
Data-Driven Approaches Used for Compound Library Design for the Treatment of Parkinson's Disease. Int J Mol Sci 2023; 24:ijms24021134. [PMID: 36674652; PMCID: PMC9867512; DOI: 10.3390/ijms24021134] [Received: 10/22/2022] [Revised: 12/10/2022] [Accepted: 12/19/2022] [Indexed: 01/11/2023]
Abstract
Parkinson's disease (PD) is the second most common neurodegenerative disease in older individuals worldwide. Pharmacological treatment for PD includes drugs such as monoamine oxidase B (MAO-B) inhibitors, which increase dopamine concentration in the brain. However, such drugs have adverse reactions that limit their use for extended periods; thus, the design of less toxic and more efficient compounds may be explored. In this context, cheminformatics and computational chemistry have recently contributed to developing new drugs and the search for new therapeutic targets. Therefore, through a data-driven approach, we used cheminformatic tools to find and optimize novel compounds with pharmacological activity against MAO-B for treating PD. First, we retrieved from the literature 3316 original articles published between 2015 and 2021 that experimentally tested 215 natural compounds against PD. From these compounds, we built a pharmacological network that showed rosmarinic acid, chrysin, naringenin, and cordycepin as the most connected nodes of the network. On these four compounds, we performed fingerprinting analysis and developed evolutionary libraries to obtain novel derived structures. We filtered the derived compounds through a docking test against MAO-B and obtained five derived compounds with higher affinity and lead-likeness potential. Then we evaluated their antioxidant and pharmacokinetic potential through a docking analysis (NADPH oxidase and CYP450) and physiologically-based pharmacokinetic (PBPK) modeling. Interestingly, only one compound showed dual activity (antioxidant and MAO-B inhibition) and pharmacokinetic potential to be considered a possible candidate for PD treatment and further experimental analysis.
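The abstract identifies the "most connected nodes" of a pharmacological network but does not say how connectivity was measured. One common reading is node degree, sketched below under that assumption; the function name, the edge list, and the placeholder labels (`compound_1`, `target_A`, and so on) are invented for illustration and do not come from the paper.

```python
from collections import Counter


def rank_by_degree(edges):
    """Rank nodes of an undirected network by degree (number of distinct neighbours).

    edges: iterable of (node_a, node_b) pairs; duplicate or reversed pairs count once.
    Returns a list of (node, degree) sorted by degree descending, then by name.
    """
    unique_edges = {tuple(sorted(edge)) for edge in edges}
    degree = Counter()
    for a, b in unique_edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical compound-target links; placeholder names only.
    toy_edges = [
        ("compound_1", "target_A"),
        ("compound_1", "target_B"),
        ("compound_1", "target_C"),
        ("compound_2", "target_A"),
        ("compound_2", "target_B"),
        ("compound_3", "target_C"),
        ("target_A", "compound_1"),  # reversed duplicate, ignored
    ]
    for node, deg in rank_by_degree(toy_edges):
        print(node, deg)
```

Other centrality measures (betweenness, eigenvector) could give a different ranking, so degree here is a simplifying choice rather than a claim about the authors' method.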
4
Schaub J, Zander J, Zielesny A, Steinbeck C. Scaffold Generator: a Java library implementing molecular scaffold functionalities in the Chemistry Development Kit (CDK). J Cheminform 2022; 14:79. [PMID: 36357931; PMCID: PMC9650898; DOI: 10.1186/s13321-022-00656-x] [Received: 02/24/2022] [Accepted: 10/30/2022] [Indexed: 11/12/2022]
Abstract
The concept of molecular scaffolds as defining core structures of organic molecules is utilised in many areas of chemistry and cheminformatics, e.g. drug design, chemical classification, or the analysis of high-throughput screening data. Here, we present Scaffold Generator, a comprehensive open library for the generation, handling, and display of molecular scaffolds, scaffold trees and networks. The new library is based on the Chemistry Development Kit (CDK) and highly customisable through multiple settings, e.g. five different structural framework definitions are available. For display of scaffold hierarchies, the open GraphStream Java library is utilised. Performance snapshots with natural products (NP) from the COCONUT (COlleCtion of Open Natural prodUcTs) database and drug molecules from DrugBank are reported. The generation of a scaffold network from more than 450,000 NP can be achieved within a single day.
Affiliation(s)
- Jonas Schaub
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID: grid.9613.d; ISNI: 0000 0001 1939 2794)
- Julian Zander
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany (GRID: grid.454254.6; ISNI: 0000 0004 0647 4362)
- Achim Zielesny
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany (GRID: grid.454254.6; ISNI: 0000 0004 0647 4362)
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID: grid.9613.d; ISNI: 0000 0001 1939 2794)
5
Flores-Padilla EA, Juárez-Mercado KE, Naveja JJ, Kim TD, Alain Miranda-Quintana R, Medina-Franco JL. Chemoinformatic Characterization of Synthetic Screening Libraries Focused on Epigenetic Targets. Mol Inform 2021; 41:e2100285. [PMID: 34931466; DOI: 10.1002/minf.202100285] [Received: 11/25/2021] [Accepted: 12/08/2021] [Indexed: 02/03/2023]
Abstract
The importance of epigenetic drug and probe discovery is on the rise. This is not only paramount to identify and develop therapeutic treatments associated with epigenetic processes but also to understand the underlying epigenetic mechanisms involved in biological processes. To this end, chemical vendors have been developing synthetic compound libraries focused on epigenetic targets to increase the probabilities of identifying promising starting points for drug or probe candidates. However, the chemical contents of these data sets, the distribution of their physicochemical properties, and diversity remain unknown. To fill this gap and make this information available to the scientific community, we report a comprehensive analysis of eleven libraries focused on epigenetic targets containing more than 50,000 compounds. We used well-validated chemoinformatics approaches to characterize these sets, including novel methods such as automated detection of analog series and visual representations of the chemical space based on Constellation Plots and Chemical Library Networks. This work will guide the efforts of experimental groups working on high-throughput and medium-throughput screening of epigenetic-focused libraries. The outcome of this work can also be used as a reference to design and describe novel focused epigenetic libraries.
Affiliation(s)
- E Alexis Flores-Padilla
  - DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City, 04510, Mexico
- K Eurídice Juárez-Mercado
  - DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City, 04510, Mexico
- José J Naveja
  - Instituto de Quimica, National Autonomous University of Mexico, Mexico City, 04510, Mexico
- Taewon D Kim
  - Department of Chemistry, University of Florida, Gainesville, Florida, 32611, United States
- Ramón Alain Miranda-Quintana
  - Department of Chemistry, University of Florida, Gainesville, Florida, 32611, United States
  - Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, United States
- José L Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City, 04510, Mexico