1
Thirunavukkarasu MK, Veerappapillai S, Karuppasamy R. Sequential virtual screening collaborated with machine-learning strategies for the discovery of precise medicine against non-small cell lung cancer. J Biomol Struct Dyn 2024; 42:615-628. [PMID: 36995235 DOI: 10.1080/07391102.2023.2194994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 03/17/2023] [Indexed: 03/31/2023]
Abstract
Dysregulation of MAPK pathway receptors is crucial in causing uncontrolled cell proliferation in many cancer types, including non-small cell lung cancer. Because of the complications in targeting the upstream components, MEK is an appealing target for diminishing the activity of this pathway. Hence, we aimed to discover potent MEK inhibitors by integrating virtual screening and machine learning-based strategies. Preliminary screening was conducted on 11,808 compounds using the cavity-based pharmacophore model AADDRRR. Further, seven ML models were assessed for predicting MEK-active compounds using six molecular representations. The LGB model with morgan2 fingerprints surpassed the other models, achieving 0.92 accuracy and an MCC of 0.83 on the test set, and 0.85 accuracy and an MCC of 0.70 on the external set. The binding ability of the screened hits was then examined using Glide XP docking and Prime MM/GBSA calculations. We also used three ML-based scoring functions to predict various biological properties of the compounds. Two hit compounds, DB06920 and DB08010, showed excellent binding to MEK with acceptable toxicity properties. Finally, 200 ns of MD simulation combined with MM-GBSA/PBSA calculations indicated that DB06920 may adopt stable binding conformations with MEK, so it can be carried forward to experimental studies in the near future. Communicated by Ramaswamy H. Sarma.
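For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function names and the label lists are hypothetical and purely illustrative; they are not data or code from the paper.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Made-up labels for illustration only (not results from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy when classes may be imbalanced.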
Affiliation(s)
- Muthu Kumar Thirunavukkarasu
  - Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Shanthi Veerappapillai
  - Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Ramanathan Karuppasamy
  - Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
2
Zou J, Zhao L, Shi S. Generation of focused drug molecule library using recurrent neural network. J Mol Model 2023; 29:361. [PMID: 37932607 DOI: 10.1007/s00894-023-05772-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 10/26/2023] [Indexed: 11/08/2023]
Abstract
CONTEXT: With the wide application of deep learning in drug research and development, de novo molecular design methods based on recurrent neural networks (RNNs) have strong advantages in drug molecule generation. An RNN model can be used to learn the internal chemical structure of molecules, a task similar to natural language processing. Although techniques for generating target-specific molecular libraries with RNN models are mature, research on drug design and screening continues, and de novo design methods that generate larger quantities of valid compounds are still needed. METHODS: In this study, a molecular generation model based on RNN was designed, which abandoned the traditional stacked-RNN approach and introduced the nested long short-term memory network structure. To enrich the library of focused molecules for specific targets, we fine-tuned the model using molecules active against novel coronavirus pneumonia and screened the generated molecules using machine learning models. Following rigorous screening, the selected molecules underwent molecular docking with the SARS-CoV-2 M-pro receptor using AutoDock2.4 to identify the top 3 potential inhibitors. Subsequently, 100-ns molecular dynamics simulations were conducted using Amber22. Molecule parameterization involved the GAFF2 force field, while the proteins were modeled using the ff19SB force field, with solvation facilitated by a truncated octahedral TIP3P solvent environment. Upon completion of the molecular dynamics simulations, the stability of the ligand-protein complexes was assessed by analysis of RMSD, H-bonds, and MM-GBSA. The results indicate that the model can complete the task of de novo drug design and that the generated molecules have the potential to be drug candidates.
Affiliation(s)
- Jinping Zou
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Long Zhao
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Shaoping Shi
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
3
Sivangi KB, Amilpur S, Dasari CM. ReGen-DTI: A novel generative drug target interaction model for predicting potential drug candidates against SARS-COV2. Comput Biol Chem 2023; 106:107927. [PMID: 37499436 DOI: 10.1016/j.compbiolchem.2023.107927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 07/12/2023] [Accepted: 07/13/2023] [Indexed: 07/29/2023]
Abstract
COVID-19 has caused massive numbers of infections and fatalities globally. In response, there has been a large-scale experimental and computational research effort to study and develop drugs. To this end, deep learning techniques, which have proven effective at exploring large molecular search spaces, are used to generate potential novel drug candidates. Recent advances in reinforcement learning in conjunction with generative techniques have proven promising for drug discovery. In this regard, we propose a generative drug discovery approach using reinforcement techniques for sampling novel molecules that bind to the main protease of SARS-COV2. The generative method achieved significant validity scores for the generated novel molecules and captured the underlying features of the training molecules. Further, the model is fine-tuned on existing repurposed molecules that are active towards specific target proteins, selected on the basis of similarity metrics. After fine-tuning, the model generated 92.71% valid, 93.55% unique, and 100% novel molecules. Unlike previous methods, which depend on docking procedures, we propose a novel deep learning-based drug target interaction (DTI) model to estimate the binding affinity between candidate molecules and the target protease sequence. Finally, the binding affinity of the generated molecules against the 3CLPro main protease is predicted using the proposed DTI model. Most of the generated molecules showed binding affinity scores below 100 nM (lower is better), which are significantly better than those of existing commercial drugs, including remdesivir.
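The validity, uniqueness, and novelty percentages quoted above are standard summary metrics for generative models. The sketch below uses commonly used definitions, which are an assumption here because the abstract does not state the exact formulas. The `toy_is_valid` check is a hypothetical stand-in; in practice validity is usually decided by parsing each SMILES string with a cheminformatics toolkit, and strings are canonicalized before being compared.

```python
def generation_metrics(generated, training, is_valid):
    """Return (validity, uniqueness, novelty) for a list of generated strings.

    Assumed definitions:
      validity   = valid molecules / all generated molecules
      uniqueness = distinct valid molecules / valid molecules
      novelty    = distinct valid molecules absent from training / distinct valid
    """
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training)
    validity = len(valid) / len(generated) if generated else 0.0
    uniqueness = len(unique) / len(valid) if valid else 0.0
    novelty = len(novel) / len(unique) if unique else 0.0
    return validity, uniqueness, novelty

def toy_is_valid(smiles):
    """Hypothetical placeholder: non-empty string with balanced parentheses."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0

# Made-up strings for illustration only (not molecules from the study).
generated = ["CCO", "CCO", "CC(=O)O", "CC(C", "c1ccccc1"]
training = ["CCO"]
validity, uniqueness, novelty = generation_metrics(generated, training, toy_is_valid)
print(validity, uniqueness, novelty)
```

Comparing raw strings, as done here, treats two different SMILES spellings of the same molecule as distinct, so a real evaluation would canonicalize first.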
Affiliation(s)
- Kaushik Bhargav Sivangi
  - Indian Institute of Information Technology, Sri City, Chittoor, 517646, Andhra Pradesh, India
- Santhosh Amilpur
  - Indian Institute of Information Technology, Sri City, Chittoor, 517646, Andhra Pradesh, India
- Chandra Mohan Dasari
  - Indian Institute of Information Technology, Sri City, Chittoor, 517646, Andhra Pradesh, India
4
Elkashlan M, Ahmad RM, Hajar M, Al Jasmi F, Corchado JM, Nasarudin NA, Mohamad MS. A review of SARS-CoV-2 drug repurposing: databases and machine learning models. Front Pharmacol 2023; 14:1182465. [PMID: 37601065 PMCID: PMC10436567 DOI: 10.3389/fphar.2023.1182465] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/21/2023] [Accepted: 07/06/2023] [Indexed: 08/22/2023] Open
Abstract
The emergence of Severe Acute Respiratory Syndrome Corona Virus 2 (SARS-CoV-2) posed a serious worldwide threat and emphasized the urgency of finding efficient solutions to combat the spread of the virus. Drug repurposing has attracted more attention than traditional approaches because of its potential for time- and cost-effective discovery of new applications for existing FDA-approved drugs. Given the reported success of machine learning (ML) in virtual drug screening, it is a promising approach for identifying potential SARS-CoV-2 inhibitors. The implementation of ML in drug repurposing requires reliable digital databases from which the data of interest can be extracted. Numerous databases archive research data from studies so that it can be used for different purposes. This article reviews two aspects: the frequently used databases in ML-based drug repurposing studies for SARS-CoV-2, and the recent ML models that have been developed for the prospective prediction of potential inhibitors against the new virus. Both types of ML models, deep learning models and conventional ML models, are reviewed in terms of introduction, methodology, and their recent applications in the prospective prediction of SARS-CoV-2 inhibitors. Furthermore, the features and limitations of the databases are provided to guide researchers in choosing suitable databases according to their research interests.
Affiliation(s)
- Marim Elkashlan
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Rahaf M Ahmad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Malak Hajar
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Fatma Al Jasmi
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Juan Manuel Corchado
  - Departamento de Informática y Automática, Facultad de Ciencias, Grupo de Investigación BISITE, Instituto de Investigación Biomédica de Salamanca, University of Salamanca, Salamanca, Spain
- Nurul Athirah Nasarudin
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Mohd Saberi Mohamad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
5
Joshi PB. Navigating with chemometrics and machine learning in chemistry. Artif Intell Rev 2023; 56:1-26. [PMID: 36714038 PMCID: PMC9870782 DOI: 10.1007/s10462-023-10391-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/09/2023] [Indexed: 01/25/2023]
Abstract
Chemometrics and machine learning are artificial intelligence-based methods stirring a transformative change in chemistry. Organic synthesis, drug discovery and analytical techniques are incorporating machine learning techniques at an accelerated pace. However, machine-assisted chemistry faces challenges in solving critical problems in chemistry because of complex relationships in data sets. Even with increasing publishing volumes on machine learning, its application in areas of chemistry is not a straightforward endeavour. A particular concern in applying machine learning in chemistry is data availability and reproducibility. The present review article discusses the various chemometric methods, expert systems, and machine learning techniques developed for solving problems of organic synthesis and drug discovery, with selected examples. Further, a concise discussion of chemometrics and ML deployed in analytical techniques such as spectroscopy, microscopy and chromatography is presented. Finally, the review reflects on the challenges, opportunities and future perspectives of machine learning and automation in chemistry. The review concludes by pondering some tough questions on applying machine learning and the possibility of navigating with it in the different terrains of chemistry.
Affiliation(s)
- Payal B. Joshi
  - Operations and Method Development, Shefali Research Laboratories, Ambernath (East), Thane, Maharashtra 421501 India
6
Jodłowski PJ, Dymek K, Kurowski G, Jaśkowska J, Bury W, Pander M, Wnorowska S, Targowska-Duda K, Piskorz W, Wnorowski A, Boguszewska-Czubara A. Zirconium-Based Metal-Organic Frameworks as Acriflavine Cargos in the Battle against Coronaviruses: A Theoretical and Experimental Approach. ACS APPLIED MATERIALS & INTERFACES 2022; 14:28615-28627. [PMID: 35700479 PMCID: PMC9212192 DOI: 10.1021/acsami.2c06420] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/12/2022] [Accepted: 06/01/2022] [Indexed: 06/15/2023]
Abstract
In this study, we present a complementary approach for obtaining an effective drug against SARS-CoV-2, based on acriflavine (ACF) and zirconium-based metal-organic frameworks (MOFs). The experimental results showed that acriflavine inhibits the interaction between the viral receptor-binding domain (RBD) of the spike protein and the angiotensin converting enzyme-2 (ACE2) host receptor driving viral cell entry. The prepared ACF@MOF composites exhibited low (MOF-808 and UiO-66) and high (UiO-67 and NU-1000) ACF loadings. The drug release profiles of the prepared composites showed different release kinetics depending on the local pore environment. Long-term ACF release at the effective antiviral ACF concentration was observed for all studied ACF@MOF composites. The density functional theory (DFT) calculations allowed us to determine that π-π stacking together with electrostatic interaction plays an important role in acriflavine adsorption on and release from ACF@MOF composites. The molecular docking results have shown that acriflavine interacts with several possible binding sites within the RBD and with a binding site at the RBD/ACE2 interface. The cytotoxicity and ecotoxicity results have confirmed that the prepared ACF@MOF composites may be considered potentially safe for living organisms. The complementary experimental and theoretical results presented in this study have confirmed that the ACF@MOF composites may be considered a potential candidate for COVID-19 treatment, which makes them good candidates for clinical trials.
Affiliation(s)
- Przemysław J. Jodłowski
  - Faculty of Chemical Engineering and Technology, Cracow University of Technology, 24 Warszawska, 31-155 Kraków, Poland
- Klaudia Dymek
  - Faculty of Chemical Engineering and Technology, Cracow University of Technology, 24 Warszawska, 31-155 Kraków, Poland
- Grzegorz Kurowski
  - Faculty of Chemical Engineering and Technology, Cracow University of Technology, 24 Warszawska, 31-155 Kraków, Poland
- Jolanta Jaśkowska
  - Faculty of Chemical Engineering and Technology, Cracow University of Technology, 24 Warszawska, 31-155 Kraków, Poland
- Wojciech Bury
  - Faculty of Chemistry, University of Wrocław, 14 F. Joliot-Curie, 50-383 Wrocław, Poland
- Marzena Pander
  - Faculty of Chemistry, University of Wrocław, 14 F. Joliot-Curie, 50-383 Wrocław, Poland
- Sylwia Wnorowska
  - Department of Medical Chemistry, Medical University of Lublin, 4A Chodzki, 20-093 Lublin, Poland
- Witold Piskorz
  - Faculty of Chemistry, Jagiellonian University, Gronostajowa 2, 30-387 Kraków, Poland
- Artur Wnorowski
  - Department of Biopharmacy, Medical University of Lublin, 4A Chodzki, 20-093 Lublin, Poland
- Anna Boguszewska-Czubara
  - Department of Medical Chemistry, Medical University of Lublin, 4A Chodzki, 20-093 Lublin, Poland
7
Heidari A, Jafari Navimipour N, Unal M, Toumaj S. Machine learning applications for COVID-19 outbreak management. Neural Comput Appl 2022; 34:15313-15348. [PMID: 35702664 PMCID: PMC9186489 DOI: 10.1007/s00521-022-07424-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 11/06/2021] [Accepted: 05/10/2022] [Indexed: 12/29/2022]
Abstract
Recently, the COVID-19 epidemic has resulted in millions of deaths and has impacted practically every area of human life. Several machine learning (ML) approaches are employed in the medical field in many applications, including detecting and monitoring patients, notably in COVID-19 management. Different medical imaging systems, such as computed tomography (CT) and X-ray, offer ML an excellent platform for combating the pandemic. Because of this need, a significant amount of research has been carried out; thus, in this work, we conducted a systematic literature review (SLR) to cover all aspects of the outcomes of related papers. Imaging methods, survival analysis, forecasting, economic and geographical issues, monitoring methods, medication development, and hybrid apps are the seven key application areas of ML in the COVID-19 pandemic. Convolutional neural networks (CNNs), long short-term memory networks (LSTM), recurrent neural networks (RNNs), generative adversarial networks (GANs), autoencoders, random forest, and other ML techniques are frequently used in such scenarios. Next, cutting-edge applications of ML techniques to pandemic medical issues are discussed. Various problems and challenges linked with ML applications for this pandemic are reviewed. Additional research is expected in the future on limiting the spread of the disease and on catastrophe management. According to the data, most papers are evaluated mainly on characteristics such as flexibility and accuracy, while other factors such as safety are overlooked. Also, Keras was the most often used library in the studies reviewed, appearing in 24.4 percent of them. Furthermore, medical imaging systems are employed for diagnostic purposes in 20.4 percent of applications.
Affiliation(s)
- Arash Heidari
  - Department of Computer Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
  - Department of Computer Engineering, Shabestar Branch, Islamic Azad University, Shabestar, Iran
- Mehmet Unal
  - Department of Computer Engineering, Nisantasi University, Istanbul, Turkey
- Shiva Toumaj
  - Urmia University of Medical Sciences, Urmia, Iran
8
Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Comput Biol Med 2022; 145:105403. [PMID: 35339849 DOI: 10.1016/j.compbiomed.2022.105403] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 01/14/2022] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 02/08/2023]
Abstract
Recent research on artificial intelligence indicates that machine learning algorithms can auto-generate novel drug-like molecules. Generative models have revolutionized de novo drug discovery, rendering the explorative process more efficient. Several model frameworks and input formats have been proposed to enhance the performance of intelligent algorithms in generative molecular design. This systematic literature review of experimental articles and reviews from the last five years discusses machine learning models, challenges associated with computational molecule design along with proposed solutions, and molecular encoding methods. A query-based search of the PubMed, ScienceDirect, Springer, Wiley Online Library, arXiv, MDPI, bioRxiv, and IEEE Xplore databases yielded 87 studies. Twelve additional studies were identified via citation searching. Among the articles in which machine learning was implemented, six prominent algorithms were identified: long short-term memory recurrent neural networks (LSTM-RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), adversarial autoencoders (AAEs), evolutionary algorithms, and gated recurrent unit recurrent neural networks (GRU-RNNs). Furthermore, eight central challenges were designated: homogeneity of generated molecular libraries, deficient synthesizability, limited assay data, model interpretability, incapacity for multi-property optimization, incomparability, restricted molecule size, and uncertainty in model evaluation. Molecules were encoded either as strings, which were occasionally augmented using randomization, as 2D graphs, or as 3D graphs. Statistical analysis and visualization are performed to illustrate how approaches to machine learning in de novo drug design have evolved over the past five years. Finally, future opportunities and reservations are discussed.
9
Amilpur S, Bhukya R. A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction. J Bioinform Comput Biol 2022; 20:2250005. [PMID: 35264081 DOI: 10.1142/s0219720022500056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Enhancers are short regulatory DNA fragments that are bound by proteins called activators. They are free-bound and distant elements, which play a vital role in controlling gene expression. It is challenging to identify enhancers and their strength because of their dynamic nature. Although some machine learning methods exist to accelerate the identification process, their prediction accuracy and efficiency need further improvement. In this regard, we propose a two-layer prediction model with an enhanced feature extraction strategy that combines features from an improved position-specific amino acid propensity (PSTKNC) method with Enhanced Nucleic Acid Composition (ENAC) and Composition of k-spaced Nucleic Acid Pairs (CKSNAP). The feature sets from all three feature extraction approaches were concatenated and then passed through a simple artificial neural network (ANN) to identify enhancers in the first layer and their strength in the second layer. Experiments are conducted on a benchmark chromatin dataset covering nine cell lines. A 10-fold cross-validation method is employed to evaluate the model's performance. The results show that the proposed model performs outstandingly, with an accuracy of 94.50% and a Matthews correlation coefficient (MCC) of 0.8903 in predicting enhancers, and also does fairly well on an independent test when compared with existing methods.
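As an illustration of one of the three descriptors named in this abstract, the sketch below computes a CKSNAP-style feature vector: for each gap k, the normalized frequency of the 16 nucleotide pairs whose members are separated by k positions. The function name, the default `kmax`, and the normalization by the number of counted pairs are assumptions made for this example; the paper's exact settings are not given in the abstract.

```python
from itertools import product

def cksnap(seq, kmax=2):
    """Composition of k-spaced nucleic acid pairs for gaps k = 0..kmax.

    Returns 16 * (kmax + 1) values; each block of 16 holds the frequencies
    of the pairs AA, AC, ..., TT found at positions (i, i + k + 1).
    """
    seq = seq.upper()
    pairs = ["".join(p) for p in product("ACGT", repeat=2)]
    features = []
    for k in range(kmax + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs containing ambiguous bases such as N
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features

# Toy sequence for illustration only.
features = cksnap("ACGT", kmax=1)
print(len(features))
```

Concatenating such a vector with other descriptors yields a fixed-length input regardless of sequence length, which is what a simple feed-forward network needs.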
Affiliation(s)
- Santhosh Amilpur
  - Computer Science and Engineering, National Institute of Technology Warangal, Warangal Telangana 506004, India
- Raju Bhukya
  - Computer Science and Engineering, National Institute of Technology Warangal, Warangal Telangana 506004, India
10
Zia SR. Identification of Potential Ligands of the Main Protease of Coronavirus SARS-CoV-2 (2019-nCoV) Using Multimodal Generative Neural-Networks. FRENCH-UKRAINIAN JOURNAL OF CHEMISTRY 2022. [DOI: 10.17721/fujcv10i1p30-47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
The recent outbreak of coronavirus disease 2019 (COVID-19) poses a global threat to the human population. The pandemic, caused by the novel coronavirus (2019-nCoV), also called severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), first emerged in Wuhan city, Hubei province, China, in December 2019. Rapid human-to-human transmission has spread the contagion worldwide, affecting 244,385,444 (244.4 million) people and causing 4,961,489 (5 million) fatalities as of 27 October 2021. As of the same date, 6,697,607,393 (6.7 billion) vaccine doses had been administered for the prevention of COVID-19 infections. Even so, the emergence of various variants has made pandemic control challenging, and this critical situation calls for major efforts to find new potent drug candidates and effective therapeutic approaches against the virulent respiratory disease COVID-19. In the respiratory morbidities of COVID-19, a functionally crucial drug target for antiviral treatment could be the main protease/3-chymotrypsin-like protease (Mpro/3CLpro) enzyme, which is primarily involved in viral maturation and replication. In view of this, in the current study I have designed a library of small molecules against the main protease (Mpro) of coronavirus SARS-CoV-2 (2019-nCoV) using multimodal generative neural networks. Scaffold-based molecular docking of the series of compounds at the active site of the protein was performed; the binding poses of the molecules were evaluated, and protein-ligand interaction studies followed by binding affinity calculations validated the findings. I have identified a number of promising small lead compounds that could serve as potential inhibitors of the main protease (Mpro) enzyme of coronavirus SARS-CoV-2 (2019-nCoV). This study would serve as a step forward in the development of effective antiviral therapeutic agents against COVID-19.