1
|
Zhang Y, Zhu Y, Bao X, Dai Z, Shen Q, Wang L, Xue Y. Mining Bovine Milk Proteins for DPP-4 Inhibitory Peptides Using Machine Learning and Virtual Proteolysis. RESEARCH (WASHINGTON, D.C.) 2024; 7:0391. [PMID: 38887277 PMCID: PMC11182572 DOI: 10.34133/research.0391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 04/26/2024] [Indexed: 06/20/2024]
Abstract
Dipeptidyl peptidase-IV (DPP-4) enzyme inhibitors are a promising category of diabetes medications. Bioactive peptides, particularly those derived from bovine milk proteins, play crucial roles in inhibiting the DPP-4 enzyme. This study describes a comprehensive strategy for DPP-4 inhibitory peptide discovery and validation that combines machine learning and virtual proteolysis techniques. Five machine learning models (GBDT, XGBoost, LightGBM, CatBoost, and RF) were trained. Notably, LightGBM demonstrated superior performance with an AUC value of 0.92 ± 0.01. Subsequently, LightGBM was employed to forecast the DPP-4 inhibitory potential of peptides generated through virtual proteolysis of milk proteins. Through a series of in silico screening steps and in vitro experiments, GPVRGPF and HPHPHL were found to exhibit good DPP-4 inhibitory activity. Molecular docking and molecular dynamics simulations further confirmed the inhibitory mechanisms of these peptides. By retracing the virtual proteolysis steps, it was found that GPVRGPF can be obtained from β-casein through enzymatic hydrolysis by chymotrypsin, while HPHPHL can be obtained from κ-casein through enzymatic hydrolysis by stem bromelain or papain. In summary, the integration of machine learning and virtual proteolysis techniques can aid in the preliminary determination of key hydrolysis parameters and facilitate the efficient screening of bioactive peptides.
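The virtual proteolysis step mentioned in this abstract can be illustrated with a minimal sketch. The cleavage rule below (cut after F, Y, or W unless the next residue is P) is a simplified, chymotrypsin-like assumption, and the input sequence is a made-up toy string, not a real casein sequence; the function name `virtual_digest` and the length window are hypothetical and are not taken from the paper.

```python
import re

# Simplified, illustrative rule (an assumption, not the authors' protocol):
# cleave on the C-terminal side of F, Y, or W unless the next residue is P.
CHYMOTRYPSIN_LIKE = re.compile(r"(?<=[FYW])(?!P)")


def virtual_digest(sequence, min_len=2, max_len=10):
    """Split a sequence at every rule match and keep peptides within a length window."""
    # Zero-width splits need Python 3.7+; empty trailing fragments are dropped.
    fragments = [f for f in CHYMOTRYPSIN_LIKE.split(sequence) if f]
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence, invented for illustration only.
peptides = virtual_digest("AKFGPVRGPFQWPLY")
print(peptides)
```

In a screening workflow of the kind described, the resulting peptide list would then be passed to a trained classifier; that step is omitted here because the model and features are specific to the study.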
Affiliation(s)
- Yiyun Zhang
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
| | - Yiqing Zhu
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
| | - Xin Bao
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
| | - Zijian Dai
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
| | - Qun Shen
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
- National Center of Technology Innovation (Deep Processing of Highland Barley) in Food Industry,
China Agricultural University, Haidian District, Beijing 100083, P.R. China
| | - Liyang Wang
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
- School of Clinical Medicine,
Tsinghua University, Beijing 100084, P.R. China
| | - Yong Xue
- National Engineering and Technology Research Center for Fruits and Vegetables, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, P.R. China
- National Center of Technology Innovation (Deep Processing of Highland Barley) in Food Industry,
China Agricultural University, Haidian District, Beijing 100083, P.R. China
| |
|
2
|
Dey PK, Dutta R, Ray M, Jakkula P, Banerjee S, Qureshi IA, Gayen S, Amin SA. Fragment-based QSAR study to explore the structural requirements of DPP-4 inhibitors: a stepping stone towards better type 2 diabetes mellitus management. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2024; 35:483-504. [PMID: 38904353 DOI: 10.1080/1062936x.2024.2366886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Accepted: 06/05/2024] [Indexed: 06/22/2024]
Abstract
Dipeptidyl peptidase-4 (DPP-4) inhibitors belong to a prominent group of pharmaceutical agents that are used in the management of type 2 diabetes mellitus (T2DM). They exert their antidiabetic effects by preventing the degradation of incretin hormones such as glucagon-like peptide-1 and glucose-dependent insulinotropic polypeptide, which play a pivotal role in the regulation of blood glucose homoeostasis. DPP-4 inhibitors have emerged as an important class of oral antidiabetic drugs for the treatment of T2DM. Surprisingly, only a few 2D-QSAR studies have been reported on DPP-4 inhibitors. Here, fragment-based QSAR approaches (Laplacian-modified Bayesian modelling and recursive partitioning (RP)) have been applied to a dataset of 108 DPP-4 inhibitors to achieve a deeper understanding of the relationship between their molecular structures and activity. The Bayesian analysis demonstrated satisfactory ROC values for the training as well as the test sets. Meanwhile, the RP analysis resulted in decision tree 3 with 2 leaves (Tree 3: 2 leaves). The present study is an effort to gain insight into the pivotal fragments modulating DPP-4 inhibition.
Affiliation(s)
- P K Dey
- Department of Pharmaceutical Technology, JIS University, Kolkata, West Bengal, India
| | - R Dutta
- Department of Pharmaceutical Technology, JIS University, Kolkata, West Bengal, India
| | - M Ray
- Department of Pharmaceutical Technology, JIS University, Kolkata, West Bengal, India
| | - P Jakkula
- Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, India
| | - S Banerjee
- Department of Pharmaceutical Technology, JIS University, Kolkata, West Bengal, India
| | - I A Qureshi
- Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, India
| | - S Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
| | - S A Amin
- Department of Pharmaceutical Technology, JIS University, Kolkata, West Bengal, India
| |
|
3
|
Wu Y, Min H, Li M, Shi Y, Ma A, Han Y, Gan Y, Guo X, Sun X. Effect of Artificial Intelligence-based Health Education Accurately Linking System (AI-HEALS) for Type 2 diabetes self-management: protocol for a mixed-methods study. BMC Public Health 2023; 23:1325. [PMID: 37434126 DOI: 10.1186/s12889-023-16066-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 06/06/2023] [Indexed: 07/13/2023] Open
Abstract
BACKGROUND Patients with type 2 diabetes (T2DM) have an increasing need for personalized and precise management as medical technology advances. Artificial intelligence (AI) technologies on mobile devices are gradually being developed in a variety of healthcare fields. As an AI field, the knowledge graph (KG) is being developed to extract and store structured knowledge from massive data sets. It has great prospects for T2DM medical information retrieval, clinical decision-making, and individual intelligent question answering (QA), but has yet to be thoroughly researched in T2DM intervention. Therefore, we designed an artificial intelligence-based health education accurately linking system (AI-HEALS) to evaluate whether the AI-HEALS-based intervention could help patients with T2DM improve their self-management abilities and blood glucose control in primary healthcare. METHODS This is a nested mixed-method study that includes a community-based cluster-randomized control trial and personal in-depth interviews. Individuals with T2DM between the ages of 18 and 75 will be recruited from 40-45 community health centers in Beijing, China. Participants will receive either standard diabetes primary care (SDPC) (control, 3 months) or SDPC plus the AI-HEALS online health education program (intervention, 3 months). AI-HEALS runs on the WeChat service platform and includes a KBQA, a system for recording and monitoring physiological indicators and lifestyle, medication and blood glucose monitoring reminders, and automated, personalized message sending. Data on sociodemography, medical examination, blood glucose, and self-management behavior will be collected at baseline, as well as 1, 3, 6, 12, and 18 months later. The primary outcome is the reduction in HbA1c levels. Secondary outcomes include changes in self-management behavior, social cognition, psychology, T2DM skills, and health literacy. Furthermore, the cost-effectiveness of the AI-HEALS-based intervention will be evaluated.
DISCUSSION The KBQA system is an innovative and cost-effective technology for health education and promotion for T2DM patients, but it is not yet widely used in T2DM interventions. This trial will provide evidence on the efficacy of AI and mHealth-based personalized interventions in primary care for improving T2DM outcomes and self-management behaviors. TRIAL REGISTRATION Biomedical Ethics Committee of Peking University: IRB00001052-22,058, 2022/06/06; Clinical Trials: ChiCTR2300068952, 02/03/2023.
Affiliation(s)
- Yibo Wu
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
| | - Hewei Min
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
| | - Mingzi Li
- School of Nursing, Peking University, Beijing, China
| | - Yuhui Shi
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
| | - Aijuan Ma
- Beijing Center for Disease Control and Prevention, Beijing, China
| | - Yumei Han
- Beijing Medical Examination Center, Beijing, China
| | - Yadi Gan
- Daxing District Center for Disease Control and Prevention of Beijing, Beijing, China
| | - Xiaohui Guo
- Peking University First Hospital, Beijing, China
| | - Xinying Sun
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China.
| |
|
4
|
Khoury P, Srinivasan R, Kakumanu S, Ochoa S, Keswani A, Sparks R, Rider NL. A Framework for Augmented Intelligence in Allergy and Immunology Practice and Research—A Work Group Report of the AAAAI Health Informatics, Technology, and Education Committee. THE JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY: IN PRACTICE 2022; 10:1178-1188. [PMID: 35300959 PMCID: PMC9205719 DOI: 10.1016/j.jaip.2022.01.047] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/16/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 10/18/2022]
Abstract
Artificial and augmented intelligence (AI) and machine learning (ML) methods are expanding into the health care space. Big data are increasingly used in patient care applications, diagnostics, and treatment decisions in allergy and immunology. How these technologies will be evaluated, approved, and assessed for their impact is an important consideration for researchers and practitioners alike. With the potential of ML, deep learning, natural language processing, and other assistive methods to redefine health care usage, a scaffold for the impact of AI technology on research and patient care in allergy and immunology is needed. An American Academy of Asthma Allergy and Immunology Health Information Technology and Education subcommittee workgroup was convened to perform a scoping review of AI within health care as well as the specialty of allergy and immunology to address impacts on allergy and immunology practice and research as well as potential challenges including education, AI governance, ethical and equity considerations, and potential opportunities for the specialty. There are numerous potential clinical applications of AI in allergy and immunology that range from disease diagnosis to multidimensional data reduction in electronic health records or immunologic datasets. For appropriate application and interpretation of AI, specialists should be involved in the design, validation, and implementation of AI in allergy and immunology. Challenges include incorporation of data science and bioinformatics into training of future allergists-immunologists.
|
5
|
Huang W, Zhang L, Li Z. Advances in computer-aided drug design for type 2 diabetes. Expert Opin Drug Discov 2022; 17:461-472. [PMID: 35254188 DOI: 10.1080/17460441.2022.2047644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
INTRODUCTION The number of diabetic patients is increasing, posing a heavy social and economic burden worldwide. Traditional drug development technology is time-consuming and costly, and the emergence of computer-aided drug design (CADD) has changed this situation. This study reviews the applications of CADD in diabetic drug design. AREAS COVERED In this article, the authors focus on advances in CADD for diabetic drug design by elaborating on drug discovery against targets including peroxisome proliferator-activated receptor (PPAR), G protein-coupled receptor 40 (GPR40), dipeptidyl peptidase-IV (DPP-IV), protein tyrosine phosphatase 1B (PTP1B), sodium-dependent glucose transporter 2 (SGLT-2), and glucokinase (GK). Some of the drug discovery efforts against these targets are related to CADD strategies. EXPERT OPINION There is no doubt that CADD has contributed to the discovery of novel anti-diabetic agents. However, there are still many limitations and challenges, such as the lack of co-crystal complexes and the treatment of dynamic simulations, water, and metal ions. In the near future, artificial intelligence (AI) may be a promising strategy to accelerate drug discovery and reduce costs by identifying candidates. Moreover, AlphaFold, a deep learning model that predicts the 3D structure of proteins, represents a considerable advancement in the structural prediction of proteins, especially in the absence of homologous templates for protein structures.
Affiliation(s)
- Wanqiu Huang
- School of Pharmacy, Guangdong Pharmaceutical University, Guangzhou, PR China.,Key Laboratory of New Drug Discovery and Evaluation, Guangdong Pharmaceutical University, Guangzhou, PR China.,Guangzhou Key Laboratory of Construction and Application of New Drug Screening Model Systems, Guangdong Pharmaceutical University, Guangzhou, PR China
| | - Luyong Zhang
- School of Pharmacy, Guangdong Pharmaceutical University, Guangzhou, PR China.,Key Laboratory of New Drug Discovery and Evaluation, Guangdong Pharmaceutical University, Guangzhou, PR China.,Guangzhou Key Laboratory of Construction and Application of New Drug Screening Model Systems, Guangdong Pharmaceutical University, Guangzhou, PR China.,Jiangsu Key Laboratory of Drug Screening, China Pharmaceutical University, Nanjing, PR China
| | - Zheng Li
- School of Pharmacy, Guangdong Pharmaceutical University, Guangzhou, PR China.,Key Laboratory of New Drug Discovery and Evaluation, Guangdong Pharmaceutical University, Guangzhou, PR China
| |
|
6
|
Hermansyah O, Bustamam A, Yanuar A. Virtual screening of dipeptidyl peptidase-4 inhibitors using quantitative structure-activity relationship-based artificial intelligence and molecular docking of hit compounds. Comput Biol Chem 2021; 95:107597. [PMID: 34800858 DOI: 10.1016/j.compbiolchem.2021.107597] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/17/2021] [Revised: 10/25/2021] [Accepted: 10/26/2021] [Indexed: 12/31/2022]
Abstract
Dipeptidyl peptidase-4 (DPP-4) inhibitors are becoming essential drugs in the treatment of type 2 diabetes mellitus; however, some classes of these drugs exert side effects, including joint pain and pancreatitis. Studies suggest that these side effects might be related to secondary inhibition of DPP-8 and DPP-9. In this study, we identified DPP-4-inhibitor hit compounds selective against DPP-8 and DPP-9. We built a virtual screening workflow using a quantitative structure-activity relationship (QSAR) strategy based on artificial intelligence to allow faster screening of millions of molecules for the DPP-4 target relative to other screening methods. Five regression machine learning algorithms and four classification machine learning algorithms were applied to build virtual screening workflows, with the regression QSAR model using support vector regression (R2pred 0.78) and the classification QSAR model using the random forest algorithm with 92.2% accuracy. Virtual screening of > 10 million molecules yielded 2,716 hit compounds with a pIC50 value of > 7.5. Additionally, molecular docking results of several potential hit compounds for DPP-4, DPP-8, and DPP-9 identified CH0002 as showing high inhibitory potential against DPP-4 and low inhibitory potential for DPP-8 and DPP-9 enzymes. These results demonstrated the effectiveness of this technique for identifying DPP-4-inhibitor hit compounds selective for DPP-4 and against DPP-8 and DPP-9 and suggest its potential efficacy for applications to discover hit compounds of other targets.
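For readers less familiar with the pIC50 scale used as the hit threshold in this abstract, the standard definition is pIC50 = -log10(IC50 in mol/L). The short sketch below converts between the two; the function names are hypothetical and the choice of nanomolar input units is an assumption made for convenience.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


def ic50_nm_from_pic50(pic50):
    """Convert a pIC50 value back to an IC50 in nanomolar."""
    return 10 ** (-pic50) * 1e9


# A pIC50 threshold of 7.5 corresponds to an IC50 of roughly 32 nM.
print(round(ic50_nm_from_pic50(7.5), 1))
```

Under this definition, a cutoff of pIC50 > 7.5 selects compounds with a predicted IC50 below about 32 nM.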
Affiliation(s)
- Oky Hermansyah
- Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok 16424, Indonesia
| | - Alhadi Bustamam
- Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok 16424, Indonesia
| | - Arry Yanuar
- Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok 16424, Indonesia.
| |
|
7
|
Abstract
MMP2, a Zn2+-dependent metalloproteinase, is related to cancer and angiogenesis. Inhibition of this enzyme might result in a potential antimetastatic drug to leverage the anticancer drug armory. In silico or computer-aided ligand-based drug design is a method of rational drug design that takes multiple chemometrics (i.e., multi-quantitative structure-activity relationship methods) into account for virtually selecting or developing a series of probable selective MMP2 inhibitors. Though existing matrix metalloproteinase inhibitors have shown plausible pan-matrix metalloproteinase (MMP) activity, they have resulted in various adverse effects leading to their being rescinded in later phases of clinical trials. Therefore a review of the ligand-based designing methods of MMP2 inhibitors would result in an explicit route map toward successfully designing and synthesizing novel and selective MMP2 inhibitors.
|
8
|
Singh R, Ganeshpurkar A, Ghosh P, Pokle AV, Kumar D, Singh RB, Singh SK, Kumar A. Classification of beta-site amyloid precursor protein cleaving enzyme 1 inhibitors by using machine learning methods. Chem Biol Drug Des 2021; 98:1079-1097. [PMID: 34592057 DOI: 10.1111/cbdd.13965] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/17/2021] [Revised: 09/18/2021] [Accepted: 09/26/2021] [Indexed: 11/28/2022]
Abstract
The beta-site amyloid precursor protein cleaving enzyme 1 (BACE1) is a transmembrane aspartyl protease that cleaves amyloid precursor protein (APP) at the β-site. The sequential proteolytic cleavage of APP, first by β-secretase and then by the γ-secretase complex, leads to the production and release of amyloid-β peptide, a pathological hallmark of Alzheimer's disease (AD). BACE1 inhibitors are reported to possess considerable potential in decreasing the level of amyloid-β in the brain and preventing the progression of AD. A classification study has been conducted on 3536 diverse BACE1 inhibitors, obtained from the BindingDB database, by extracting two types of descriptors, namely molecular properties (Mordred) and fingerprints (PubChem, MACCS and KRFP). Furthermore, based on the descriptors, various machine learning algorithms such as Naïve Bayesian (NB), k-nearest neighbours (kNN), support vector machine (SVM), random forest (RF) and gradient-boosted algorithms (XGB) were applied to develop classification models. The performance of the models was evaluated using the accuracy, precision, recall and F1 score on the test set. The best NB, kNN, SVM, RF and XGB classifiers had F1 scores of 0.74, 0.85, 0.86, 0.87 and 0.87, respectively. The 3536 diverse BACE1 inhibitors were clustered into 11 subsets, and the structural features of each subset were evaluated. The important fragments present in active and inactive compounds were also identified. The model developed in the study would serve as a valuable tool for designing BACE1 inhibitors and for virtual screening of molecules to identify them.
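The precision, recall and F1 score named in this abstract follow the usual definitions for binary classification. The sketch below computes them from confusion-matrix counts; the function name and the example counts are illustrative assumptions and do not come from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented counts for illustration: 87 actives found, 13 false alarms, 13 actives missed.
print(precision_recall_f1(tp=87, fp=13, fn=13))
```

Because F1 ignores true negatives, it is often preferred over accuracy when active and inactive classes are imbalanced.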
Affiliation(s)
- Ravi Singh
- Pharmaceutical Chemistry Research Laboratory 1, Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
| | - Ankit Ganeshpurkar
- Pharmaceutical Chemistry Research Laboratory 1, Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
| | - Powsali Ghosh
- Pharmaceutical Chemistry Research Laboratory 1, Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
| | - Ankit Vyankatrao Pokle
- Pharmaceutical Chemistry Research Laboratory 1, Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
| | | | - Ravi Bhushan Singh
- Institute of Pharmacy Harischandra PG College, Bawanbigha, Varanasi, India
| | - Sushil Kumar Singh
- Pharmaceutical Chemistry Research Laboratory 1, Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
| | - Ashok Kumar
- Pharmaceutical Chemistry Research Laboratory 1, Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
| |
|
9
|
Zhu J, Jiang Y, Jia L, Xu L, Cai Y, Chen Y, Zhu N, Li H, Jin J. A multi-conformational virtual screening approach based on machine learning targeting PI3Kγ. Mol Divers 2021; 25:1271-1282. [PMID: 34160714 DOI: 10.1007/s11030-021-10243-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/14/2021] [Accepted: 06/03/2021] [Indexed: 12/13/2022]
Abstract
The development of selective PI3Kγ inhibitors has attracted increasing attention, but the unique structural features of the PI3Kγ protein make it a major challenge. In the present study, a virtual screening strategy based on machine learning with multiple PI3Kγ protein structures was developed to screen for novel PI3Kγ inhibitors. First, six mainstream docking programs were evaluated for their scoring power and screening power; CDOCKER and Glide showed satisfactory reliability and accuracy against the PI3Kγ system. Next, virtual screening integrating multiple PI3Kγ protein structures was demonstrated to significantly improve the screening enrichment rate compared with that obtained with an individual protein structure. Last, a multi-conformational Naïve Bayesian Classification model with the optimal docking programs was constructed, and it performed well in the screening of PI3Kγ inhibitors. Taken together, the current study could provide some guidance for docking-based virtual screening to discover novel PI3Kγ inhibitors.
Affiliation(s)
- Jingyu Zhu
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China.
| | - Yingmin Jiang
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China
| | - Lei Jia
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, 213001, China
| | - Yanfei Cai
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China
| | - Yun Chen
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China
| | - Nannan Zhu
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China
| | - Huazhong Li
- School of Biotechnology, Jiangnan University, Wuxi, 214122, Jiangsu, China
| | - Jian Jin
- School of Pharmaceutical Sciences, Jiangnan University, Wuxi, 214122, Jiangsu, China.
| |
|
10
|
Zhao J, Xu P, Liu X, Ji X, Li M, Dev S, Qu X, Lu W, Niu B. Application of machine learning methods for the development of antidiabetic drugs. Curr Pharm Des 2021; 28:260-271. [PMID: 34161205 DOI: 10.2174/1381612827666210622104428] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/26/2020] [Accepted: 05/10/2021] [Indexed: 11/22/2022]
Abstract
Diabetes is a chronic non-communicable disease caused by several different routes, which has attracted increasing attention. In order to speed up the development of new selective drugs, machine learning (ML) technology has been applied in the process of diabetes drug development, which opens up a new blueprint for drug design. This review provides a comprehensive portrayal of the application of ML in antidiabetic drug use.
Affiliation(s)
- Juanjuan Zhao
- Department of Chemistry, College of Sciences, Shanghai University, 200444, China
| | - Pengcheng Xu
- Materials Genome Institute, Shanghai University, Shanghai 200444, China
| | - Xiujuan Liu
- Department of Chemistry, College of Sciences, Shanghai University, 200444, China
| | - Xiaobo Ji
- Department of Chemistry, College of Sciences, Shanghai University, 200444, China
| | - Minjie Li
- Department of Chemistry, College of Sciences, Shanghai University, 200444, China
| | - Sooranna Dev
- Department of Obstetrics and Gynaecology, Imperial College London, Fulham Road, London SW10 9 NH, United Kingdom
| | - Xiaosheng Qu
- National Engineering Laboratory of Southwest Endangered Medicinal Resources Development, Guangxi Botanical Garden of Medicinal Plants, No. 189, Changgang Road, 530023, Nanning, China
| | - Wencong Lu
- Department of Chemistry, College of Sciences, Shanghai University, 200444, China
| | - Bing Niu
- School of Life Sciences, Shanghai University, 200444, China
| |
|
11
|
Wang L, Niu D, Wang X, Khan J, Shen Q, Xue Y. A Novel Machine Learning Strategy for the Prediction of Antihypertensive Peptides Derived from Food with High Efficiency. Foods 2021; 10:foods10030550. [PMID: 33800877 PMCID: PMC7999667 DOI: 10.3390/foods10030550] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/10/2020] [Revised: 03/01/2021] [Accepted: 03/03/2021] [Indexed: 12/22/2022] Open
Abstract
Strategies to screen antihypertensive peptides with high throughput and rapid speed will doubtlessly contribute to the treatment of hypertension. Food-derived antihypertensive peptides can reduce blood pressure without side effects. In the present study, a novel model based on the eXtreme Gradient Boosting (XGBoost) algorithm was developed and compared with the dominating machine learning models. To further reflect on the reliability of the method in a real situation, the optimized XGBoost model was utilized to predict the antihypertensive degree of the k-mer peptides cutting from six key proteins in bovine milk, and the peptide-protein docking technology was introduced to verify the findings. The results showed that the XGBoost model achieved outstanding performance, with an accuracy of 86.50% and area under the receiver operating characteristic curve of 94.11%, which were better than the other models. Using the XGBoost model, the prediction of antihypertensive peptides derived from milk protein was consistent with the peptide-protein docking results, and was more efficient. Our results indicate that using the XGBoost algorithm as a novel auxiliary tool is feasible to screen for antihypertensive peptides derived from food, with high throughput and high efficiency.
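The idea of cutting k-mer peptides from a protein, as described in this abstract, can be sketched as a sliding window over the sequence. The function name, the default length range, and the toy input are assumptions for illustration; the paper's actual k values and protein sequences are not given here.

```python
def kmer_peptides(sequence, k_min=2, k_max=4):
    """Enumerate every unique contiguous peptide of length k_min..k_max in a sequence."""
    peptides = set()
    for k in range(k_min, k_max + 1):
        # Slide a window of width k across the sequence.
        for start in range(len(sequence) - k + 1):
            peptides.add(sequence[start:start + k])
    return sorted(peptides)


# Toy sequence, invented for illustration only.
print(kmer_peptides("VPPIPP", k_min=2, k_max=3))
```

Unlike enzyme-specific virtual digestion, this enumeration ignores cleavage rules, so it produces every candidate fragment, including ones that no protease would actually release.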
Affiliation(s)
- Liyang Wang
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Dantong Niu
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China;
| | - Xiaoya Wang
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Jabir Khan
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Qun Shen
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Yong Xue
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| |
|
12
|
Chandrasekaran S, Luna-Vital D, de Mejia EG. Identification and Comparison of Peptides from Chickpea Protein Hydrolysates Using Either Bromelain or Gastrointestinal Enzymes and Their Relationship with Markers of Type 2 Diabetes and Bitterness. Nutrients 2020; 12:nu12123843. [PMID: 33339265 PMCID: PMC7765824 DOI: 10.3390/nu12123843] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 11/28/2020] [Revised: 12/13/2020] [Accepted: 12/13/2020] [Indexed: 12/17/2022] Open
Abstract
The chickpea (Cicer arietinum L.) is one of the most important pulses worldwide. The objective was to identify, compare and evaluate peptides from chickpea hydrolysates produced by two enzymatic treatments. The antidiabetic potential and bitterness of the peptides and induction of bitter receptors were identified in silico. Proteins were isolated from the Kabuli variety. Peptides were produced from the proteins using a simulated digestive system (pepsin/pancreatin, 1:50 Enzyme/Protein, E/P), and these peptides were compared with those produced via bromelain hydrolysis (1:50 E/P). The protein profiles, sequences and characteristics of the peptides were evaluated. The biochemical inhibition and molecular docking of dipeptidyl peptidase-IV (DPP-IV), α-amylase and α-glucosidase were also studied. The molecular docking identified peptides from enzymatic hydrolysis as inhibitors of DPP-IV. The high hydrophobicity of the peptides indicated the potential for bitterness. There was no correlation between peptide length and DPP-IV binding. Peptides sequenced from the pepsin/pancreatin hydrolysates, PHPATSGGGL and YVDGSGTPLT, had greater affinity for the DPP-IV catalytic site than the peptides from the bromelain hydrolysates. These results are in agreement with their biochemical inhibition, when considering the inhibition of sitagliptin (54.3 µg/mL) as a standard. The bitter receptors hTAS2R38, hTAS2R5, hTAS2R7 and hTAS2R14 were stimulated by most sequences, which could be beneficial in the treatment of type 2 diabetes. Chickpea hydrolysates could be utilized as functional ingredients to be included in the diet for the prevention of diabetes.
Affiliation(s)
- Subhiksha Chandrasekaran
- Department of Food Science and Human Nutrition, University of Illinois at Urbana-Champaign, 228 ERML Bldg, 1201 W Gregory Drive, Urbana, IL 61801, USA
- Diego Luna-Vital
- Department of Food Science and Human Nutrition, University of Illinois at Urbana-Champaign, 228 ERML Bldg, 1201 W Gregory Drive, Urbana, IL 61801, USA
- Elvira Gonzalez de Mejia
- Department of Food Science and Human Nutrition, University of Illinois at Urbana-Champaign, 228 ERML Bldg, 1201 W Gregory Drive, Urbana, IL 61801, USA
|
13
|
Hao M, Bryant SH, Wang Y. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions. Brief Bioinform 2020; 20:1465-1474. [PMID: 29420684 DOI: 10.1093/bib/bby010] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 09/26/2017] [Revised: 01/18/2018] [Indexed: 12/25/2022]
Abstract
While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not yet improved, prompting researchers to look for new strategies of drug discovery. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, and these have proved successful in drug discovery. Undoubtedly, the availability of openly accessible data from basic chemical biology research and the success of human genome sequencing are crucial to developing effective in silico drug repositioning methods that allow the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with publicly accessible source code for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in the R programming language, and compared these algorithms by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in their projects. The source code for DTI predictions is available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred.
|
14
|
Abhari S, Niakan Kalhori SR, Ebrahimi M, Hasannejadasl H, Garavand A. Artificial Intelligence Applications in Type 2 Diabetes Mellitus Care: Focus on Machine Learning Methods. Healthc Inform Res 2019; 25:248-261. [PMID: 31777668 PMCID: PMC6859270 DOI: 10.4258/hir.2019.25.4.248] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 05/14/2019] [Revised: 10/06/2019] [Accepted: 10/09/2019] [Indexed: 12/18/2022]
Abstract
Objectives: The incidence of type 2 diabetes mellitus has increased significantly in recent years. Artificial intelligence (AI) applications in healthcare are now used for diagnosis, therapeutic decision making, and outcome prediction, especially in type 2 diabetes mellitus. This study aimed to identify the AI applications for type 2 diabetes mellitus care. Methods: This is a review conducted in 2018. We searched the PubMed, Web of Science, and Embase scientific databases, based on a combination of related MeSH terms. The article selection process was based on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Finally, 31 articles were selected after inclusion and exclusion criteria were applied. Data were gathered using a data extraction form, then summarized and reported based on the study objectives. Results: The main applications of AI for type 2 diabetes mellitus care were screening and diagnosis in different stages. Among all of the reviewed AI methods, machine learning methods were the most commonly applied techniques (71%, n = 22). Many applications were in multi-method forms (23%). Among the machine learning algorithms, support vector machine (21%) and naive Bayesian (19%) were the most commonly used methods. The most important variables used in the selected studies were body mass index, fasting blood sugar, blood pressure, HbA1c, triglycerides, low-density lipoprotein, high-density lipoprotein, and demographic variables. Conclusions: It is recommended to select optimal algorithms by testing various techniques. Support vector machine and naive Bayesian might achieve better performance than other applications due to the type of variables and targets in diabetes-related outcomes classification.
Affiliation(s)
- Shahabeddin Abhari
- Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
- Sharareh R Niakan Kalhori
- Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
- Mehdi Ebrahimi
- Department of Internal Medicine, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Research Institute, Tehran University of Medical Sciences, Tehran, Iran
- Hajar Hasannejadasl
- Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
- Ali Garavand
- Department of Health Information Management and Technology, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|
15
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 351] [Impact Index Per Article: 70.2] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
|
16
|
Hao M, Bryant SH, Wang Y. A new chemoinformatics approach with improved strategies for effective predictions of potential drugs. J Cheminform 2018; 10:50. [PMID: 30311095 PMCID: PMC6755712 DOI: 10.1186/s13321-018-0303-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/02/2018] [Accepted: 10/02/2018] [Indexed: 12/24/2022]
Abstract
Background: Fast and accurate identification of potential drug candidates against therapeutic targets (i.e., drug–target interactions, DTIs) is a fundamental step in the early drug discovery process. However, experimental determination of DTIs is time-consuming and costly, especially for testing the associations between the entire chemical and genomic spaces. Therefore, computationally efficient algorithms with accurate predictions are required to achieve such a challenging task. In this work, we design a new chemoinformatics approach derived from neighbor-based collaborative filtering (NBCF) to infer potential drug candidates for targets of interest. One of the fundamental steps of NBCF in the application of DTI predictions is to accurately measure the similarity between drugs solely based on the known DTI profiles. However, commonly used similarity calculation methods such as COSINE may be noise-prone due to the extremely sparse property of the DTI bipartite network, which decreases the model performance of NBCF. We herein propose three strategies to remedy this problem: (1) adopting a positive pointwise mutual information (PPMI)-based similarity metric, which is noise-immune to some extent; (2) performing low-rank approximation of the original prediction scores; (3) incorporating auxiliary (complementary) information to produce the final predictions. Results: We test the proposed methods on three benchmark datasets and the results indicate that our strategies help to improve the NBCF performance for DTI predictions. Compared with the prior algorithm, our methods exhibit better results as assessed by a recall-based evaluation metric. Conclusions: A new chemoinformatics approach with improved strategies was successfully developed to predict potential DTIs. Among them, the model based on the sparsity-resistant PPMI similarity metric exhibits the best performance, which may be helpful to researchers for identifying potential drugs against therapeutic targets of interest, and can also be applied to related research such as identifying candidate disease genes.
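As a rough illustration of strategy (1), the sketch below computes a PPMI-based similarity between drugs from a binary drug-by-target profile matrix and uses it for neighbor-based scoring. It is not the authors' implementation: the co-occurrence definition (number of shared targets), the weighted-average scoring rule, the function names, and the toy data are all assumptions made for this example, and the low-rank and auxiliary-information steps are omitted.

```python
import math

def cooccurrence(profiles):
    """Count shared targets for every pair of distinct drugs (diagonal stays 0)."""
    n = len(profiles)
    co = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                co[i][j] = float(sum(a * b for a, b in zip(profiles[i], profiles[j])))
    return co

def ppmi(co):
    """Positive pointwise mutual information: max(0, log(p(i,j) / (p(i) p(j))))."""
    n = len(co)
    row = [sum(r) for r in co]
    total = sum(row)
    sim = [[0.0] * n for _ in range(n)]
    if total == 0:
        return sim
    for i in range(n):
        for j in range(n):
            if co[i][j] > 0:  # zero counts get similarity 0 instead of -infinity
                pmi = math.log(co[i][j] * total / (row[i] * row[j]))
                sim[i][j] = max(pmi, 0.0)
    return sim

def neighbor_scores(profiles, sim):
    """Score each drug-target pair as a similarity-weighted average of neighbor profiles."""
    n_targets = len(profiles[0])
    scores = []
    for weights in sim:
        norm = sum(weights)
        if norm == 0:
            scores.append([0.0] * n_targets)
            continue
        scores.append([
            sum(w * p[t] for w, p in zip(weights, profiles)) / norm
            for t in range(n_targets)
        ])
    return scores

# Hypothetical toy data: rows are drugs, columns are targets (1 = known interaction).
profiles = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
sim = ppmi(cooccurrence(profiles))
scores = neighbor_scores(profiles, sim)
```

In this toy case the last drug has no recorded interaction with the fourth target, but its only neighbor does, so that pair receives a nonzero score. Pairs of drugs that share no targets get a similarity of exactly zero, which is one way a PPMI-style metric avoids assigning weight to unrelated drugs in a sparse network.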
Affiliation(s)
- Ming Hao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Stephen H Bryant
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Yanli Wang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
|
17
|
Carpenter KA, Huang X. Machine Learning-based Virtual Screening and Its Applications to Alzheimer's Drug Discovery: A Review. Curr Pharm Des 2018; 24:3347-3358. [PMID: 29879881 PMCID: PMC6327115 DOI: 10.2174/1381612824666180607124038] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Received: 04/24/2018] [Revised: 05/31/2018] [Accepted: 06/01/2018] [Indexed: 01/11/2023]
Abstract
BACKGROUND: Virtual Screening (VS) has emerged as an important tool in the drug development process, as it conducts efficient in silico searches over millions of compounds, ultimately increasing yields of potential drug leads. As a subset of Artificial Intelligence (AI), Machine Learning (ML) is a powerful way of conducting VS for drug leads. ML for VS generally involves assembling a filtered training set of compounds, comprising known actives and inactives. After training, the model is validated and, if sufficiently accurate, used on previously unseen databases to screen for novel compounds with the desired drug target binding activity. OBJECTIVE: The study aims to review ML-based methods used for VS and applications to Alzheimer's Disease (AD) drug discovery. METHODS: To update the current knowledge on ML for VS, we review thorough backgrounds, explanations, and VS applications of the following ML techniques: Naïve Bayes (NB), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), Random Forests (RF), and Artificial Neural Networks (ANN). RESULTS: All techniques have found success in VS, but the future of VS is likely to lean more heavily toward the use of neural networks - and more specifically, Convolutional Neural Networks (CNN), which are a subset of ANN that utilize convolution. We additionally conceptualize a workflow for conducting ML-based VS for potential therapeutics for AD, a complex neurodegenerative disease with no known cure or means of prevention. This serves both as an example of how to apply the concepts introduced earlier in the review and as a potential workflow for future implementation. CONCLUSION: Different ML techniques are powerful tools for VS, although each has advantages and disadvantages. ML-based VS can be applied to AD drug development.
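The train, validate, then screen loop described in the background can be sketched in a few lines. This is a toy illustration only, assuming a k-nearest-neighbors classifier over set-based fingerprints with Tanimoto similarity; the fingerprints, compound names, the value of k, and the 0.9 accuracy gate are hypothetical and are not taken from the review.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def knn_score(query, training, k=3):
    """Fraction of the k most similar training compounds that are active."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)

def accuracy(validation, training, k=3, cutoff=0.5):
    """Share of held-out compounds whose predicted class matches the known label."""
    correct = sum(
        (knn_score(fp, training, k) >= cutoff) == bool(label)
        for fp, label in validation
    )
    return correct / len(validation)

def screen(library, training, k=3, cutoff=0.5):
    """Return (name, score) for library compounds predicted active, best first."""
    scored = [(name, knn_score(fp, training, k)) for name, fp in library.items()]
    return sorted((s for s in scored if s[1] >= cutoff), key=lambda s: s[1], reverse=True)

# Step 1: hypothetical training set of known actives (1) and inactives (0).
training = [
    (frozenset({1, 2, 3}), 1),
    (frozenset({1, 2, 4}), 1),
    (frozenset({2, 3, 4}), 1),
    (frozenset({7, 8, 9}), 0),
    (frozenset({7, 8, 10}), 0),
    (frozenset({8, 9, 10}), 0),
]
# Step 2: held-out compounds used only for validation.
validation = [(frozenset({1, 3, 4}), 1), (frozenset({7, 9, 10}), 0)]
# Step 3: an unseen library, screened only if validation accuracy is high enough.
library = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({7, 8, 9, 10}),
    "cmpd_C": frozenset({2, 3, 5}),
}

val_acc = accuracy(validation, training)
hits = screen(library, training) if val_acc >= 0.9 else []
```

The explicit accuracy gate before `screen` mirrors the "if sufficiently accurate" condition in the abstract; in practice the classifier, the fingerprint type, and the validation protocol would be chosen per target.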
Affiliation(s)
- Kristy A. Carpenter
- Neurochemistry Laboratory, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Charlestown, MA 02129, USA
- Xudong Huang
- Neurochemistry Laboratory, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Charlestown, MA 02129, USA
|