1
Taujale R, Gravel N, Zhou Z, Yeung W, Kochut K, Kannan N. Informatic challenges and advances in illuminating the druggable proteome. Drug Discov Today 2024; 29:103894. [PMID: 38266979 DOI: 10.1016/j.drudis.2024.103894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 01/08/2024] [Accepted: 01/17/2024] [Indexed: 01/26/2024]
Abstract
The understudied members of the druggable proteomes offer promising prospects for drug discovery efforts. While large-scale initiatives have generated valuable functional information on understudied members of the druggable gene families, translating this information into actionable knowledge for drug discovery requires specialized informatics tools and resources. Here, we review the unique informatics challenges and advances in annotating understudied members of the druggable proteome. We demonstrate the application of statistical evolutionary inference tools, knowledge graph mining approaches, and protein language models in illuminating understudied protein kinases, pseudokinases, and ion channels.
Affiliation(s)
- Rahil Taujale
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, USA
- Nathan Gravel
  - Institute of Bioinformatics, University of Georgia, Athens, GA, USA
- Wayland Yeung
  - Institute of Bioinformatics, University of Georgia, Athens, GA, USA
- Krystof Kochut
  - School of Computing, University of Georgia, Athens, GA, USA
- Natarajan Kannan
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, USA; Institute of Bioinformatics, University of Georgia, Athens, GA, USA

2
Soleymani S, Gravel N, Huang LC, Yeung W, Bozorgi E, Bendzunas NG, Kochut KJ, Kannan N. Dark kinase annotation, mining, and visualization using the Protein Kinase Ontology. PeerJ 2023; 11:e16087. [PMID: 38077442 PMCID: PMC10704995 DOI: 10.7717/peerj.16087] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/05/2023] [Accepted: 08/22/2023] [Indexed: 12/18/2023] Open
Abstract
The Protein Kinase Ontology (ProKinO) is an integrated knowledge graph that conceptualizes the complex relationships among protein kinase sequence, structure, function, and disease in a human- and machine-readable format. In this study, we have significantly expanded ProKinO by incorporating additional data on expression patterns and drug interactions. Furthermore, we have developed a completely new browser from the ground up to render the knowledge graph visible and interactive on the web. We have enriched ProKinO with new classes and relationships that capture information on kinase ligand binding sites, expression patterns, and functional features. These additions extend ProKinO's capabilities as a discovery tool, enabling it to uncover novel insights about understudied members of the protein kinase family. We next demonstrate the application of ProKinO. Specifically, through graph mining and aggregate SPARQL queries, we identify the p21-activated protein kinase 5 (PAK5) as one of the most frequently mutated dark kinases in human cancers with abnormal expression in multiple cancers, including a previously unappreciated role in acute myeloid leukemia. We have identified recurrent oncogenic mutations in the PAK5 activation loop predicted to alter substrate binding and phosphorylation. Additionally, we have identified common ligand/drug binding residues in PAK family kinases, underscoring ProKinO's potential application in drug discovery. The updated ontology browser and the addition of a web component, ProtVista, which enables interactive mining of kinase sequence annotations in 3D structures and AlphaFold models, provide a valuable resource for the signaling community. The updated ProKinO database is accessible at https://prokino.uga.edu.
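The aggregate-query idea mentioned in this abstract can be sketched in plain Python. This is only an illustration of a group-and-count over graph triples, not ProKinO's schema or the authors' SPARQL: the predicate names (`hasMutation`, `isDark`), the kinase identifiers, and the function `rank_dark_kinases_by_mutations` are all hypothetical placeholders, and the data are made up.

```python
from collections import Counter

# Hypothetical (subject, predicate, object) triples. The identifiers and
# predicates are placeholders, not terms from the ProKinO ontology.
toy_triples = [
    ("KINASE_A", "hasMutation", "m1"),
    ("KINASE_A", "hasMutation", "m2"),
    ("KINASE_B", "hasMutation", "m3"),
    ("KINASE_C", "hasMutation", "m4"),
    ("KINASE_C", "hasMutation", "m5"),
    ("KINASE_C", "hasMutation", "m6"),
    ("KINASE_A", "isDark", "true"),
    ("KINASE_C", "isDark", "true"),
]


def rank_dark_kinases_by_mutations(triples):
    """Count mutations per dark kinase and return them sorted by count (descending)."""
    # Select the subjects flagged as dark (understudied) kinases.
    dark = {s for s, p, o in triples if p == "isDark" and o == "true"}
    # Group mutation triples by subject, keeping only dark kinases.
    counts = Counter(s for s, p, _ in triples if p == "hasMutation" and s in dark)
    return counts.most_common()


ranking = rank_dark_kinases_by_mutations(toy_triples)
```

In SPARQL the same pattern would typically be a `COUNT` with `GROUP BY` and `ORDER BY DESC`; the Python version only mirrors that logic on an in-memory list.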
Affiliation(s)
- Saber Soleymani
  - Department of Computer Science, University of Georgia, Athens, GA, United States
- Nathan Gravel
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Liang-Chin Huang
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Wayland Yeung
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Elika Bozorgi
  - Department of Computer Science, University of Georgia, Athens, GA, United States
- Nathaniel G. Bendzunas
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States
- Krzysztof J. Kochut
  - Department of Computer Science, University of Georgia, Athens, GA, United States
- Natarajan Kannan
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States

3
Yin X, Liao H, Yun H, Lin N, Li S, Xiang Y, Ma X. Artificial intelligence-based prediction of clinical outcome in immunotherapy and targeted therapy of lung cancer. Semin Cancer Biol 2022; 86:146-159. [PMID: 35963564 DOI: 10.1016/j.semcancer.2022.08.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/27/2022] [Revised: 08/06/2022] [Accepted: 08/08/2022] [Indexed: 11/26/2022]
Abstract
Lung cancer accounts for the largest share of malignancy-related deaths, and most patients are diagnosed at an advanced stage. Immunotherapy and targeted therapy have advanced considerably in clinical use for lung cancer patients, yet their efficacy is inconsistent. The response rate of these therapies varies among patients. Some biomarkers have been proposed to predict the outcomes of immunotherapy and targeted therapy, including programmed cell death-ligand 1 (PD-L1) expression and oncogene mutations. Nevertheless, the detection tests are invasive and time-consuming and require substantial tumor tissue. The predictive performance of conventional biomarkers is also unsatisfactory. Therefore, novel biomarkers are needed to effectively predict the outcomes of immunotherapy and targeted therapy. The application of artificial intelligence (AI) can be a possible solution, as it has several advantages. AI can help identify features that humans cannot readily exploit and can perform repetitive tasks. By combining AI methods with radiomics, pathology, genomics, transcriptomics, proteomics, and clinical data, integrated models have shown predictive value in immunotherapy and targeted therapy, significantly improving the precision treatment of lung cancer patients. Herein, we review the application of AI in predicting the outcomes of immunotherapy and targeted therapy in lung cancer patients and discuss the challenges and future directions in this field.
Affiliation(s)
- Xiaomeng Yin
  - Division of Biotherapy, Cancer Center, West China Hospital and State Key Laboratory of Biotherapy, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China
- Hu Liao
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China
- Hong Yun
  - Division of Biotherapy, Cancer Center, West China Hospital and State Key Laboratory of Biotherapy, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China
- Nan Lin
  - Division of Biotherapy, Cancer Center, West China Hospital and State Key Laboratory of Biotherapy, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China
- Shen Li
  - West China School of Medicine, West China Hospital, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China
- Yu Xiang
  - Division of Biotherapy, Cancer Center, West China Hospital and State Key Laboratory of Biotherapy, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China
- Xuelei Ma
  - Division of Biotherapy, Cancer Center, West China Hospital and State Key Laboratory of Biotherapy, Sichuan University, No. 37 GuoXue Alley, Chengdu 610041, China

4
A Systematic Review of Explainable Artificial Intelligence in Terms of Different Application Domains and Tasks. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12031353] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/31/2022]
Abstract
Artificial intelligence (AI) and machine learning (ML) have recently been radically improved and are now being employed in almost every application domain to develop automated or semi-automated systems. To facilitate greater human acceptability of these systems, explainable artificial intelligence (XAI) has experienced significant growth over the last couple of years with the development of highly accurate models but with a paucity of explainability and interpretability. The literature shows evidence from numerous studies on the philosophy and methodologies of XAI. Nonetheless, there is an evident scarcity of secondary studies in connection with the application domains and tasks, let alone review studies following prescribed guidelines, that can enable researchers’ understanding of the current trends in XAI, which could lead to future research for domain- and application-specific method development. Therefore, this paper presents a systematic literature review (SLR) on the recent developments of XAI methods and evaluation metrics concerning different application domains and tasks. This study considers 137 articles published in recent years and identified through the prominent bibliographic databases. This systematic synthesis of research articles resulted in several analytical findings: XAI methods are mostly developed for safety-critical domains worldwide, deep learning and ensemble models are being exploited more than other types of AI/ML models, visual explanations are more acceptable to end-users and robust evaluation metrics are being developed to assess the quality of explanations. Research studies have been performed on the addition of explanations to widely used AI/ML models for expert users. However, more attention is required to generate explanations for general users from sensitive domains such as finance and the judicial system.
5
Born J, Huynh T, Stroobants A, Cornell WD, Manica M. Active Site Sequence Representations of Human Kinases Outperform Full Sequence Representations for Affinity Prediction and Inhibitor Generation: 3D Effects in a 1D Model. J Chem Inf Model 2021; 62:240-257. [PMID: 34905358 DOI: 10.1021/acs.jcim.1c00889] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/11/2022]
Abstract
Recent advances in deep learning have enabled the development of large-scale multimodal models for virtual screening and de novo molecular design. The human kinome with its abundant sequence and inhibitor data presents an attractive opportunity to develop proteochemometric models that exploit the size and internal diversity of this family of targets. Here, we challenge a standard practice in sequence-based affinity prediction models: instead of leveraging the full primary structure of proteins, each target is represented by a sequence of 29 discontiguous residues defining the ATP binding site. In kinase-ligand binding affinity prediction, our results show that the reduced active site sequence representation is not only computationally more efficient but consistently yields significantly higher performance than the full primary structure. This trend persists across different models, data sets, and performance metrics and holds true when predicting pIC50 for both unseen ligands and kinases. Our interpretability analysis reveals a potential explanation for the superiority of the active site models: whereas only mild statistical effects about the extraction of three-dimensional (3D) interaction sites take place in the full sequence models, the active site models are equipped with an implicit but strong inductive bias about the 3D structure stemming from the discontiguity of the active sites. Moreover, in direct comparisons, our models perform similarly or better than previous state-of-the-art approaches in affinity prediction. We then investigate a de novo molecular design task and find that the active site provides benefits in the computational efficiency, but otherwise, both kinase representations yield similar optimized affinities (for both SMILES- and SELFIES-based molecular generators). Our work challenges the assumption that the full primary structure is indispensable for modeling human kinases.
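The reduced representation described in this abstract, a short sequence built from discontiguous binding-site residues, can be illustrated with a minimal sketch. The 29 positions used in the paper are not listed here, so the sequence and positions below are made-up toy values, and `active_site_sequence` is a hypothetical helper rather than the authors' code.

```python
from typing import Sequence


def active_site_sequence(full_sequence: str, positions: Sequence[int]) -> str:
    """Concatenate the residues at the given 1-based positions, in ascending order."""
    ordered = sorted(set(positions))
    if ordered and (ordered[0] < 1 or ordered[-1] > len(full_sequence)):
        raise ValueError("position outside the sequence")
    # Residues that are far apart in the primary structure become neighbours
    # in the reduced string.
    return "".join(full_sequence[p - 1] for p in ordered)


# Toy example: a made-up 20-residue sequence and five made-up positions
# (assumptions for illustration only, not a real kinase or real site indices).
toy_sequence = "MKTAYIAKQRQISFVKSHFS"
toy_positions = [2, 3, 8, 15, 16]
toy_active_site = active_site_sequence(toy_sequence, toy_positions)
```

A sequence model fed `toy_active_site` instead of `toy_sequence` sees a much shorter input, which is consistent with the efficiency benefit the abstract reports; the choice of positions is what would carry the structural information.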
Affiliation(s)
- Jannis Born
  - IBM Research Europe, 8804 Rüschlikon, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Tien Huynh
  - IBM Research, Yorktown Heights, New York 10598, United States
- Astrid Stroobants
  - Department of Chemistry, Imperial College London, SW7 2AZ London, United Kingdom
- Wendy D Cornell
  - IBM Research, Yorktown Heights, New York 10598, United States

6
Zhou Y, Zhang Y, Lian X, Li F, Wang C, Zhu F, Qiu Y, Chen Y. Therapeutic target database update 2022: facilitating drug discovery with enriched comparative data of targeted agents. Nucleic Acids Res 2021; 50:D1398-D1407. [PMID: 34718717 PMCID: PMC8728281 DOI: 10.1093/nar/gkab953] [Citation(s) in RCA: 339] [Impact Index Per Article: 84.8] [Received: 09/09/2021] [Revised: 09/29/2021] [Accepted: 10/04/2021] [Indexed: 11/14/2022] Open
Abstract
Drug discovery relies on the knowledge of not only drugs and targets, but also the comparative agents and targets. These include poor binders and non-binders for developing discovery tools, prodrugs for improved therapeutics, co-targets of therapeutic targets for multi-target strategies and off-target investigations, and the collective structure-activity and drug-likeness landscapes of enhanced drug feature. However, such valuable data are inadequately covered by the available databases. In this study, a major update of the Therapeutic Target Database, previously featured in NAR, was therefore introduced. This update includes (a) 34 861 poor binders and 12 683 non-binders of 1308 targets; (b) 534 prodrug-drug pairs for 121 targets; (c) 1127 co-targets of 672 targets regulated by 642 approved and 624 clinical trial drugs; (d) the collective structure-activity landscapes of 427 262 active agents of 1565 targets; (e) the profiles of drug-like properties of 33 598 agents of 1102 targets. Moreover, a variety of additional data and function are provided, which include the cross-links to the target structure in PDB and AlphaFold, 159 and 1658 newly emerged targets and drugs, and the advanced search function for multi-entry target sequences or drug structures. The database is accessible without login requirement at: https://idrblab.org/ttd/.
Affiliation(s)
- Ying Zhou
  - State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310000, China
- Yintao Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xichen Lian
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Fengcheng Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Chaoxin Wang
  - Department of Computer Science, Kansas State University, Manhattan 66506, USA
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Yunqing Qiu
  - State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310000, China
- Yuzong Chen
  - State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, The Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China; Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China