1. Qian Y, Li X, Wu J, Zhang Q. MMCL-CPI: A multi-modal compound-protein interaction prediction model incorporating contrastive learning pre-training. Comput Biol Chem 2024; 112:108137. [PMID: 39079285 DOI: 10.1016/j.compbiolchem.2024.108137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 05/31/2024] [Accepted: 06/20/2024] [Indexed: 09/13/2024]
Abstract
MOTIVATION Compound-protein interaction (CPI) prediction plays a crucial role in drug discovery and drug repositioning. Early researchers relied on time-consuming and labor-intensive wet laboratory experiments. However, the advent of deep learning has significantly accelerated this progress. Most existing deep learning methods utilize deep neural networks to extract compound features from sequences and graphs, either separately or in combination. Our team's previous research has demonstrated that compound images contain valuable information that can be leveraged for the CPI task. However, there is a scarcity of multimodal methods that effectively combine sequence and image representations of compounds in CPI. Currently, the use of text-image pairs for contrastive language-image pre-training is a popular approach in the multimodal field. Further research is needed to explore how the integration of sequence and image representations can enhance the accuracy of the CPI task. RESULTS This paper presents a novel method called MMCL-CPI, which encompasses two key highlights: 1) First, we propose extracting compound features from two modalities: one-dimensional SMILES and two-dimensional images. This approach enables us to capture both sequence and spatial features, enhancing the prediction accuracy for CPI. Based on this, we design a novel multimodal model. 2) Second, we introduce a multimodal pre-training strategy that leverages contrastive learning on a large-scale unlabeled dataset to establish the correspondence between a compound's SMILES string and its image. This pre-training approach significantly improves compound feature representations for the downstream CPI task. Our method has shown competitive results on multiple datasets.
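The pre-training idea in this abstract (pairing each SMILES string with the image of the same compound) can be illustrated with a CLIP-style symmetric contrastive loss. The sketch below is only an illustration under stated assumptions: the abstract does not give the loss, so the InfoNCE form, the temperature value, and the function names `l2_normalize` and `symmetric_contrastive_loss` are all hypothetical, and the toy embeddings stand in for encoder outputs.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def symmetric_contrastive_loss(smiles_emb, image_emb, temperature=0.07):
    """CLIP-style symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of smiles_emb and row i of image_emb are assumed to describe the
    same compound; every other row in the batch acts as a negative.
    The temperature of 0.07 is an assumed default, not taken from the paper.
    """
    s = [l2_normalize(v) for v in smiles_emb]
    m = [l2_normalize(v) for v in image_emb]
    n = len(s)
    # Cosine-similarity logits between every SMILES/image pair in the batch.
    logits = [
        [sum(a * b for a, b in zip(s[i], m[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Mean cross-entropy where the correct class of row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            mx = max(row)
            log_z = mx + math.log(sum(math.exp(x - mx) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    columns = [list(c) for c in zip(*logits)]
    # Average the SMILES-to-image and image-to-SMILES directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(columns))

# Toy check: matched pairs should give a lower loss than mismatched pairs.
smiles_vectors = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_contrastive_loss(smiles_vectors, [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_contrastive_loss(smiles_vectors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimising a loss of this shape pulls the two views of one compound together and pushes views of different compounds apart, which is the correspondence the abstract describes.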
Affiliation(s)
- Ying Qian
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
- Xinyi Li
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
- Jian Wu
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
- Qian Zhang
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
2. Tian Z, Yu Y, Ni F, Zou Q. Drug-target interaction prediction with collaborative contrastive learning and adaptive self-paced sampling strategy. BMC Biol 2024; 22:216. [PMID: 39334132 PMCID: PMC11437672 DOI: 10.1186/s12915-024-02012-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 09/06/2024] [Indexed: 09/30/2024] Open
Abstract
BACKGROUND Drug-target interaction (DTI) prediction plays a pivotal role in drug discovery and drug repositioning, enabling the identification of potential drug candidates. However, most previous approaches often do not fully utilize the complementary relationships among multiple biological networks, which limits their ability to learn more consistent representations. Additionally, the selection strategy of negative samples significantly affects the performance of contrastive learning methods. RESULTS In this study, we propose CCL-ASPS, a novel deep learning model that incorporates Collaborative Contrastive Learning (CCL) and Adaptive Self-Paced Sampling strategy (ASPS) for drug-target interaction prediction. CCL-ASPS leverages multiple networks to learn the fused embeddings of drugs and targets, ensuring their consistent representations from individual networks. Furthermore, ASPS dynamically selects more informative negative sample pairs for contrastive learning. Experiment results on the established dataset demonstrate that CCL-ASPS achieves significant improvements compared to current state-of-the-art methods. Moreover, ablation experiments confirm the contributions of the proposed CCL and ASPS strategies. CONCLUSIONS By integrating Collaborative Contrastive Learning and Adaptive Self-Paced Sampling, the proposed CCL-ASPS effectively addresses the limitations of previous methods. This study demonstrates that CCL-ASPS achieves notable improvements in DTI predictive performance compared to current state-of-the-art approaches. The case study and cold start experiments further illustrate the capability of CCL-ASPS to effectively predict previously unknown DTI, potentially facilitating the identification of new drug-target interactions.
Affiliation(s)
- Zhen Tian
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, Henan, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Yue Yu
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, Henan, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Fengming Ni
  - Department of Gastroenterology, The First Hospital of Jilin University, Changchun, 130021, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
3. Boonyarit B, Yamprasert N, Kaewnuratchadasorn P, Kinchagawat J, Prommin C, Rungrotmongkol T, Nutanong S. GraphEGFR: Multi-task and transfer learning based on molecular graph attention mechanism and fingerprints improving inhibitor bioactivity prediction for EGFR family proteins on data scarcity. J Comput Chem 2024; 45:2001-2023. [PMID: 38713612 DOI: 10.1002/jcc.27388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 04/16/2024] [Accepted: 04/19/2024] [Indexed: 05/09/2024]
Abstract
The proteins within the human epidermal growth factor receptor (EGFR) family, members of the tyrosine kinase receptor family, play a pivotal role in the molecular mechanisms driving the development of various tumors. Tyrosine kinase inhibitors, key compounds in targeted therapy, encounter challenges in cancer treatment due to emerging drug resistance mutations. Consequently, machine learning has undergone significant evolution to address the challenges of cancer drug discovery related to EGFR family proteins. However, the application of deep learning in this area is hindered by inherent difficulties associated with small-scale data, particularly the risk of overfitting. Moreover, the design of a model architecture that facilitates learning through multi-task and transfer learning, coupled with appropriate molecular representation, poses substantial challenges. In this study, we introduce GraphEGFR, a deep learning regression model designed to enhance molecular representation and model architecture for predicting the bioactivity of inhibitors against both wild-type and mutant EGFR family proteins. GraphEGFR integrates a graph attention mechanism for molecular graphs with deep and convolutional neural networks for molecular fingerprints. We observed that GraphEGFR models employing multi-task and transfer learning strategies generally achieve predictive performance comparable to existing competitive methods. The integration of molecular graphs and fingerprints adeptly captures relationships between atoms and enables both global and local pattern recognition. We further validated potential multi-targeted inhibitors for wild-type and mutant HER1 kinases, exploring key amino acid residues through molecular dynamics simulations to understand molecular interactions. 
This predictive model offers a robust strategy that could significantly contribute to overcoming the challenges of developing deep learning models for drug discovery with limited data and exploring new frontiers in multi-targeted kinase drug discovery for EGFR family proteins.
Affiliation(s)
- Bundit Boonyarit
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand
- Nattawin Yamprasert
  - School of Information, Computer, and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani, Thailand
- Jiramet Kinchagawat
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand
- Chanatkran Prommin
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand
- Thanyada Rungrotmongkol
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand
  - Center of Excellence in Structural and Computational Biology Research Unit, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Sarana Nutanong
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand
4. Zeng X, Feng PK, Li SJ, Lv SQ, Wen ML, Li Y. GNN-DDAS: Drug discovery for identifying anti-schistosome small molecules based on graph neural network. J Comput Chem 2024. [PMID: 39189298 DOI: 10.1002/jcc.27490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 08/06/2024] [Accepted: 08/09/2024] [Indexed: 08/28/2024]
Abstract
Schistosomiasis is a tropical disease that poses a significant risk to hundreds of millions of people, yet often goes unnoticed. While praziquantel, a widely used anti-schistosome drug, has a low cost and a high cure rate, it has several drawbacks. These include ineffectiveness against schistosome larvae, reduced efficacy in young children, and emerging drug resistance. Discovering new and active anti-schistosome small molecules is therefore critical, but this process presents the challenge of low accuracy in computer-aided methods. To address this issue, we proposed GNN-DDAS, a novel deep learning framework based on graph neural networks (GNN), designed for drug discovery to identify active anti-schistosome (DDAS) small molecules. Initially, a multi-layer perceptron was used to derive sequence features from various representations of small molecule SMILES. Next, GNN was employed to extract structural features from molecular graphs. Finally, the extracted sequence and structural features were then concatenated and fed into a fully connected network to predict active anti-schistosome small molecules. Experimental results showed that GNN-DDAS exhibited superior performance compared to the benchmark methods on both benchmark and real-world application datasets. Additionally, the use of GNNExplainer model allowed us to analyze the key substructure features of small molecules, providing insight into the effectiveness of GNN-DDAS. Overall, GNN-DDAS provided a promising solution for discovering new and active anti-schistosome small molecules.
Affiliation(s)
- Xin Zeng
  - College of Mathematics and Computer Science, Dali University, Dali, China
- Peng-Kun Feng
  - College of Mathematics and Computer Science, Dali University, Dali, China
- Shu-Juan Li
  - Department of Endemic Diseases, Yunnan Institute of Endemic Diseases Control and Prevention, Dali, China
- Shuang-Qing Lv
  - Institute of Surveying and Information Engineering, West Yunnan University of Applied Science, Dali, China
- Meng-Liang Wen
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China
- Yi Li
  - College of Mathematics and Computer Science, Dali University, Dali, China
5. Li G, Li S, Liang C, Xiao Q, Luo J. Drug repositioning based on residual attention network and free multiscale adversarial training. BMC Bioinformatics 2024; 25:261. [PMID: 39118000 PMCID: PMC11308596 DOI: 10.1186/s12859-024-05893-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 08/06/2024] [Indexed: 08/10/2024] Open
Abstract
BACKGROUND Conducting traditional wet experiments to guide drug development is an expensive, time-consuming and risky process. Analyzing drug function and repositioning plays a key role in identifying new therapeutic potential of approved drugs and discovering therapeutic approaches for untreated diseases. Exploring drug-disease associations has far-reaching implications for identifying disease pathogenesis and treatment. However, reliable detection of drug-disease relationships via traditional methods is costly and slow. Therefore, investigations into computational methods for predicting drug-disease associations are currently needed. RESULTS This paper presents a novel drug-disease association prediction method, RAFGAE. First, RAFGAE integrates known associations between diseases and drugs into a bipartite network. Second, RAFGAE designs the Re_GAT framework, which includes multilayer graph attention networks (GATs) and two residual networks. The multilayer GATs are utilized for learning the node embeddings, which is achieved by aggregating information from multihop neighbors. The two residual networks are used to alleviate the deep network oversmoothing problem, and an attention mechanism is introduced to combine the node embeddings from different attention layers. Third, two graph autoencoders (GAEs) with collaborative training are constructed to simulate label propagation to predict potential associations. On this basis, free multiscale adversarial training (FMAT) is introduced. FMAT enhances node feature quality through small gradient adversarial perturbation iterations, improving the prediction performance. Finally, tenfold cross-validations on two benchmark datasets show that RAFGAE outperforms current methods. In addition, case studies have confirmed that RAFGAE can detect novel drug-disease associations. CONCLUSIONS The comprehensive experimental results validate the utility and accuracy of RAFGAE. 
We believe that this method may serve as an excellent predictor for identifying unobserved disease-drug associations.
Affiliation(s)
- Guanghui Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Shuwen Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Qiu Xiao
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
6. Chen L, Liu L, Su H, Xu Y. KbhbXG: A Machine learning architecture based on XGBoost for prediction of lysine β-Hydroxybutyrylation (Kbhb) modification sites. Methods 2024; 227:27-34. [PMID: 38679187 DOI: 10.1016/j.ymeth.2024.04.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/16/2024] [Accepted: 04/20/2024] [Indexed: 05/01/2024] Open
Abstract
Lysine β-hydroxybutyrylation is an important post-translational modification (PTM) involved in various physiological and biological processes. In this research, we introduce a novel predictor KbhbXG, which utilizes XGBoost to identify β-hydroxybutyrylation modification sites based on protein sequence information. The traditional experimental methods employed for the identification of β-hydroxybutyrylated sites using proteomic techniques are both costly and time-consuming. Thus, the development of computational methods and predictors can play a crucial role in facilitating the rapid identification of β-hydroxybutyrylation sites. Our proposed KbhbXG model first utilizes machine learning algorithm XGBoost to predict β-hydroxybutyrylation modification sites. On the independent test set, KbhbXG achieves an accuracy of 0.7457, specificity of 0.7771, and an impressive area under the curve (AUC) score of 0.8172. The high AUC score achieved by our method demonstrates its potential for effectively identifying novel β-hydroxybutyrylation sites, thereby facilitating further research and exploration of the β-hydroxybutyrylation process. Also, functional analyses have revealed that different organisms preferentially engage in distinct biological processes and pathways, which can provide valuable insights for understanding the mechanism of β-hydroxybutyrylation and guide experimental verification. To promote transparency and reproducibility, we have made both the codes and dataset of KbhbXG publicly available. Researchers interested in utilizing our proposed model can access these resources at https://github.com/Lab-Xu/KbhbXG.
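The three figures quoted in this abstract (accuracy, specificity, AUC) follow standard definitions, which the sketch below spells out for a binary site predictor. It is a generic illustration, not code from the KbhbXG repository: the helper names and the toy labels and scores are made up, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than by tracing the ROC curve.

```python
def accuracy_specificity(y_true, y_pred):
    """Accuracy and specificity from binary labels (1 = modified site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    accuracy = (tp + tn) / len(y_true)
    # Specificity is the fraction of true negatives among all negatives.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities for four candidate sites.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc, spec = accuracy_specificity(labels, predictions)
auc = roc_auc(labels, scores)
print(acc, spec, auc)
```

Accuracy and specificity depend on the chosen decision threshold, whereas AUC summarises ranking quality across all thresholds, which is why the three values can differ as they do in the abstract.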
Affiliation(s)
- Leqi Chen
  - Department of Statistics, University of Science and Technology Beijing, Beijing 100083, China
- Liwen Liu
  - The Open University of China, Beijing 100039, China
- Haiyan Su
  - School of Computing, Montclair State University, NJ 07043, USA
- Yan Xu
  - Department of Statistics, University of Science and Technology Beijing, Beijing 100083, China
7. Abubakar ML, Kapoor N, Sharma A, Gambhir L, Jasuja ND, Sharma G. Artificial Intelligence in Drug Identification and Validation: A Scoping Review. Drug Res (Stuttg) 2024; 74:208-219. [PMID: 38830370 DOI: 10.1055/a-2306-8311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
The end-to-end process in the discovery of drugs involves therapeutic candidate identification, validation of identified targets, identification of hit compound series, lead identification and optimization, characterization, and formulation and development. The process is lengthy, expensive, tedious, and inefficient, with a large attrition rate for novel drug discovery. Today, the pharmaceutical industry is focused on improving the drug discovery process. Finding and selecting acceptable drug candidates effectively can significantly impact the price and profitability of new medications. Aside from the cost, there is a need to reduce the end-to-end process time, limiting the number of experiments at various stages. To achieve this, artificial intelligence (AI) has been utilized at various stages of drug discovery. The present study aims to identify the recent work that has developed AI-based models at various stages of drug discovery, identify the stages that need more attention, present the taxonomy of AI methods in drug discovery, and provide research opportunities. The study identified all publications from January 2016 to September 1, 2023 that were cited in electronic databases including Scopus, NCBI PubMed, MEDLINE, Anthropology Plus, Embase, APA PsycInfo, SOCIndex, and CINAHL. Data were extracted using a standardized form, and possible research prospects are presented based on the analysis of the extracted data.
Affiliation(s)
- Neha Kapoor
  - School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
- Asha Sharma
  - Department of Zoology, Swargiya P. N. K. S. Govt. PG College, Dausa, Rajasthan, India
- Lokesh Gambhir
  - School of Basic and Applied Sciences, Shri Guru Ram Rai University, Dehradun, Uttarakhand, India
- Gaurav Sharma
  - School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
8. Tan D, Jiang H, Li H, Xie Y, Su Y. Prediction of drug-protein interaction based on dual channel neural networks with attention mechanism. Brief Funct Genomics 2024; 23:286-294. [PMID: 37642213 DOI: 10.1093/bfgp/elad037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 07/16/2023] [Accepted: 08/08/2023] [Indexed: 08/31/2023] Open
Abstract
The precise identification of drug-protein interaction (DPI) can significantly speed up the drug discovery process. Bioassay methods are time-consuming and expensive for screening every drug-protein pair. Machine-learning-based methods cannot accurately predict a large number of DPIs. Compared with traditional computing methods, deep learning methods need less domain knowledge and have strong data learning ability. In this study, we construct a DPI prediction model based on dual channel neural networks with an efficient path attention mechanism, called DCA-DPI. The drug molecular graph and protein sequence are used as the data input of the model, and the residual graph neural network and the residual convolution network are used to learn the feature representation of the drug and protein, respectively, to obtain the feature vector of the drug and the hidden vector of protein. To get a more accurate protein feature vector, the weighted sum of the hidden vector of protein is applied using the neural attention mechanism. In the end, drug and protein vectors are concatenated and input into the fully connected layer for classification. In order to evaluate the performance of DCA-DPI, three widely used public datasets, Human, C. elegans and DUD-E, are used in the experiment. The evaluation metric values in the experiment are superior to those of other relevant methods. Experiments show that our model is efficient for DPI prediction.
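The attention step described in this abstract (a weighted sum over the protein hidden vectors, followed by concatenation with the drug vector) can be sketched as follows. This is a minimal illustration rather than the DCA-DPI implementation: the abstract does not say how attention scores are computed, so scoring each position by its dot product with the drug vector is an assumption, and the function names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(drug_vec, protein_hidden):
    """Pool per-position protein hidden vectors into one protein vector.

    Assumption: the score of each position is its dot product with the
    drug vector; the paper may use a different scoring function.
    """
    scores = [sum(d * h for d, h in zip(drug_vec, hvec)) for hvec in protein_hidden]
    weights = softmax(scores)
    dim = len(protein_hidden[0])
    pooled = [
        sum(w * hvec[k] for w, hvec in zip(weights, protein_hidden))
        for k in range(dim)
    ]
    return pooled, weights

def fuse(drug_vec, protein_hidden):
    """Concatenate the drug vector with the attention-pooled protein vector."""
    pooled, _ = attention_pool(drug_vec, protein_hidden)
    return list(drug_vec) + pooled

# Toy inputs: a 2-d drug vector and hidden vectors for two protein positions.
drug = [1.0, 0.0]
hidden = [[10.0, 0.0], [0.0, 10.0]]
pooled, weights = attention_pool(drug, hidden)
fused = fuse(drug, hidden)
print(weights, fused)
```

In a full model the fused vector would then be passed to the fully connected classification layer mentioned in the abstract.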
Affiliation(s)
- Dayu Tan
  - Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Hefei, China
- Haijun Jiang
  - Key Laboratory of Intelligent Computing and Signal Processing, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Hefei, China
- Haitao Li
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Hefei, China
- Ying Xie
  - School of Mechanical, Electrical and Information Engineering, Putian University, China
- Yansen Su
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Hefei, China
9. Liu Y, Xing L, Zhang L, Cai H, Guo M. GEFormerDTA: drug target affinity prediction based on transformer graph for early fusion. Sci Rep 2024; 14:7416. [PMID: 38548825 PMCID: PMC10979032 DOI: 10.1038/s41598-024-57879-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 03/22/2024] [Indexed: 04/01/2024] Open
Abstract
Predicting the interaction affinity between drugs and target proteins is crucial for rapid and accurate drug discovery and repositioning. Therefore, more accurate prediction of drug-target affinity (DTA) has become a key area of research in the field of drug discovery and drug repositioning. However, traditional experimental methods have disadvantages such as long operation cycles, high manpower requirements, and high economic costs, making it difficult to predict specific interactions between drugs and target proteins quickly and accurately. Some methods mainly use the SMILES sequence of drugs and the primary structure of proteins as inputs, ignoring the graph information such as bond encoding, degree centrality encoding, spatial encoding of drug molecule graphs, and the structural information of proteins such as secondary structure and accessible surface area. Moreover, previous methods were based on protein sequences to learn feature representations, neglecting the completeness of information. To address the completeness of drug and protein structure information, we propose a Transformer graph-based early fusion research approach for drug-target affinity prediction (GEFormerDTA). Our method reduces prediction errors caused by insufficient feature learning. Experimental results on Davis and KIBA datasets showed a better prediction of drug-target affinity than existing affinity prediction methods.
Affiliation(s)
- Youzhi Liu
  - Department of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
- Linlin Xing
  - Department of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
- Longbo Zhang
  - Department of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
- Hongzhen Cai
  - Department of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo, 255000, China
- Maozu Guo
  - Department of Electrical and Information Engineering, Beijing University of Architecture, Beijing, 102616, China
10. Zhang Y, Li S, Meng K, Sun S. Machine Learning for Sequence and Structure-Based Protein-Ligand Interaction Prediction. J Chem Inf Model 2024; 64:1456-1472. [PMID: 38385768 DOI: 10.1021/acs.jcim.3c01841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Developing new drugs is too expensive and time-consuming. Accurately predicting the interaction between drugs and targets will likely change how drugs are discovered. Machine learning-based protein-ligand interaction prediction has demonstrated significant potential. This paper examines computational methods that use sequence and structure to study protein-ligand interactions. It starts by presenting an overview of the data sets applied in this area, as well as the various approaches applied for representing proteins and ligands. Then, sequence-based and structure-based classification criteria are utilized to categorize and summarize both the classical machine learning models and deep learning models employed in protein-ligand interaction studies. Moreover, the evaluation methods and interpretability of these models are discussed. Furthermore, the diverse applications of protein-ligand interaction models in drug research are presented. Lastly, the current challenges and future directions in this field are addressed.
Affiliation(s)
- Yunjiang Zhang
  - Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Shuyuan Li
  - Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Kong Meng
  - Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Shaorui Sun
  - Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
11. Xie K, Hou Y, Zhou X. Deep centroid: a general deep cascade classifier for biomedical omics data classification. Bioinformatics 2024; 40:btae039. [PMID: 38305432 PMCID: PMC10868341 DOI: 10.1093/bioinformatics/btae039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/13/2024] [Accepted: 01/30/2024] [Indexed: 02/03/2024] Open
Abstract
MOTIVATION Classification of samples using biomedical omics data is a widely used method in biomedical research. However, these datasets often possess challenging characteristics, including high dimensionality, limited sample sizes, and inherent biases across diverse sources. These factors limit the performance of traditional machine learning models, particularly when applied to independent datasets. RESULTS To address these challenges, we propose a novel classifier, Deep Centroid, which combines the stability of the nearest centroid classifier and the strong fitting ability of the deep cascade strategy. Deep Centroid is an ensemble learning method with a multi-layer cascade structure, consisting of feature scanning and cascade learning stages that can dynamically adjust the training scale. We apply Deep Centroid to three precision medicine applications (cancer early diagnosis, cancer prognosis, and drug sensitivity prediction) using cell-free DNA fragmentations, gene expression profiles, and DNA methylation data. Experimental results demonstrate that Deep Centroid outperforms six traditional machine learning models in all three applications, showcasing its potential in biological omics data classification. Furthermore, functional annotations reveal that the features scanned by the model exhibit biological significance, indicating its interpretability from a biological perspective. Our findings underscore the promising application of Deep Centroid in the classification of biomedical omics data, particularly in the field of precision medicine. AVAILABILITY AND IMPLEMENTATION Deep Centroid is available at both GitHub (github.com/xiexiexiekuan/DeepCentroid) and Figshare (https://figshare.com/articles/software/Deep_Centroid_A_General_Deep_Cascade_Classifier_for_Biomedical_Omics_Data_Classification/24993516).
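The nearest centroid classifier that Deep Centroid builds on can be shown in a few lines. The sketch below covers only that base learner, under the usual definition (one mean vector per class, prediction by smallest squared Euclidean distance); the feature scanning and cascade stages from the abstract are not reproduced, and the function names and toy data are hypothetical.

```python
def fit_centroids(X, y):
    """Compute one centroid (per-feature mean) for each class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Toy two-class data standing in for omics feature vectors.
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
y = ["a", "a", "b", "b"]
centroids = fit_centroids(X, y)
print(centroids, predict(centroids, [1.0, 1.0]), predict(centroids, [9.0, 9.0]))
```

Because each class is summarised by a single mean vector, the classifier has few parameters to fit, which is consistent with the stability on small, high-dimensional datasets that the abstract attributes to it.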
Affiliation(s)
- Kuan Xie
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, People’s Republic of China
- Yuying Hou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, People’s Republic of China
- Xionghui Zhou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, People’s Republic of China
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan 430070, People’s Republic of China
12. Cheng N, Bi C, Shi Y, Liu M, Cao A, Ren M, Xia J, Liang Z. Effect Predictor of Driver Synonymous Mutations Based on Multi-Feature Fusion and Iterative Feature Representation Learning. IEEE J Biomed Health Inform 2024; 28:1144-1151. [PMID: 38096097 DOI: 10.1109/jbhi.2023.3343075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Accurate identification of driver mutations is crucial in genetic studies of human cancers. While numerous cancer driver missense mutations have been identified, research into potential cancer drivers for synonymous mutations has shown limited success to date. Here, we developed a novel machine learning framework, epSMic, for predicting cancer driver synonymous mutations. epSMic employs an iterative feature representation scheme that facilitates the learning of discriminative features from various sequential models in a supervised iterative mode. We constructed the benchmark datasets and encoded the embedding sequence, physicochemical property, and basic information such as conservation and splicing feature. The evaluation results on benchmark test datasets demonstrate that epSMic outperforms existing methods, making it a valuable tool for researchers in identifying functional synonymous mutations in cancer. We hope epSMic can enable researchers to concentrate on synonymous mutations that have a functional impact on cancer.
|
13
|
Wei J, Lu L, Shen T. Predicting drug-protein interactions by preserving the graph information of multi source data. BMC Bioinformatics 2024; 25:10. [PMID: 38177981 PMCID: PMC10768380 DOI: 10.1186/s12859-023-05620-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/13/2023] [Accepted: 12/15/2023] [Indexed: 01/06/2024] Open
Abstract
Examining potential drug-target interactions (DTIs) is a pivotal component of drug discovery and repurposing. Recently, there has been a significant rise in the use of computational techniques to predict DTIs. Nevertheless, previous investigations have predominantly concentrated on assessing either the connections between nodes or the consistency of the network's topological structure in isolation. Such one-sided approaches could severely hinder the accuracy of DTI predictions. In this study, we propose a novel method called TTGCN, which combines heterogeneous graph convolutional neural networks (GCN) and graph attention networks (GAT) to address the task of DTI prediction. TTGCN employs a two-tiered feature learning strategy, utilizing GAT and residual GCN (R-GCN) to extract drug and target embeddings from the diverse network, respectively. These drug and target embeddings are then fused through a mean-pooling layer. Finally, we employ an inductive matrix completion technique to forecast DTIs while preserving the network's node connectivity and topological structure. Our approach demonstrates superior performance in terms of area under the curve and area under the precision-recall curve in experimental comparisons, highlighting its significant advantages in predicting DTIs. Furthermore, case studies provide additional evidence of its ability to identify potential DTIs.
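The fusion and scoring steps named in this abstract can be illustrated with a minimal sketch. It assumes that "mean pooling" means an element-wise average of the embeddings a node receives from the two feature-learning tiers, and that the inductive matrix completion score takes the usual low-rank bilinear form x_d^T W H^T x_t. The function names and the matrices `W` and `H` are hypothetical and do not come from the TTGCN code.

```python
def mean_pool(embeddings):
    """Element-wise average of several equal-length embedding vectors."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def project(vector, matrix):
    """Compute vector^T @ matrix for a d-vector and a d x k matrix (list of rows)."""
    k = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in range(k)]


def imc_score(drug_vec, target_vec, W, H):
    """Bilinear score x_d^T W H^T x_t (assumed form of inductive matrix completion)."""
    drug_latent = project(drug_vec, W)      # drug features mapped to k latent factors
    target_latent = project(target_vec, H)  # target features mapped to k latent factors
    return sum(a * b for a, b in zip(drug_latent, target_latent))
```

In this sketch a higher score would be read as a more likely interaction; how TTGCN trains `W` and `H` is not described in the abstract and is left out.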
Affiliation(s)
- Jiahao Wei
- School of Mathematical Sciences, Guizhou Normal University, Guiyang, 550025, China
| | - Linzhang Lu
- School of Mathematical Sciences, Guizhou Normal University, Guiyang, 550025, China.
- School of Mathematical Sciences, Xiamen University, Xiamen, 361005, China.
| | - Tie Shen
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guizhou, 550001, China.
| |
|
14
|
Qiu W, Liang Q, Yu L, Xiao X, Qiu W, Lin W. LSTM-SAGDTA: Predicting Drug-target Binding Affinity with an Attention Graph Neural Network and LSTM Approach. Curr Pharm Des 2024; 30:468-476. [PMID: 38323613 PMCID: PMC11071654 DOI: 10.2174/0113816128282837240130102817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 01/14/2024] [Accepted: 01/19/2024] [Indexed: 02/08/2024]
Abstract
INTRODUCTION Drug development is a challenging and costly process, yet it plays a crucial role in improving healthcare outcomes. Drug development requires extensive research and testing to meet the demands for economic efficiency, cures, and pain relief. METHODS Drug development is a vital research area that necessitates innovation and collaboration to achieve significant breakthroughs. Computer-aided drug design provides a promising avenue for drug discovery and development by reducing costs and improving the efficiency of drug design and testing. RESULTS In this study, a novel model, namely LSTM-SAGDTA, capable of accurately predicting drug-target binding affinity, was developed. We employed SeqVec to characterize the protein and utilized graph neural networks to capture information on drug molecules. By introducing self-attentive graph pooling, the model achieved greater accuracy and efficiency in predicting drug-target binding affinity. CONCLUSION Moreover, LSTM-SAGDTA obtained superior accuracy over current state-of-the-art methods while using less training time. The experimental results suggest that this method represents a high-precision solution for DTA prediction.
Affiliation(s)
- Wenjing Qiu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Qianle Liang
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Liyi Yu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Xuan Xiao
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Wangren Qiu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Weizhong Lin
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| |
|
15
|
Wang Y, Zhang Z, Piao C, Huang Y, Zhang Y, Zhang C, Lu YJ, Liu D. LDS-CNN: a deep learning framework for drug-target interactions prediction based on large-scale drug screening. Health Inf Sci Syst 2023; 11:42. [PMID: 37667773 PMCID: PMC10475000 DOI: 10.1007/s13755-023-00243-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 08/14/2023] [Indexed: 09/06/2023] Open
Abstract
Background Drug-target interaction (DTI) is a vital drug design strategy that plays a significant role in many processes of complex diseases and cellular events. In the face of challenges such as extensive protein data and experimental costs, it is suggested to apply bioinformatics approaches to exploit potential interactions to design new targeted medications. Different data and interaction types, which come in incompatible and heterogeneous formats, make such studies difficult. The analysis of drug-target interactions in a comprehensive and unified model is a significant challenge. Method Here, we propose a general method for predicting interactions between small-molecule drugs and protein targets, Large-scale Drug target Screening Convolutional Neural Network (LDS-CNN), which uses unified encoding to handle the different data formats in an integrated model and thereby realize feature abstraction and potential object prediction. Result On 898,412 interaction data points involving 1683 small-molecule compounds and 14,350 human proteins from 8.8 billion records, the proposed method achieved an area under the curve (AUC) of 0.96, an area under the precision-recall curve (AUPRC) of 0.95, and an accuracy of 90.13%. The experimental results illustrated that the proposed method attained high accuracy on the test set, indicating its high predictive ability in drug-target interaction prediction. LDS-CNN is effective for the prediction of large-scale datasets and datasets composed of data with different formats. Conclusion In this study, we propose a DTI prediction method to solve the problems of unified encoding of large-scale data in multiple formats. It provides a feasible way to efficiently abstract the features among different types of drug-related data, thus reducing experimental costs and time consumption. The proposed method can be used to identify potential drug targets and candidates for the treatment of complex diseases. This work provides a reference for processing large-scale DTI data in different formats with deep learning methods and offers suggestions for future research.
Affiliation(s)
- Yang Wang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
| | - Zuxian Zhang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Chenghong Piao
- The First Affiliated Hospital of Ningbo University, Ningbo, 315010 China
| | - Ying Huang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Yihan Zhang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Chi Zhang
- Shanghai Institute of Biological Products, Shanghai, 201403 China
| | - Yu-Jing Lu
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
- Smart Medical Innovation Technology Center, Guangdong University of Technology, Guangzhou, 510006 China
| | - Dongning Liu
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
| |
|
16
|
Zhu Z, Yao Z, Zheng X, Qi G, Li Y, Mazur N, Gao X, Gong Y, Cong B. Drug-target affinity prediction method based on multi-scale information interaction and graph optimization. Comput Biol Med 2023; 167:107621. [PMID: 37907030 DOI: 10.1016/j.compbiomed.2023.107621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 10/16/2023] [Accepted: 10/23/2023] [Indexed: 11/02/2023]
Abstract
Drug-target affinity (DTA) prediction as an emerging and effective method is widely applied to explore the strength of drug-target interactions in drug development research. By predicting these interactions, researchers can assess the potential efficacy and safety of candidate drugs at an early stage, narrowing down the search space for therapeutic targets and accelerating the discovery and development of new drugs. However, existing DTA prediction models mainly use graphical representations of drug molecules, which lack information on interactions between individual substructures, thus affecting prediction accuracy and model interpretability. Therefore, transformer and diffusion on drug graphs in DTA prediction (TDGraphDTA) are introduced to predict drug-target interactions using multi-scale information interaction and graph optimization. An interactive module is integrated into feature extraction of drug and target features at different granularity levels. A diffusion model-based graph optimization module is proposed to improve the representation of molecular graph structures and enhance the interpretability of graph representations while obtaining optimal feature representations. In addition, TDGraphDTA improves the accuracy and reliability of predictions by capturing relationships and contextual information between molecular substructures. The performance of the proposed TDGraphDTA in DTA prediction was verified on three publicly available benchmark datasets (Davis, Metz, and KIBA). Compared with state-of-the-art baseline models, it achieved better results in terms of consistency index, R-squared, etc. Furthermore, compared with some existing methods, the proposed TDGraphDTA is demonstrated to have better structure capturing capabilities by visualizing the feature capturing capabilities of the model using Grad-AAM toxicity labels in the ToxCast dataset. The corresponding source codes are available at https://github.com/Lamouryz/TDGraph.
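For readers unfamiliar with the metrics mentioned above, the following is a minimal sketch of two common regression metrics for affinity prediction. It reads "consistency index" as the concordance index (CI) widely reported on DTA benchmarks, and "R-squared" as the plain coefficient of determination. Both readings are assumptions, since the abstract does not define either metric, and the function names are illustrative.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinity are skipped; tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration but slow on benchmark-sized test sets.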
Affiliation(s)
- Zhiqin Zhu
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Zheng Yao
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Xin Zheng
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Guanqiu Qi
- Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA.
| | - Yuanyuan Li
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Neal Mazur
- Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA.
| | - Xinbo Gao
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Yifei Gong
- Faculty of applied science & engineering, the Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto at Toronto, ON M5S, Canada.
| | - Baisen Cong
- Diagnostics Digital, DH(Shanghai) Diagnostics Co, Ltd, a Danaher company, Shanghai, 200335, China.
| |
|
17
|
Wang S, Wang L, Li F, Bai F. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 2023; 15:103. [PMID: 37919805 PMCID: PMC10621138 DOI: 10.1186/s13321-023-00771-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 10/20/2023] [Indexed: 11/04/2023] Open
Abstract
With the continuous development of artificial intelligence technology, more and more computational models for generating new molecules are being developed. However, we are often confronted with the question of whether these compounds are easy or difficult to synthesize, that is, the synthetic accessibility of compounds. In this study, a deep-learning-based computational model called DeepSA was proposed to predict the synthesis accessibility of compounds, providing a useful tool for choosing molecules. DeepSA is a chemical language model that was developed by training on a dataset of 3,593,053 molecules using various natural language processing (NLP) algorithms, offering advantages over state-of-the-art methods and having a much higher area under the receiver operating characteristic curve (AUROC), i.e., 89.6%, in discriminating those molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time and cost required for drug discovery and development. Interestingly, a comparison of DeepSA with a Graph Attention-based method shows that using SMILES alone can also efficiently visualize and extract a compound's informative features. DeepSA is available online on our group's web server (https://bailab.siais.shanghaitech.edu.cn/services/deepsa/), and the code is available at https://github.com/Shihang-Wang-58/DeepSA.
Affiliation(s)
- Shihang Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
| | - Lin Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
| | - Fenglei Li
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
| | - Fang Bai
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China.
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China.
- Shanghai Clinical Research and Trial Center, Shanghai, 201210, China.
| |
|
18
|
Tang R, Sun C, Huang J, Li M, Wei J, Liu J. Predicting Drug-Protein Interactions by Self-Adaptively Adjusting the Topological Structure of the Heterogeneous Network. IEEE J Biomed Health Inform 2023; 27:5675-5684. [PMID: 37672364 DOI: 10.1109/jbhi.2023.3312374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Many powerful computational methods based on graph neural networks (GNNs) have been proposed to predict drug-protein interactions (DPIs). They can effectively reduce laboratory workload and the cost of drug discovery and drug repurposing. However, many clinical functions of drugs and proteins are unknown due to their unobserved indications. Therefore, it is difficult to establish a reliable drug-protein heterogeneous network that can describe the relationships between drugs and proteins based on the available information. To solve this problem, we propose a DPI prediction method that can self-adaptively adjust the topological structure of the heterogeneous network, and name it SATS. SATS establishes a representation learning module based on a graph attention network to process the drug-protein heterogeneous network. It can self-adaptively learn the relationships among the nodes based on their attributes and adjust the topological structure of the network according to the training loss of the model. Finally, SATS predicts the interaction propensity between drugs and proteins based on their embeddings. The experimental results show that SATS can effectively improve the topological structure of the network. SATS outperforms several state-of-the-art DPI prediction methods under various evaluation metrics. These results show that SATS is useful for dealing with incomplete data and unreliable networks. The case studies on the top-ranked prediction results further demonstrate that SATS is powerful for discovering novel DPIs.
|
19
|
Wang Z, Meng J, Li H, Xia S, Wang Y, Luan Y. PAMPred: A hierarchical evolutionary ensemble framework for identifying plant antimicrobial peptides. Comput Biol Med 2023; 166:107545. [PMID: 37806057 DOI: 10.1016/j.compbiomed.2023.107545] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/02/2023] [Revised: 09/04/2023] [Accepted: 09/28/2023] [Indexed: 10/10/2023]
Abstract
Antimicrobial peptides (AMPs) play a crucial role in plant immune regulation, growth and development, and have attracted significant attention in recent years. As wet-lab experiments are laborious and cost-prohibitive, it is indispensable to develop computational methods to discover novel plant AMPs accurately. In this study, we presented a hierarchical evolutionary ensemble framework, named PAMPred, which consists of a multi-level heterogeneous architecture to identify plant AMPs. Specifically, to address the existing class imbalance problem, a cluster-based resampling method was adopted to build multiple balanced subsets. Then, several peptide features, including sequence information-based and physicochemical properties-based features, were fed into different types of basic learners to increase the ensemble diversity. To boost the predictive capability of PAMPred, an improved particle swarm optimization (PSO) algorithm and a dynamic ensemble pruning strategy were used to optimize the weights at different levels adaptively. Furthermore, extensive ten-fold cross-validation and independent testing results demonstrated that PAMPred achieved excellent prediction performance and generalization ability, and outperformed the state-of-the-art methods. They also indicated that the proposed method could serve as an effective auxiliary tool to identify plant AMPs, which would be conducive to exploring the immune regulatory mechanism of plants.
Affiliation(s)
- Zhaowei Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Jun Meng
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China.
| | - Haibin Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Shihao Xia
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Yu Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Yushi Luan
- School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China
| |
|
20
|
Gong L, Jiang J, Chen S, Qi M. A syndrome differentiation model of TCM based on multi-label deep forest using biomedical text mining. Front Genet 2023; 14:1272016. [PMID: 37854059 PMCID: PMC10579813 DOI: 10.3389/fgene.2023.1272016] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/03/2023] [Accepted: 09/07/2023] [Indexed: 10/20/2023] Open
Abstract
Syndrome differentiation and treatment is the basic principle of traditional Chinese medicine (TCM) for recognizing and treating diseases. Accurate syndrome differentiation can provide a reliable basis for treatment; therefore, establishing a scientific intelligent syndrome differentiation method is of great significance to the modernization of TCM. With the development of biomedical text mining technology, TCM has entered an era of intelligence that is based on data, and model training increasingly relies on large-scale labeled data. However, it is difficult to form a large standard data set in the field of TCM due to the low degree of standardization of TCM data collection and the privacy protection of patients' medical records. To solve the above problem, a multi-label deep forest model based on an improved multi-label ReliefF feature selection algorithm, ML-PRDF, is proposed to enhance the representativeness of features within the model, express the original information with fewer features, and achieve optimal classification accuracy, while alleviating the problem of high data processing cost of deep forest models and achieving effective TCM discriminative analysis under small samples. The results show that the proposed model outperforms other multi-label classification models in terms of multi-label evaluation criteria and has higher accuracy in the TCM syndrome differentiation problem than the traditional multi-label deep forest. The comparative study also shows that using the PCC-MLRF algorithm for feature selection can better select representative features.
Affiliation(s)
- Lejun Gong
- Jiangsu Key Lab of Big Data Security and Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
- Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, Nanjing, China
| | - Jindou Jiang
- Jiangsu Key Lab of Big Data Security and Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
| | - Shiqi Chen
- Jiangsu Key Lab of Big Data Security and Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
| | - Mingming Qi
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
| |
|
21
|
Chen XG, Yang X, Li C, Lin X, Zhang W. Non-coding RNA identification with pseudo RNA sequences and feature representation learning. Comput Biol Med 2023; 165:107355. [PMID: 37639767 DOI: 10.1016/j.compbiomed.2023.107355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/16/2023] [Accepted: 08/12/2023] [Indexed: 08/31/2023]
Abstract
Distinguishing non-coding RNAs (ncRNAs) from coding RNAs is very important in bioinformatics. Although many methods have been proposed for solving this task, it remains highly challenging to further improve the accuracy of ncRNA identification. In this paper, we propose a coding potential predictor using feature representation learning based on pseudo RNA sequences named CPPFLPS. In this method, we use the pseudo RNA sequences generated by simulating RNA sequence mutations as new samples for data augmentation, and six string operations simulating RNA sequence mutations are considered: base replacement, base insertion, base deletion, subsequence reversion, subsequence repetition and subsequence deletion. In the feature representation learning framework, different types of pseudo RNA sequences are added to the training set to form new training sets that can be used to train baseline classifiers, thus obtaining baseline models. The resulting labels of these baseline models are used as feature vectors to represent RNA sequences, and the resulting feature vectors acquired after feature selection are used to train a predictive model for distinguishing ncRNAs from coding RNAs. Our method achieves better performance compared with that of existing state-of-the-art methods. The implementation of the proposed method is available at https://github.com/chenxgscuec/CPPFLPS.
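The six string operations listed in this abstract can be sketched as follows. This is an illustrative reading, not the CPPFLPS implementation: the alphabet `ACGU`, the random choice of positions, the span-length limit `max_len`, and all function names are assumptions introduced here.

```python
import random

BASES = "ACGU"  # assumed RNA alphabet


def base_replacement(seq, rng):
    """Replace one randomly chosen base with a different base."""
    i = rng.randrange(len(seq))
    new_base = rng.choice([b for b in BASES if b != seq[i]])
    return seq[:i] + new_base + seq[i + 1:]


def base_insertion(seq, rng):
    """Insert one random base at a random position."""
    i = rng.randrange(len(seq) + 1)
    return seq[:i] + rng.choice(BASES) + seq[i:]


def base_deletion(seq, rng):
    """Delete one randomly chosen base."""
    i = rng.randrange(len(seq))
    return seq[:i] + seq[i + 1:]


def _random_span(seq, rng, max_len):
    """Pick a random span [start, end) of length 2..max_len (hypothetical limits)."""
    length = rng.randint(2, min(max_len, len(seq)))
    start = rng.randrange(len(seq) - length + 1)
    return start, start + length


def subsequence_reversion(seq, rng, max_len=10):
    """Reverse a random subsequence in place."""
    s, e = _random_span(seq, rng, max_len)
    return seq[:s] + seq[s:e][::-1] + seq[e:]


def subsequence_repetition(seq, rng, max_len=10):
    """Duplicate a random subsequence immediately after itself."""
    s, e = _random_span(seq, rng, max_len)
    return seq[:e] + seq[s:e] + seq[e:]


def subsequence_deletion(seq, rng, max_len=10):
    """Remove a random subsequence."""
    s, e = _random_span(seq, rng, max_len)
    return seq[:s] + seq[e:]


OPERATIONS = {
    "base_replacement": base_replacement,
    "base_insertion": base_insertion,
    "base_deletion": base_deletion,
    "subsequence_reversion": subsequence_reversion,
    "subsequence_repetition": subsequence_repetition,
    "subsequence_deletion": subsequence_deletion,
}


def make_pseudo_sequences(seq, seed=0):
    """Return one pseudo sequence per operation for a sequence of length >= 2."""
    rng = random.Random(seed)
    return {name: op(seq, rng) for name, op in OPERATIONS.items()}
```

Calling `make_pseudo_sequences("AUGGCUACGUAGCUAGC")` yields one augmented sequence per operation. Following the abstract, each type of pseudo sequence would then be added to the training set to form a separate training set for a baseline classifier.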
Affiliation(s)
- Xian-Gan Chen
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Xiaofei Yang
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Chenhong Li
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Xianguang Lin
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.
| |
|
22
|
Meng R, Yin S, Sun J, Hu H, Zhao Q. scAAGA: Single cell data analysis framework using asymmetric autoencoder with gene attention. Comput Biol Med 2023; 165:107414. [PMID: 37660567 DOI: 10.1016/j.compbiomed.2023.107414] [Citation(s) in RCA: 54] [Impact Index Per Article: 54.0] [Received: 07/02/2023] [Revised: 08/02/2023] [Accepted: 08/28/2023] [Indexed: 09/05/2023]
Abstract
In recent years, single-cell RNA sequencing (scRNA-seq) has emerged as a powerful technique for investigating cellular heterogeneity and structure. However, analyzing scRNA-seq data remains challenging, especially in the context of COVID-19 research. Single-cell clustering is a key step in analyzing scRNA-seq data, and deep learning methods have shown great potential in this area. In this work, we propose a novel scRNA-seq analysis framework called scAAGA. Specifically, we utilize an asymmetric autoencoder with a gene attention module to learn important gene features adaptively from scRNA-seq data, with the aim of improving the clustering effect. We apply scAAGA to COVID-19 peripheral blood mononuclear cell (PBMC) scRNA-seq data and compare its performance with state-of-the-art methods. Our results consistently demonstrate that scAAGA outperforms existing methods in terms of adjusted rand index (ARI), normalized mutual information (NMI), and adjusted mutual information (AMI) scores, achieving improvements ranging from 2.8% to 27.8% in NMI scores. Additionally, we discuss a data augmentation technology to expand the datasets and improve the accuracy of scAAGA. Overall, scAAGA presents a robust tool for scRNA-seq data analysis, enhancing the accuracy and reliability of clustering results in COVID-19 research.
Affiliation(s)
- Rui Meng
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
| | - Shuaidong Yin
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
| | - Jianqiang Sun
- School of Information Science and Engineering, Linyi University, Linyi, 276000, China
| | - Huan Hu
- Institute of Applied Genomics, Fuzhou University, Fuzhou, 350108, China.
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
| |
|
23
|
Dong L, Shi S, Qu X, Luo D, Wang B. Ligand binding affinity prediction with fusion of graph neural networks and 3D structure-based complex graph. Phys Chem Chem Phys 2023; 25:24110-24120. [PMID: 37655493 DOI: 10.1039/d3cp03651k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
Accurate prediction of protein-ligand binding affinity is pivotal for drug design and discovery. Here, we proposed a novel deep fusion graph neural network framework named FGNN to learn protein-ligand interactions from the 3D structures of protein-ligand complexes. Unlike 1D sequences for proteins or 2D graphs for ligands, the 3D graph of a protein-ligand complex enables more accurate representations of the protein-ligand interactions. Benchmark studies have shown that our FGNN fusion models can achieve more accurate prediction of binding affinity than any individual algorithm. The advantages of fusion strategies have been demonstrated in terms of expressive power of data, learning efficiency and model interpretability. Our fusion models show satisfactory performance on diverse data sets, demonstrating their generalization ability. Given the good performance in both binding affinity prediction and virtual screening, our fusion models are expected to be practically applied for drug screening and design. Our work highlights the potential of the fusion graph neural network algorithm in solving complex prediction problems in computational biology and chemistry. The fusion graph neural network (FGNN) model is freely available at https://github.com/LinaDongXMU/FGNN.
Affiliation(s)
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Shuai Shi
- Department of Algorithm, TuringQ Co., Ltd., Shanghai, 200240, China
| | - Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen, 361005, China
| |
|
24
|
Peng L, Tan J, Xiong W, Zhang L, Wang Z, Yuan R, Li Z, Chen X. Deciphering ligand-receptor-mediated intercellular communication based on ensemble deep learning and the joint scoring strategy from single-cell transcriptomic data. Comput Biol Med 2023; 163:107137. [PMID: 37364528 DOI: 10.1016/j.compbiomed.2023.107137] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 02/26/2023] [Revised: 05/18/2023] [Accepted: 06/04/2023] [Indexed: 06/28/2023]
Abstract
BACKGROUND Cell-cell communication in a tumor microenvironment is vital to tumorigenesis, tumor progression and therapy. Intercellular communication inference helps understand molecular mechanisms of tumor growth, progression and metastasis. METHODS Focusing on ligand-receptor co-expressions, in this study, we developed an ensemble deep learning framework, CellComNet, to decipher ligand-receptor-mediated cell-cell communication from single-cell transcriptomic data. First, credible ligand-receptor interactions (LRIs) are captured by integrating data arrangement, feature extraction, dimension reduction, and LRI classification based on an ensemble of heterogeneous Newton boosting machine and deep neural network. Next, known and identified LRIs are screened based on single-cell RNA sequencing (scRNA-seq) data in certain tissues. Finally, cell-cell communication is inferred by combining scRNA-seq data, the screened LRIs, and a joint scoring strategy that combines expression thresholding and expression product of ligands and receptors. RESULTS The proposed CellComNet framework was compared with four competing protein-protein interaction prediction models (PIPR, XGBoost, DNNXGB, and OR-RCNN) and obtained the best AUCs and AUPRs on four LRI datasets, demonstrating the best LRI classification ability. CellComNet was further applied to analyze intercellular communication in human melanoma and head and neck squamous cell carcinoma (HNSCC) tissues. The results demonstrate that cancer-associated fibroblasts communicate strongly with melanoma cells and endothelial cells communicate strongly with HNSCC cells. CONCLUSIONS The proposed CellComNet framework efficiently identified credible LRIs and significantly improved cell-cell communication inference performance. We anticipate that CellComNet can contribute to anticancer drug design and tumor-targeted therapy.
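The joint scoring idea described in this abstract (expression thresholding combined with the ligand-receptor expression product) can be illustrated with a minimal sketch. This is not the CellComNet implementation: the function name `joint_score`, the cutoff `threshold`, the min-max normalization, and the equal weighting of the two scores are all assumptions made for illustration.

```python
import numpy as np

def joint_score(lig_expr, rec_expr, pairs, threshold=0.5):
    """Score communication from each sender cell type to each receiver cell type.

    lig_expr, rec_expr: dicts mapping gene name -> mean expression per cell type
                        (same cell-type order in both; hypothetical input layout).
    pairs: list of (ligand, receptor) gene-name tuples, e.g. screened LRIs.
    """
    n = len(next(iter(lig_expr.values())))
    thresh_score = np.zeros((n, n))
    prod_score = np.zeros((n, n))
    for lig, rec in pairs:
        l = np.asarray(lig_expr[lig], dtype=float)  # ligand level in sender types
        r = np.asarray(rec_expr[rec], dtype=float)  # receptor level in receiver types
        # Expression thresholding: count the pair only if both genes exceed the cutoff.
        thresh_score += np.outer((l > threshold).astype(float),
                                 (r > threshold).astype(float))
        # Expression product: ligand level in sender times receptor level in receiver.
        prod_score += np.outer(l, r)

    def minmax(m):
        span = m.max() - m.min()
        return (m - m.min()) / span if span > 0 else np.zeros_like(m)

    # Assumed combination rule: average of the two normalized score matrices.
    return 0.5 * (minmax(thresh_score) + minmax(prod_score))
```

Entry `[i, j]` of the result is the score for cell type `i` signalling to cell type `j`; any other monotone combination of the two matrices would fit the description in the abstract equally well.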
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China; College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
| | - Jingwei Tan
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
| | - Wei Xiong
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, Jiangsu, China
| | - Zhao Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
| | - Ruya Yuan
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
| | - Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, 421002, Hunan, China.
| | - Xing Chen
- School of Science, Jiangnan University, Wuxi, 214122, Jiangsu, China.
| |
|
25
|
Chen J, Zhang L, Cheng K, Jin B, Lu X, Che C. Predicting Drug-Target Interaction Via Self-Supervised Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2781-2789. [PMID: 35230952 DOI: 10.1109/tcbb.2022.3153963] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Recent advances in graph representation learning provide new opportunities for computational drug-target interaction (DTI) prediction. However, it still suffers from deficiencies of dependence on manual labels and vulnerability to attacks. Inspired by the success of self-supervised learning (SSL) algorithms, which can leverage input data itself as supervision, we propose SupDTI, an SSL-enhanced drug-target interaction prediction framework based on a heterogeneous network (i.e., drug-protein, drug-drug, and protein-protein interaction network; drug-disease, drug-side-effect, and protein-disease association network; drug-structure and protein-sequence similarity network). Specifically, SupDTI is an end-to-end learning framework consisting of five components. First, localized and globalized graph convolutions are designed to capture the nodes' information from both local and global perspectives, respectively. Then, we develop a variational autoencoder to constrain the nodes' representation to have desired statistical characteristics. Finally, a unified self-supervised learning strategy is leveraged to enhance the nodes' representation, namely, a contrastive learning module is employed to enable the nodes' representation to fit the graph-level representation, followed by a generative learning module which further maximizes the node-level agreement across the global and local views by learning the probabilistic connectivity distribution of the original heterogeneous network. Experimental results show that our model can achieve better prediction performance than state-of-the-art methods.
|
26
|
Qian Y, Li X, Wu J, Zhang Q. MCL-DTI: using drug multimodal information and bi-directional cross-attention learning method for predicting drug-target interaction. BMC Bioinformatics 2023; 24:323. [PMID: 37633938 PMCID: PMC10463755 DOI: 10.1186/s12859-023-05447-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2023] [Accepted: 08/15/2023] [Indexed: 08/28/2023] Open
Abstract
BACKGROUND Prediction of drug-target interaction (DTI) is an essential step for drug discovery and drug repositioning. Traditional methods are mostly time-consuming and labor-intensive, and deep learning-based methods address these limitations and are applied to engineering. Most of the current deep learning methods employ representation learning of unimodal information such as SMILES sequences, molecular graphs, or molecular images of drugs. In addition, most methods focus on feature extraction from drug and target alone without fusion learning from drug-target interacting parties, which may lead to insufficient feature representation. MOTIVATION In order to capture more comprehensive drug features, we utilize both molecular image and chemical features of drugs. The image of the drug mainly has the structural information and spatial features of the drug, while the chemical information includes its functions and properties, which can complement each other, making drug representation more effective and complete. Meanwhile, to enhance the interactive feature learning of drug and target, we introduce a bidirectional multi-head attention mechanism to improve the performance of DTI. RESULTS To enhance feature learning between drugs and targets, we propose a novel deep learning model for the DTI task, called MCL-DTI, which uses multimodal drug information and learns the representation of drug-target interaction for drug-target prediction. In order to further explore a more comprehensive representation of drug features, this paper first exploits two modalities of drug information, molecular image and chemical text, to represent the drug. We also introduce a bidirectional multi-head cross-attention (MCA) method to learn the interrelationships between drugs and targets. Thus, we build two decoders, which include a multi-head self-attention (MSA) block and an MCA block, for cross-information learning.
We use a decoder for the drug and target separately to obtain the interaction feature maps. Finally, we feed these feature maps generated by decoders into a fusion block for feature extraction and output the prediction results. CONCLUSIONS MCL-DTI achieves the best results on all three datasets (Human, C. elegans and Davis), including the balanced datasets and an unbalanced dataset. The results on the drug-drug interaction (DDI) task show that MCL-DTI has a strong generalization capability and can be easily applied to other tasks.
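The bidirectional cross-attention described in this abstract can be sketched in a few lines. The version below is a simplified illustration, not the MCL-DTI code: it uses a single head and no learned projection matrices (both assumptions made for brevity), and the names `cross_attention` and `bidirectional_cross_attention` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Single-head scaled dot-product attention: each query row attends to context rows."""
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d))  # shape (n_queries, n_context)
    return weights @ context                              # shape (n_queries, d)

def bidirectional_cross_attention(drug, target):
    """Drug tokens attend to target tokens, and target tokens attend to drug tokens."""
    drug_out = cross_attention(drug, target)    # drug features informed by the target
    target_out = cross_attention(target, drug)  # target features informed by the drug
    return drug_out, target_out
```

In a multi-head variant, the feature dimension would be split across heads and each direction would have its own learned query, key and value projections; the two outputs would then be passed to a fusion step such as the one the abstract mentions.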
Affiliation(s)
- Ying Qian
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| | - Xinyi Li
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| | - Jian Wu
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| | - Qian Zhang
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| |
|
27
|
Zhang Y, Feng Y, Wu M, Deng Z, Wang S. VGAEDTI: drug-target interaction prediction based on variational inference and graph autoencoder. BMC Bioinformatics 2023; 24:278. [PMID: 37415176 DOI: 10.1186/s12859-023-05387-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2023] [Accepted: 06/16/2023] [Indexed: 07/08/2023] Open
Abstract
MOTIVATION Accurate identification of Drug-Target Interactions (DTIs) plays a crucial role in many stages of drug development and drug repurposing. (i) Traditional methods do not consider the use of multi-source data and do not consider the complex relationship between data sources. (ii) How to better mine the hidden features of the drug and target spaces from high-dimensional data, and how to improve the accuracy and robustness of the model, remain open problems. RESULTS To solve the above problems, a novel prediction model named VGAEDTI is proposed in this paper. We constructed a heterogeneous network with multiple sources of information using multiple types of drug and target data. In order to obtain deeper features of drugs and targets, we use two different autoencoders. One is a variational graph autoencoder (VGAE), which is used to infer feature representations from drug and target spaces. The second is a graph autoencoder (GAE), which propagates labels between known DTIs. Experimental results on two public datasets show that the prediction accuracy of VGAEDTI is better than that of six DTI prediction methods. These results indicate that the model can predict new DTIs and provide an effective tool for accelerating drug development and repurposing.
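For orientation, the commonly used variational graph autoencoder formulation is written out below. It is given as background on what a VGAE optimizes, under the assumption of a Gaussian encoder and an inner-product decoder; the abstract does not state that VGAEDTI uses exactly this objective. Here $X$ is the node feature matrix, $A$ the adjacency matrix, $z_i$ the latent vector of node $i$, and $\sigma(\cdot)$ the logistic sigmoid.

```latex
q(Z \mid X, A) = \prod_{i=1}^{N} \mathcal{N}\!\left(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2)\right),
\qquad
p(A_{ij} = 1 \mid z_i, z_j) = \sigma\!\left(z_i^{\top} z_j\right),

\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\!\left[\log p(A \mid Z)\right]
            - \mathrm{KL}\!\left[\, q(Z \mid X, A) \,\|\, p(Z) \,\right],
\qquad
p(Z) = \prod_{i=1}^{N} \mathcal{N}(z_i \mid 0, I).
```

The first term rewards reconstructing the observed edges, and the KL term keeps the latent node representations close to the standard normal prior.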
Affiliation(s)
- Yuanyuan Zhang
- Qingdao University of Technology, Qingdao, China
| | - Yinfei Feng
- Qingdao University of Technology, Qingdao, China.
| | - Mengjie Wu
- Qingdao University of Technology, Qingdao, China
| | - Zengqian Deng
- Qingdao University of Technology, Qingdao, China
| | - Shudong Wang
- School of Computer Science and Technology, China University of Petroleum, Qingdao, China
| |
|
28
|
Hu X, Yin Z, Zeng Z, Peng Y. Prediction of miRNA-Disease Associations by Cascade Forest Model Based on Stacked Autoencoder. Molecules 2023; 28:5013. [PMID: 37446675 DOI: 10.3390/molecules28135013] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2023] [Revised: 06/23/2023] [Accepted: 06/24/2023] [Indexed: 07/15/2023] Open
Abstract
Numerous pieces of evidence have indicated that microRNA (miRNA) plays a crucial role in a series of significant biological processes and is closely related to complex disease. However, the traditional biological experimental methods used to verify disease-related miRNAs are inefficient and expensive. Thus, it is necessary to design effective approaches to improve efficiency. In this work, a novel method (CFSAEMDA) is proposed for the prediction of unknown miRNA-disease associations (MDAs). Specifically, we first capture the interactive features of miRNA and disease by integrating multi-source information. Then, the stacked autoencoder is applied to obtain the underlying feature representation. Finally, the modified cascade forest model is employed to complete the final prediction. The experimental results show that the AUC value obtained by our method is 97.67%. The performance of CFSAEMDA is superior to several of the latest methods. In addition, case studies conducted on lung neoplasms, breast neoplasms and hepatocellular carcinoma further show that the CFSAEMDA method may be regarded as a useful approach to infer unknown disease-miRNA relationships.
Affiliation(s)
- Xiang Hu
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Zhixiang Yin
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Zhiliang Zeng
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Yu Peng
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| |
|
29
|
Zhou L, Wang Y, Peng L, Li Z, Luo X. Identifying potential drug-target interactions based on ensemble deep learning. Front Aging Neurosci 2023; 15:1176400. [PMID: 37396659 PMCID: PMC10309650 DOI: 10.3389/fnagi.2023.1176400] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Accepted: 05/10/2023] [Indexed: 07/04/2023] Open
Abstract
Introduction Drug-target interaction prediction is one important step in drug research and development. Experimental methods are time consuming and laborious. Methods In this study, we developed a novel DTI prediction method called EnGDD by combining initial feature acquisition, dimensional reduction, and DTI classification based on Gradient boosting neural network, Deep neural network, and Deep Forest. Results EnGDD was compared with seven state-of-the-art DTI prediction methods (BLM-NII, NRLMF, WNNGIP, NEDTP, DTi2Vec, RoFDT, and MolTrans) on the nuclear receptor, GPCR, ion channel, and enzyme datasets under cross validations on drugs, targets, and drug-target pairs, respectively. EnGDD achieved the best recall, accuracy, F1-score, AUC, and AUPR under the majority of conditions, demonstrating its powerful DTI identification performance. EnGDD predicted that D00182 and hsa2099, D07871 and hsa1813, DB00599 and hsa2562, D00002 and hsa10935 have higher interaction probabilities among unknown drug-target pairs and may be potential DTIs on the four datasets, respectively. In particular, D00002 (Nadide) was identified to interact with hsa10935 (Mitochondrial peroxiredoxin3) whose up-regulation might be used to treat neurodegenerative diseases. Finally, EnGDD was used to find possible drug targets for Parkinson's disease and Alzheimer's disease after confirming its DTI identification performance. The results show that D01277, D04641, and D08969 may be applied to the treatment of Parkinson's disease through targeting hsa1813 (dopamine receptor D2) and D02173, D02558, and D03822 may be the clues of treatment for patients with Alzheimer's disease through targeting hsa5743 (prostaglandin-endoperoxide synthase 2). The above prediction results need further biomedical validation. Discussion We anticipate that our proposed EnGDD model can help discover potential therapeutic clues for various diseases including neurodegenerative diseases.
Affiliation(s)
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Yuzhuang Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
| | - Xueming Luo
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| |
|
30
|
Bao X, Sun J, Yi M, Qiu J, Chen X, Shuai SC, Zhao Q. MPFFPSDC: A multi-pooling feature fusion model for predicting synergistic drug combinations. Methods 2023:S1046-2023(23)00098-1. [PMID: 37321525 DOI: 10.1016/j.ymeth.2023.06.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2023] [Revised: 06/11/2023] [Accepted: 06/12/2023] [Indexed: 06/17/2023] Open
Abstract
Drug combination therapies are common practice in the treatment of cancer, but not all combinations result in synergy. As traditional screening approaches are restricted in their ability to uncover synergistic drug combinations, computer-aided medicine is becoming increasingly prevalent in this field. In this work, a predictive model of potential interactions between drugs named MPFFPSDC is presented, which can maintain the symmetry of drug inputs and eliminate inconsistencies in predictive results caused by different drug inputting sequences or positions. The experimental results show that MPFFPSDC outperforms comparative models in major performance indicators and exhibits better generalization for independent data. Furthermore, the case study demonstrates that our model can capture molecular substructures that contribute to the synergistic effect of two drugs. These results indicate that MPFFPSDC not only offers strong predictive performance, but also has good model interpretability that may provide new insights for the study of drug interaction mechanisms and the development of new drugs.
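The property highlighted in this abstract, that the prediction should not depend on which drug is given first, can be obtained by fusing the two drug embeddings with order-independent pooling operations. The sketch below illustrates that general idea only; the function name `symmetric_pair_features` and the particular choice of sum, max and product pooling are assumptions, not the MPFFPSDC architecture.

```python
import numpy as np

def symmetric_pair_features(drug_a, drug_b):
    """Fuse two drug embeddings so that swapping the inputs gives the same vector."""
    a = np.asarray(drug_a, dtype=float)
    b = np.asarray(drug_b, dtype=float)
    # Sum, element-wise max and element-wise product are all commutative,
    # so the concatenated result is identical for (a, b) and (b, a).
    return np.concatenate([a + b, np.maximum(a, b), a * b])
```

Any downstream classifier that consumes this vector inherits the symmetry, so the pair (A, B) and the pair (B, A) receive the same synergy prediction without needing to augment the training data with swapped pairs.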
Affiliation(s)
- Xin Bao
- School of Automation and Electrical Engineering, Linyi University, Linyi 276000, China
| | - Jianqiang Sun
- School of Automation and Electrical Engineering, Linyi University, Linyi 276000, China.
| | - Ming Yi
- School of Mathematics and Physics, China University of Geosciences, Wuhan 430000, China
| | - Jianlong Qiu
- School of Automation and Electrical Engineering, Linyi University, Linyi 276000, China
| | - Xiangyong Chen
- School of Automation and Electrical Engineering, Linyi University, Linyi 276000, China
| | - Stella C Shuai
- Biological Science, Northwestern University, Evanston, IL 60208, USA
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China.
| |
|
31
|
Yuan Y, Zhang Y, Meng X, Liu Z, Wang B, Miao R, Zhang R, Su W, Liu L. EDC-DTI: An end-to-end deep collaborative learning model based on multiple information for drug-target interactions prediction. J Mol Graph Model 2023; 122:108498. [PMID: 37126908 DOI: 10.1016/j.jmgm.2023.108498] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2023] [Revised: 04/10/2023] [Accepted: 04/17/2023] [Indexed: 05/03/2023]
Abstract
Innovations in drug-target interactions (DTIs) prediction accelerate the progression of drug development. The introduction of deep learning models has a dramatic impact on DTIs prediction, with a distinct influence on saving time and money in drug discovery. This study develops an end-to-end deep collaborative learning model for DTIs prediction, called EDC-DTI, to identify new targets for existing drugs based on multiple drug-target-related information including homogeneous information and heterogeneous information by means of deep learning. Our end-to-end model is composed of a feature builder and a classifier. The feature builder consists of two collaborative feature construction algorithms that extract the molecular properties and the topology property of networks, and the classifier consists of a feature encoder and a feature decoder which are designed for feature integration and DTIs prediction, respectively. The feature encoder, mainly based on the improved graph attention network, incorporates heterogeneous information into drug features and target features separately. The feature decoder is composed of multiple neural networks for predictions. Compared with six popular baseline models, EDC-DTI achieves the highest predictive performance at low computational cost. Robustness tests demonstrate that EDC-DTI is able to maintain strong predictive performance on sparse datasets. In addition, we use the model to predict the most likely targets to interact with Simvastatin (DB00641), Nifedipine (DB01115) and Afatinib (DB08916) as examples. Results show that most of the predictions can be confirmed by literature with clear evidence.
Affiliation(s)
- Yongna Yuan
- School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China.
| | - Yuhao Zhang
- School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
| | - Xiangbo Meng
- School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
| | - Zhenyu Liu
- School of Cyberspace Security, Gansu University of Political Science and Law, Anning West Road, Lanzhou, 730070, Gansu, China
| | - Bohan Wang
- School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
| | - Ruidong Miao
- School of Life Science, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
| | - Ruisheng Zhang
- School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
| | - Wei Su
- School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
| | - Lei Liu
- Duzhe Publishing Group Co. Ltd., DuZhe Road, Lanzhou, 730000, Gansu, China
| |
|
32
|
Yang Z, Zhong W, Lv Q, Dong T, Yu-Chian Chen C. Geometric Interaction Graph Neural Network for Predicting Protein-Ligand Binding Affinities from 3D Structures (GIGN). J Phys Chem Lett 2023; 14:2020-2033. [PMID: 36794930 DOI: 10.1021/acs.jpclett.2c03906] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]
Abstract
Predicting protein-ligand binding affinities (PLAs) is a core problem in drug discovery. Recent advances have shown great potential in applying machine learning (ML) for PLA prediction. However, most of them omit the 3D structures of complexes and physical interactions between proteins and ligands, which are considered essential to understanding the binding mechanism. This paper proposes a geometric interaction graph neural network (GIGN) that incorporates 3D structures and physical interactions for predicting protein-ligand binding affinities. Specifically, we design a heterogeneous interaction layer that unifies covalent and noncovalent interactions into the message passing phase to learn node representations more effectively. The heterogeneous interaction layer also follows fundamental biological laws, including invariance to translations and rotations of the complexes, thus avoiding expensive data augmentation strategies. GIGN achieves state-of-the-art performance on three external test sets. Moreover, by visualizing learned representations of protein-ligand complexes, we show that the predictions of GIGN are biologically meaningful.
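The invariance to translations and rotations mentioned in this abstract can be checked numerically for one simple kind of geometric feature, interatomic distances. The sketch below is an illustration of the invariance property only and does not reproduce GIGN's interaction layer; the helper names `pairwise_distances` and `random_rigid_transform` are hypothetical.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of atoms; coords has shape (n_atoms, 3)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rigid_transform(coords, rng):
    """Apply a random rotation followed by a random translation to the coordinates."""
    # The QR decomposition of a random matrix yields an orthogonal matrix q.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1  # flip one column so that q is a proper rotation (det = +1)
    shift = rng.normal(size=3)
    return coords @ q.T + shift
```

Because the distance matrix is unchanged by any rigid motion of the complex, a model whose messages depend on coordinates only through such distances gives the same prediction for every pose, which is why no rotation or translation augmentation is needed.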
Affiliation(s)
- Ziduo Yang
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Weihe Zhong
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Qiujie Lv
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Tiejun Dong
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Calvin Yu-Chian Chen
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
- Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
| |
|
33
|
Charoenkwan P, Chumnanpuen P, Schaduangrat N, Oh C, Manavalan B, Shoombuatong W. PSRQSP: An effective approach for the interpretable prediction of quorum sensing peptide using propensity score representation learning. Comput Biol Med 2023; 158:106784. [PMID: 36989748 DOI: 10.1016/j.compbiomed.2023.106784] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2022] [Revised: 02/07/2023] [Accepted: 03/10/2023] [Indexed: 03/14/2023]
Abstract
Quorum sensing peptides (QSPs) are microbial signaling molecules involved in several cellular processes, such as cellular communication, virulence expression, bioluminescence, and swarming, in various bacterial species. Understanding QSPs is essential for identifying novel drug targets for controlling bacterial populations and pathogenicity. In this study, we present a novel computational approach (PSRQSP) for improving the prediction and analysis of QSPs. In PSRQSP, we develop a novel propensity score representation learning (PSR) scheme. Specifically, we utilized the PSR approach to extract and learn a comprehensive set of estimated propensities of 20 amino acids, 400 dipeptides, and 400 g-gap dipeptides from a pool of scoring card method-based models. Finally, to maximize the utility of the propensity scores, we explored a set of optimal propensity scores and combined them to construct a final meta-predictor. Our experimental results showed that combining multiview propensity scores was more beneficial for identifying QSPs than the conventional feature descriptors. Moreover, extensive benchmarking experiments based on the independent test were sufficient to demonstrate the predictive capability and effectiveness of PSRQSP by outperforming the conventional ML-based and existing methods, with an accuracy of 94.44% and AUC of 0.967. PSR-derived propensity scores were employed to determine the crucial physicochemical properties for a better understanding of the functional mechanisms of QSPs. Finally, we constructed an easy-to-use web server for the PSRQSP (http://pmlabstack.pythonanywhere.com/PSRQSP). PSRQSP is anticipated to be an efficient computational tool for accelerating the data-driven discovery of potential QSPs for drug discovery and development.
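The feature spaces named in this abstract (400 dipeptides and 400 g-gap dipeptides over the 20 standard amino acids) can be made concrete with a short sketch. It shows how a peptide could be scored from a table of propensities as a composition-weighted sum; the function names and the weighting rule are assumptions for illustration and are not taken from the PSRQSP code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 20 * 20 = 400

def gap_dipeptide_composition(seq, gap=0):
    """Frequency of residue pairs separated by `gap` positions (gap=0 means adjacent)."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return {k: (v / total if total else 0.0) for k, v in counts.items()}

def propensity_score(seq, propensities, gap=0):
    """Composition-weighted sum of per-dipeptide propensities (missing entries count as 0)."""
    comp = gap_dipeptide_composition(seq, gap)
    return sum(comp[k] * propensities.get(k, 0.0) for k in comp)
```

For example, `gap_dipeptide_composition("ACAC")` assigns 2/3 to `AC` and 1/3 to `CA`, while `gap=1` pairs each residue with the one two positions later. A peptide would then be called a QSP when its score exceeds a chosen cutoff, with the propensity table itself learned from labeled peptides.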
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Pramote Chumnanpuen
- Department of Zoology, Faculty of Science, Kasetsart University, Bangkok, 10900, Thailand; Omics Center for Agriculture, Bioresources, Food, and Health, Kasetsart University (OmiKU), Bangkok, 10900, Thailand
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Changmin Oh
- Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea
| | - Balachandran Manavalan
- Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea.
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
|
34
|
Yang X, Niu Z, Liu Y, Song B, Lu W, Zeng L, Zeng X. Modality-DTA: Multimodality Fusion Strategy for Drug-Target Affinity Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1200-1210. [PMID: 36083952 DOI: 10.1109/tcbb.2022.3205282] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Prediction of the drug-target affinity (DTA) plays an important role in drug discovery. Existing deep learning methods for DTA prediction typically leverage a single modality, namely simplified molecular input line entry specification (SMILES) or amino acid sequence to learn representations. SMILES or amino acid sequences can be encoded into different modalities. Multimodality data provide different kinds of information, with complementary roles for DTA prediction. We propose Modality-DTA, a novel deep learning method for DTA prediction that leverages the multimodality of drugs and targets. A group of backward propagation neural networks is applied to ensure the completeness of the reconstruction process from the latent feature representation to original multimodality data. The tag between the drug and target is used to reduce the noise information in the latent representation from multimodality data. Experiments on three benchmark datasets show that our Modality-DTA outperforms existing methods in all metrics. Modality-DTA reduces the mean square error by 15.7% and improves the area under the precision-recall curve by 12.74% in the Davis dataset. We further find that the drug modality Morgan fingerprint and the target modality generated by one-hot-encoding play the most significant roles. To the best of our knowledge, Modality-DTA is the first method to explore multimodality for DTA prediction.
|
35
|
DeepmRNALoc: A Novel Predictor of Eukaryotic mRNA Subcellular Localization Based on Deep Learning. Molecules 2023; 28:molecules28052284. [PMID: 36903531 PMCID: PMC10005629 DOI: 10.3390/molecules28052284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 02/02/2023] [Accepted: 02/10/2023] [Indexed: 03/06/2023] Open
Abstract
The subcellular localization of messenger RNA (mRNA) precisely controls where protein products are synthesized and where they function. However, obtaining an mRNA's subcellular localization through wet-lab experiments is time-consuming and expensive, and many existing mRNA subcellular localization prediction algorithms need to be improved. In this study, a deep neural network-based eukaryotic mRNA subcellular location prediction method, DeepmRNALoc, was proposed, utilizing a two-stage feature extraction strategy that featured bimodal information splitting and fusing for the first stage and a VGGNet-like CNN module for the second stage. The five-fold cross-validation accuracies of DeepmRNALoc in the cytoplasm, endoplasmic reticulum, extracellular region, mitochondria, and nucleus were 0.895, 0.594, 0.308, 0.944, and 0.865, respectively, demonstrating that it outperforms existing models and techniques.
|
36
|
Wang T, Sun J, Zhao Q. Investigating cardiotoxicity related with hERG channel blockers using molecular fingerprints and graph attention mechanism. Comput Biol Med 2023; 153:106464. [PMID: 36584603 DOI: 10.1016/j.compbiomed.2022.106464] [Citation(s) in RCA: 112] [Impact Index Per Article: 112.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2022] [Revised: 12/12/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
Human ether-a-go-go-related gene (hERG) channel blockade by small molecules is a major concern during drug development in the pharmaceutical industry. Failure or inhibition of hERG channel activity caused by drug molecules can prolong the QT interval, which can result in serious cardiotoxicity. However, evaluating the hERG blocking activity of all these small molecular compounds is technically challenging, and the relevant procedures are expensive and time-consuming. In this study, we develop a novel deep learning predictive model named DMFGAM for predicting hERG blockers. In order to characterize the molecule more comprehensively, we first fuse multiple molecular fingerprint features into its final molecular fingerprint features. Then, we use the multi-head attention mechanism to extract the molecular graph features. Both molecular fingerprint features and molecular graph features are fused as the final features of the compounds to make the feature expression of compounds more comprehensive. Finally, the molecules are classified into hERG blockers or hERG non-blockers through the fully connected neural network. We conduct a 5-fold cross-validation experiment to evaluate the performance of DMFGAM, and verify the robustness of DMFGAM on external validation datasets. We believe DMFGAM can serve as a powerful tool to predict hERG channel blockers in the early stages of drug discovery and development.
Affiliation(s)
- Tianyi Wang: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianqiang Sun: School of Automation and Electrical Engineering, Linyi University, Linyi, 276000, China
- Qi Zhao: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
|
37
|
Non-coding RNAs as key players in the neurodegenerative diseases: Multi-platform strategies and approaches for exploring the Genome's dark matter. J Chem Neuroanat 2023; 129:102236. [PMID: 36709005 DOI: 10.1016/j.jchemneu.2023.102236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 01/21/2023] [Accepted: 01/24/2023] [Indexed: 01/26/2023]
Abstract
A growing amount of evidence in the last few years has begun to reveal that non-coding RNAs have a myriad of functions in gene regulation. Intensive investigation of non-coding RNAs (ncRNAs) has led to exploring their broad role in neurodegenerative diseases (NDs) owing to their regulatory role in gene expression. RNA sequencing technologies and transcriptome analysis have unveiled significant dysregulation of ncRNAs attributed to their biogenesis, upregulation, downregulation, aberrant epigenetic regulation, and abnormal transcription. Despite these advances, the understanding of their potential as therapeutic targets and biomarkers, and of the underlying mechanisms, is still limited. Advancements in bioinformatics and molecular technologies have improved our knowledge of the dark matter of the genome in terms of recognition and functional validation. This review aims to shed light on ncRNA biogenesis, function, and potential role in NDs. Their role is examined further through a focus on the most recent platforms, experimental approaches, and computational analysis to investigate ncRNAs. Furthermore, this review summarizes and evaluates well-studied miRNAs, lncRNAs and circRNAs concerning their potential role in pathogenesis and use as biomarkers in NDs. Finally, a perspective on the main challenges and novel methods for the future and broad therapeutic use of ncRNAs is offered.
|
38
|
McNair D. Artificial Intelligence and Machine Learning for Lead-to-Candidate Decision-Making and Beyond. Annu Rev Pharmacol Toxicol 2023; 63:77-97. [PMID: 35679624 DOI: 10.1146/annurev-pharmtox-051921-023255] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 01/25/2023]
Abstract
The use of artificial intelligence (AI) and machine learning (ML) in pharmaceutical research and development has to date focused on research: target identification; docking-, fragment-, and motif-based generation of compound libraries; modeling of synthesis feasibility; rank-ordering likely hits according to structural and chemometric similarity to compounds having known activity and affinity to the target(s); optimizing a smaller library for synthesis and high-throughput screening; and combining evidence from screening to support hit-to-lead decisions. Applying AI/ML methods to lead optimization and lead-to-candidate (L2C) decision-making has shown slower progress, especially regarding predicting absorption, distribution, metabolism, excretion, and toxicology properties. The present review surveys reasons why this is so, reports progress that has occurred in recent years, and summarizes some of the issues that remain. Effective AI/ML tools to derisk L2C and later phases of development are important to accelerate the pharmaceutical development process, ameliorate escalating development costs, and achieve greater success rates.
Affiliation(s)
- Douglas McNair: Global Health, Integrated Development, Bill & Melinda Gates Foundation, Seattle, Washington, USA
|
39
|
Peng Y, Zhao S, Zeng Z, Hu X, Yin Z. LGBMDF: A cascade forest framework with LightGBM for predicting drug-target interactions. Front Microbiol 2023; 13:1092467. [PMID: 36687573 PMCID: PMC9849804 DOI: 10.3389/fmicb.2022.1092467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 12/07/2022] [Indexed: 01/07/2023] Open
Abstract
Prediction of drug-target interactions (DTIs) plays an important role in drug development. However, traditional laboratory methods to determine DTIs require a lot of time and capital. In recent years, many studies have shown that using machine learning methods to predict DTIs can speed up the drug development process and reduce capital costs. An excellent DTI prediction method should have both high prediction accuracy and low computational cost. In this study, we noticed that previous research based on deep forests used XGBoost as the estimator in the cascade. We applied LightGBM instead of XGBoost as the estimator in the cascade forest, and the estimator group was determined experimentally as three LightGBMs and three ExtraTrees. This new model is called LGBMDF. We conducted 5-fold cross-validation on LGBMDF and other state-of-the-art methods using the same dataset, and compared their sensitivity (Sn), specificity (Sp), Matthews correlation coefficient (MCC), AUC and AUPR. Finally, we found that our method has better performance and faster calculation speed.
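Three of the metrics named in this abstract (Sn, Sp and MCC, read here as sensitivity, specificity and Matthews correlation coefficient) follow directly from the confusion matrix of a binary classifier. The sketch below uses the standard textbook definitions; the function names and the toy labels are hypothetical and are not taken from the LGBMDF code or data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = interaction)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Return (Sn, Sp, MCC); each falls back to 0.0 when its denominator is zero."""
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, mcc


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(*confusion_counts(y_true, y_pred)))
```

In a 5-fold cross-validation setting these values would typically be computed per fold and then averaged; AUC and AUPR additionally require predicted scores rather than hard labels.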
|
40
|
Huang D, He H, Ouyang J, Zhao C, Dong X, Xie J. Small molecule drug and biotech drug interaction prediction based on multi-modal representation learning. BMC Bioinformatics 2022; 23:561. [PMID: 36575376 PMCID: PMC9793529 DOI: 10.1186/s12859-022-05101-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/25/2022] [Accepted: 12/06/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Drug-drug interactions (DDIs) occur when two or more drugs are taken simultaneously or successively. Early detection of adverse drug interactions can be essential in preventing medical errors and reducing healthcare costs. Many computational methods already predict interactions between small molecule drugs (SMDs). As the number of biotechnology drugs (BioDs) increases, so does the threat of interactions between SMDs and BioDs. However, few computational methods are available to predict their interactions. RESULTS Considering the structural specificity and relational complexity of SMDs and BioDs, a novel multi-modal representation learning method called Multi-SBI is proposed to predict their interactions. First, multi-modal features are used to adequately represent the heterogeneous structure and complex relationships of SMDs and BioDs. Second, an undersampling method based on positive-unlabeled learning (PU-sampling) is introduced to obtain negative samples with high confidence from the unlabeled data set. Finally, the learned representations of both SMD and BioD are fed into DNN classifiers to predict their interaction events. In addition, we conduct a retrospective analysis. CONCLUSIONS Our proposed multi-modal representation learning method can extract drug features more comprehensively in heterogeneous drugs. In addition, PU-sampling can effectively reduce the noise in the sampling procedure. Our proposed method significantly outperforms other state-of-the-art drug interaction prediction methods. In a retrospective analysis of DrugBank 5.1.0, 14 out of the 20 predictions with the highest confidence were validated in the latest version of DrugBank 5.1.8, demonstrating that Multi-SBI is a valuable tool for predicting new drug interactions through effectively extracting and learning heterogeneous drug features.
Affiliation(s)
- Dingkai Huang: School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Hongjian He: School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Jiaming Ouyang: School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Chang Zhao: School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Xin Dong: School of Medicine, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Jiang Xie: School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
|
41
|
Qi G, Xu Z, Dan H, Jia X, Jiang Q, Zhang A, Li Z, Liu X, Ma J, Zheng X, Li Z. A Complex Heterogeneous Network Model of Disease Regulated by Noncoding RNAs: A Case Study of Unstable Angina Pectoris. Comput Intell Neurosci 2022; 2022:5852089. [PMID: 36590836 PMCID: PMC9803582 DOI: 10.1155/2022/5852089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 11/27/2022] [Accepted: 12/02/2022] [Indexed: 12/24/2022]
Abstract
MicroRNAs (miRNAs) are an important type of noncoding RNA, and there is a lack of holistic and systematic understanding of the functions they play in disease. We proposed a research strategy comprising two parts, network analysis and network modelling, to analyze, model, and predict the regulatory network of miRNAs from a network perspective, using unstable angina pectoris as an example. In the network analysis section, we proposed the WGCNA & SimCluster method, which uses both correlation and similarity to find hub miRNAs, and validation on two datasets showed better results than methods using correlation or similarity alone. In the network modelling section, we used six knowledge graph or graph neural network models for link prediction of three types of edges and multilabel classification of two types of nodes. Comparative experiments showed that the RotatE model was a good model for link prediction, while the RGCN model was the best model for multilabel classification. Potential target genes were predicted for hub miRNAs, and validation of hub miRNA-target gene interactions, target genes as biomarkers and target gene functions was performed using a three-step validation approach. In conclusion, our study provides a new strategy to analyze and model miRNA regulatory networks.
Affiliation(s)
- Guanpeng Qi: School of Pharmacy, Shenyang Pharmaceutical University, Shenyang 110016, China
- Ze Xu: School of Pharmacy, Shenyang Pharmaceutical University, Shenyang 110016, China
- Hanyu Dan: School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, China
- Xiangnan Jia: School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, China
- Qiang Jiang: School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, China
- Aijun Zhang: School of Pharmacy, Shenyang Pharmaceutical University, Shenyang 110016, China
- Zhaohang Li: School of Pharmacy, Shenyang Pharmaceutical University, Shenyang 110016, China
- Xin Liu: School of Life Sciences and Biopharmaceuticals, Shenyang Pharmaceutical University, Shenyang 110016, China
- Juman Ma: School of Life Sciences and Biopharmaceuticals, Shenyang Pharmaceutical University, Shenyang 110016, China
- Xiaosong Zheng: School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, China
- Zuojing Li: School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, China
|
42
|
Zhong W, He C, Xiao C, Liu Y, Qin X, Yu Z. Long-distance dependency combined multi-hop graph neural networks for protein-protein interactions prediction. BMC Bioinformatics 2022; 23:521. [PMID: 36471248 PMCID: PMC9724439 DOI: 10.1186/s12859-022-05062-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Accepted: 11/16/2022] [Indexed: 12/10/2022] Open
Abstract
BACKGROUND Protein-protein interactions are widespread in biological systems and play an important role in cell biology. Since traditional laboratory-based methods have drawbacks, such as being time-consuming and costly, a large number of methods based on deep learning have emerged. However, these methods do not take into account the long-distance dependency information between every two amino acids in the sequence. In addition, most existing models based on graph neural networks only aggregate the first-order neighbors in the protein-protein interaction (PPI) network. Although multi-order neighbor information can be aggregated by increasing the number of layers of the neural network, this easily causes over-fitting. It is therefore necessary to design a network that can capture long-distance dependency information between amino acids in the sequence and can directly capture multi-order neighbor information in the protein-protein interaction network. RESULTS In this study, we propose a multi-hop neural network (LDMGNN) model combining long-distance dependency information to predict multi-label protein-protein interactions. In the LDMGNN model, we design the protein amino acid sequence encoding (PAASE) module with the multi-head self-attention Transformer block to extract the features of amino acid sequences by calculating the interdependence between every two amino acids. We also expand the receptive field in space by constructing a two-hop protein-protein interaction (THPPI) network. We combine the PPI network and the THPPI network with amino acid sequence features respectively, then input them into two identical GIN blocks at the same time to obtain two embeddings. Next, the two embeddings are fused and input to the classifier to predict multi-label protein-protein interactions. Compared with other state-of-the-art methods, LDMGNN shows the best performance on both the SHS27K and SHS148k datasets. Ablation experiments show that the PAASE module and the construction of the THPPI network are feasible and effective. CONCLUSIONS In general terms, our proposed LDMGNN model has achieved satisfactory results in the prediction of multi-label protein-protein interactions.
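The idea of a two-hop interaction network described in this abstract can be sketched as follows. The abstract does not define exactly how the THPPI network is built, so this sketch assumes an undirected PPI graph and links two proteins when they share at least one neighbor but are not directly connected; the function name `two_hop_network` and the toy edge list are hypothetical.

```python
def two_hop_network(edges):
    """Return sorted edges between nodes that are two hops apart in an undirected graph.

    Assumption: pairs that are already direct neighbors are excluded, and
    self-loops are never produced.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    two_hop = set()
    for node, nbrs in neighbors.items():
        for mid in nbrs:
            for far in neighbors[mid]:
                if far != node and far not in nbrs:
                    # Store each undirected pair once, in a canonical order.
                    two_hop.add(tuple(sorted((node, far))))
    return sorted(two_hop)


if __name__ == "__main__":
    # Toy PPI edges for illustration only.
    ppi_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
    print(two_hop_network(ppi_edges))
```

Building the second-order graph explicitly lets a single message-passing layer reach two-hop neighbors, which matches the abstract's stated motivation of avoiding deeper networks that tend to over-fit.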
Affiliation(s)
- Wen Zhong: College of Science, University of Shanghai for Science and Technology, Jungong Road, Shanghai, 200093 China (GRID: grid.267139.8; ISNI: 0000 0000 9188 055X)
- Changxiang He: College of Science, University of Shanghai for Science and Technology, Jungong Road, Shanghai, 200093 China (GRID: grid.267139.8; ISNI: 0000 0000 9188 055X)
- Chen Xiao: College of Science, University of Shanghai for Science and Technology, Jungong Road, Shanghai, 200093 China (GRID: grid.267139.8; ISNI: 0000 0000 9188 055X)
- Yuru Liu: College of Science, University of Shanghai for Science and Technology, Jungong Road, Shanghai, 200093 China (GRID: grid.267139.8; ISNI: 0000 0000 9188 055X)
- Xiaofei Qin: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Jungong Road, Shanghai, 200093 China (GRID: grid.267139.8; ISNI: 0000 0000 9188 055X)
- Zhensheng Yu: College of Science, University of Shanghai for Science and Technology, Jungong Road, Shanghai, 200093 China (GRID: grid.267139.8; ISNI: 0000 0000 9188 055X)
|
43
|
A survey of graph neural networks in various learning paradigms: methods, applications, and challenges. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10321-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
44
|
Tian Z, Peng X, Fang H, Zhang W, Dai Q, Ye Y. MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms. Brief Bioinform 2022; 23:6761042. [PMID: 36242566 DOI: 10.1093/bib/bbac434] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/05/2022] [Revised: 08/19/2022] [Accepted: 09/08/2022] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION Discovering drug-target interactions (DTIs) is a crucial step in drug development tasks such as the identification of drug side effects and drug repositioning. Since identifying DTIs by wet-lab experiments is time-consuming and costly, many computational approaches have been proposed and have become an efficient way to infer potential interactions. Although extensive effort has been invested in this task, the prediction accuracy still needs to be improved. In particular, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently. RESULTS In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). Firstly, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Secondly, MHADTI learns embeddings of drugs and targets from multiview HINs with hierarchical attention mechanisms, which include node-level, semantic-level and graph-level attention. Lastly, MHADTI employs a multilayer perceptron to predict DTIs with the learned deep feature representations. The hierarchical attention mechanisms can fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensive. Extensive experimental results demonstrate that MHADTI performs better than other SOTA prediction models. Moreover, analysis of prediction results for some drugs and targets of interest further indicates that MHADTI has advantages in discovering DTIs.
AVAILABILITY AND IMPLEMENTATION https://github.com/pxystudy/MHADTI.
Affiliation(s)
- Zhen Tian: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Xiangyu Peng: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Haichuan Fang: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Wenjie Zhang: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Qiguo Dai: School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116600, China
- Yangdong Ye: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
45
|
Chen Y, Wang J, Wang C, Liu M, Zou Q. Deep learning models for disease-associated circRNA prediction: a review. Brief Bioinform 2022; 23:6696465. [PMID: 36130259 DOI: 10.1093/bib/bbac364] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 05/03/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 12/14/2022] Open
Abstract
Emerging evidence indicates that circular RNAs (circRNAs) can provide new insights and potential therapeutic targets for disease diagnosis and treatment. However, traditional biological experiments are expensive and time-consuming. Recently, the strong representation-learning ability of deep learning has made it a promising technology for predicting disease-associated circRNAs. In this review, we introduce the most popular databases related to circRNA, and summarize three types of deep learning-based circRNA-disease association prediction methods: feature-generation-based, type-discrimination and hybrid-based methods. We further evaluate seven representative models on a benchmark with ground truth for both balanced and imbalanced classification tasks. In addition, we discuss the advantages and limitations of each type of method and highlight suggested applications for future research.
Affiliation(s)
- Yaojia Chen: College of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, China and the Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Jiacheng Wang: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Chuyu Wang: Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Mingxin Liu: College of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, China
- Quan Zou: University of Electronic Science and Technology of China, China
|
46
|
Kurata H, Tsukiyama S. ICAN: Interpretable cross-attention network for identifying drug and target protein interactions. PLoS One 2022; 17:e0276609. [PMID: 36279284 PMCID: PMC9591068 DOI: 10.1371/journal.pone.0276609] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/05/2022] [Accepted: 10/10/2022] [Indexed: 11/18/2022] Open
Abstract
Drug-target protein interaction (DTI) identification is fundamental for drug discovery and drug repositioning, because therapeutic drugs act on disease-causing proteins. However, the DTI identification process often requires expensive and time-consuming tasks, including biological experiments involving large numbers of candidate compounds. Thus, a variety of computation approaches have been developed. Of the many approaches available, chemo-genomics feature-based methods have attracted considerable attention. These methods compute the feature descriptors of drugs and proteins as the input data to train machine and deep learning models to enable accurate prediction of unknown DTIs. In addition, attention-based learning methods have been proposed to identify and interpret DTI mechanisms. However, improvements are needed for enhancing prediction performance and DTI mechanism elucidation. To address these problems, we developed an attention-based method designated the interpretable cross-attention network (ICAN), which predicts DTIs using the Simplified Molecular Input Line Entry System of drugs and amino acid sequences of target proteins. We optimized the attention mechanism architecture by exploring the cross-attention or self-attention, attention layer depth, and selection of the context matrixes from the attention mechanism. We found that a plain attention mechanism that decodes drug-related protein context features without any protein-related drug context features effectively achieved high performance. The ICAN outperformed state-of-the-art methods in several metrics on the DAVIS dataset and first revealed with statistical significance that some weighted sites in the cross-attention weight matrix represent experimental binding sites, thus demonstrating the high interpretability of the results. The program is freely available at https://github.com/kuratahiroyuki/ICAN.
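The cross-attention mechanism this abstract refers to can be illustrated with generic scaled dot-product attention, in which one sequence supplies the queries and the other supplies the keys and values. This is a textbook sketch, not the ICAN architecture: which modality plays the query role, the absence of learned projection matrices, and the names `softmax` and `cross_attention` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention of each query over all key/value pairs.

    Returns (outputs, weights): weights[i][j] is how strongly query i
    attends to position j of the other sequence.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


if __name__ == "__main__":
    # Toy embeddings: two query tokens (e.g. drug side), three key/value tokens (e.g. protein side).
    q = [[1.0, 0.0], [0.0, 1.0]]
    k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    v = [[1.0], [2.0], [3.0]]
    out, attn = cross_attention(q, k, v)
    print(out, attn)
```

The weight matrix returned here is the kind of object the abstract inspects for interpretability: large entries mark positions in one sequence that most influence the representation of a token in the other.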
Affiliation(s)
- Hiroyuki Kurata (corresponding author): Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
- Sho Tsukiyama: Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
|
47
|
Zhang Y, Wu M, Wang S, Chen W. EFMSDTI: Drug-target interaction prediction based on an efficient fusion of multi-source data. Front Pharmacol 2022; 13:1009996. [PMID: 36210804 PMCID: PMC9538487 DOI: 10.3389/fphar.2022.1009996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 08/29/2022] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of drug-target interactions (DTIs) is of great significance for understanding the mechanism of drug treatment and discovering new drugs for disease treatment. Currently, computational methods of DTI prediction that combine drug and target multi-source data can effectively reduce the cost and time of drug development. However, in multi-source data processing, the contribution of different source data to DTIs is often not considered. Therefore, making full use of the contribution of different source data through efficient fusion is the key to improving the prediction accuracy of DTIs. In this paper, considering the contribution of different source data to DTI prediction, a DTI prediction approach based on an effective fusion of drug and target multi-source data is proposed, named EFMSDTI. EFMSDTI first builds 15 similarity networks based on multi-source information networks, classified as topological and semantic graphs of drugs and targets according to their biological characteristics. Then, the multiple networks are fused by selective and entropy weighting based on similarity network fusion (SNF) according to their contribution to DTI prediction. A deep neural network model learns low-dimensional vector embeddings of drugs and targets. Finally, the LightGBM algorithm based on Gradient Boosting Decision Tree (GBDT) is used to complete DTI prediction. Experimental results show that EFMSDTI has better performance (AUROC and AUPR are 0.982) than several state-of-the-art algorithms. In an analysis of the top 1000 prediction results, 990 of the top 1000 predicted DTIs were confirmed. Code and data are available at https://github.com/meng-jie/EFMSDTI.
Affiliation(s)
- Yuanyuan Zhang (corresponding author): School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China; College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, China
- Mengjie Wu: School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
- Shudong Wang: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, China
- Wei Chen: School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
|
48
|
Yan W, Tang W, Wang L, Bin Y, Xia J. PrMFTP: Multi-functional therapeutic peptides prediction based on multi-head self-attention mechanism and class weight optimization. PLoS Comput Biol 2022; 18:e1010511. [PMID: 36094961 PMCID: PMC9499272 DOI: 10.1371/journal.pcbi.1010511] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/03/2022] [Revised: 09/22/2022] [Accepted: 08/24/2022] [Indexed: 11/18/2022] Open
Abstract
Prediction of therapeutic peptides is a significant step in the discovery of promising therapeutic drugs. Most existing studies have focused on mono-functional therapeutic peptide prediction. However, the number of multi-functional therapeutic peptides (MFTP) is growing rapidly, which requires new computational schemes to facilitate MFTP discovery. In this study, based on a multi-head self-attention mechanism and a class weight optimization algorithm, we propose a novel model called PrMFTP for MFTP prediction. PrMFTP exploits a multi-scale convolutional neural network, bi-directional long short-term memory, and multi-head self-attention mechanisms to fully extract and learn informative features of the peptide sequence to predict MFTP. In addition, we design a class weight optimization scheme to address the problem of label-imbalanced data. Comprehensive evaluation demonstrates that PrMFTP is superior to other state-of-the-art computational methods for predicting MFTP. We provide a user-friendly web server of PrMFTP, which is available at http://bioinfo.ahu.edu.cn/PrMFTP.
Affiliation(s)
- Wenhui Yan: Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Wending Tang: Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Lihua Wang: Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Yannan Bin (corresponding author): Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Junfeng Xia (corresponding author): Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
|
49
|
GCHN-DTI: Predicting drug-target interactions by graph convolution on heterogeneous networks. Methods 2022; 206:101-107. [PMID: 36058415 DOI: 10.1016/j.ymeth.2022.08.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/02/2022] [Revised: 08/17/2022] [Accepted: 08/29/2022] [Indexed: 11/22/2022] Open
Abstract
Determining the interactions between drugs and targets plays a key role in drug development and discovery. Computational methods can predict new interactions and speed up drug development. In recent studies, network-based approaches have been proposed to predict drug-target interactions (DTIs). However, these methods cannot fully utilize the node information in heterogeneous networks. Therefore, we propose GCHN-DTI (Predicting drug-target interactions by graph convolution on heterogeneous networks), a method based on a heterogeneous graph convolutional neural network, to predict potential DTIs. GCHN-DTI integrates network information from drug-target interactions, drug-drug interactions, drug similarities, target-target interactions, and target similarities. The graph convolution operation is then applied to the heterogeneous network to obtain node embeddings of the drugs and the targets. Furthermore, we incorporate an attention mechanism between graph convolutional layers to combine the node embeddings from each layer. Finally, the drug-target interaction score is predicted from the node embeddings of the drugs and the targets. Our model uses fewer network types and achieves higher prediction performance. In addition, the prediction performance of the model improves significantly on datasets with a higher proportion of positive samples. The experimental evaluations show that GCHN-DTI outperforms several state-of-the-art prediction methods.
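The pipeline in the GCHN-DTI abstract (graph convolution over a network, then a score from drug and target embeddings) can be illustrated with a minimal sketch. It assumes the standard symmetric-normalized graph convolution and a sigmoid of the inner product as the score; both are assumptions, since the abstract does not give the exact formulation. The sketch also treats all edges as one type and omits the layer-wise attention. All names and the toy graph are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj: Matrix) -> Matrix:
    """Return D^{-1/2} (A + I) D^{-1/2}, adding self-loops first."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj: Matrix, features: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(normalized_adjacency(adj), features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def interaction_score(drug: List[float], target: List[float]) -> float:
    """Sigmoid of the inner product between two node embeddings (assumed scorer)."""
    return 1.0 / (1.0 + math.exp(-sum(d * t for d, t in zip(drug, target))))


# Toy graph: nodes 0 and 1 are drugs, node 2 is a target.
adj = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only

embeddings = gcn_layer(adj, features, weight)
score = interaction_score(embeddings[0], embeddings[2])
```

In a trained model the weight matrix would be learned and several layers stacked; the abstract's attention mechanism would then combine the per-layer embeddings before scoring.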
50
Lin S, Zhang G, Wei DQ, Xiong Y. DeepPSE: Prediction of polypharmacy side effects by fusing deep representation of drug pairs and attention mechanism. Comput Biol Med 2022; 149:105984. [PMID: 35994933 DOI: 10.1016/j.compbiomed.2022.105984] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/22/2022] [Revised: 07/17/2022] [Accepted: 08/14/2022] [Indexed: 11/18/2022]
Abstract
Polypharmacy (the use of multiple drugs) is an effective strategy for combating complex or co-existing diseases. However, a major consequence of polypharmacy is a higher risk of adverse side effects due to drug-drug interactions, which are rare and observed in relatively small clinical testing. Thus, identification of polypharmacy side effects remains challenging. Here, we propose a deep learning-based method, DeepPSE, to predict polypharmacy side effects in an end-to-end manner. DeepPSE is composed of two main modules. First, multiple types of neural networks are constructed and fused to learn the deep representation of a drug pair. Second, a transformer encoder block with a self-attention mechanism is built to obtain latent features, which are then fed into a fully connected layer to predict the polypharmacy side effects of drug pairs. DeepPSE is compared with five baseline or state-of-the-art methods on a benchmark dataset of 964 types of polypharmacy side effects across 63473 drug pairs. Experimental results demonstrate that DeepPSE outperforms all five methods. The source code and data are available at https://github.com/ShenggengLin/DeepPSE.
Affiliation(s)
- Shenggeng Lin
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Guangwei Zhang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nayang, Henan, 473006, China; Peng Cheng National Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China.
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China.
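The DeepPSE abstract above relies on the self-attention mechanism of a transformer encoder block. For readers unfamiliar with it, the sketch below shows single-head scaled dot-product self-attention in its simplest form. It is a simplification under stated assumptions: queries, keys, and values are taken to be the input itself, with no learned projections, multiple heads, or feed-forward sublayer, and it is not taken from the DeepPSE code. The function names and toy input are hypothetical.

```python
import math
from typing import List


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: List[List[float]]) -> List[List[float]]:
    """Single-head scaled dot-product self-attention with Q = K = V = x."""
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        attn = softmax(scores)
        # Each output row is an attention-weighted average of the value rows.
        out.append([sum(a * v[c] for a, v in zip(attn, x)) for c in range(d)])
    return out


# Toy input: two feature vectors, e.g. stand-ins for fused drug-pair features.
toy_input = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(toy_input)
```

Each output vector mixes information from all input positions, with the largest share coming from the positions most similar to it.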
|