1. Zhang G, Chen Y, Yan C, Wang J, Liang W, Luo J, Luo H. MPASL: multi-perspective learning knowledge graph attention network for synthetic lethality prediction in human cancer. Front Pharmacol 2024; 15:1398231. [PMID: 38835667 PMCID: PMC11148462 DOI: 10.3389/fphar.2024.1398231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Accepted: 04/26/2024] [Indexed: 06/06/2024] Open
Abstract
Synthetic lethality (SL) is widely used to discover anti-cancer drug targets. However, identifying SL interactions through wet-lab experiments is costly and inefficient. Hence, the development of efficient and accurate computational methods for SL interaction prediction is of great significance. In this study, we propose MPASL, a multi-perspective learning knowledge graph attention network to enhance synthetic lethality prediction. MPASL utilizes knowledge graph hierarchy propagation to explore multi-source neighbor nodes related to genes. The knowledge graph ripple propagation expands gene representations through existing gene SL preference sets. MPASL can learn gene representations from both the gene-entity perspective and the entity-entity perspective. Specifically, we use an aggregation method to obtain gene-oriented entity embeddings. Then, the gene representations are refined by comparing the various layer-wise neighborhood features of entities using the discrepancy contrastive technique. Finally, the learned gene representation is applied to SL prediction. Experimental results demonstrate that MPASL outperforms several state-of-the-art methods. Additionally, case studies validate the effectiveness of MPASL in identifying SL interactions between genes.
Affiliation(s)
- Ge Zhang
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Yitong Chen
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Chaokun Yan
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Jianlin Wang
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Wenjuan Liang
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Junwei Luo
  - College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan, China
- Huimin Luo
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
2. Liu X, Hu J, Zheng J. SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality. Bioinformatics 2024; 40:btae016. [PMID: 38244572 PMCID: PMC10868331 DOI: 10.1093/bioinformatics/btae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 12/10/2023] [Accepted: 01/16/2024] [Indexed: 01/22/2024] Open
Abstract
SUMMARY Synthetic lethality (SL) refers to a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect cell viability. It significantly expands the range of potential therapeutic targets for anti-cancer treatments. SL interactions are primarily identified through experimental screening and computational prediction. Although various computational methods have been proposed, they tend to ignore providing evidence to support their predictions of SL. Besides, they are rarely user-friendly for biologists who likely have limited programming skills. Moreover, the genetic context specificity of SL interactions is often not taken into consideration. Here, we introduce a web server called SL-Miner, which is designed to mine the evidence of SL relationships between a primary gene and a few candidate SL partner genes in a specific type of cancer, and to prioritize these candidate genes by integrating various types of evidence. For intuitive data visualization, SL-Miner provides a range of charts (e.g. volcano plot and box plot) to help users get insights from the data. AVAILABILITY AND IMPLEMENTATION SL-Miner is available at https://slminer.sist.shanghaitech.edu.cn.
Affiliation(s)
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jieni Hu
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
3. Tepeli YI, Seale C, Gonçalves JP. ELISL: early-late integrated synthetic lethality prediction in cancer. Bioinformatics 2024; 40:btad764. [PMID: 38113447 DOI: 10.1093/bioinformatics/btad764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 11/06/2023] [Accepted: 12/18/2023] [Indexed: 12/21/2023]
Abstract
MOTIVATION Anti-cancer therapies based on synthetic lethality (SL) exploit tumour vulnerabilities for treatment with reduced side effects, by targeting a gene that is jointly essential with another whose function is lost. Computational prediction is key to expedite SL screening, yet existing methods are vulnerable to prevalent selection bias in SL data and reliant on cancer or tissue type-specific omics, which can be scarce. Notably, sequence similarity remains underexplored as a proxy for related gene function and joint essentiality. RESULTS We propose ELISL, Early-Late Integrated SL prediction with forest ensembles, using context-free protein sequence embeddings and context-specific omics from cell lines and tissue. Across eight cancer types, ELISL showed superior robustness to selection bias and recovery of known SL genes, as well as promising cross-cancer predictions. Co-occurring mutations in a BRCA gene and ELISL-predicted pairs from the HH, FGF, WNT, or NEIL gene families were associated with longer patient survival times, revealing therapeutic potential. AVAILABILITY AND IMPLEMENTATION Data: 10.6084/m9.figshare.23607558 & Code: github.com/joanagoncalveslab/ELISL.
Affiliation(s)
- Yasin I Tepeli
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
- Colm Seale
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
  - Holland Proton Therapy Center (HollandPTC), Delft, The Netherlands
- Joana P Gonçalves
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
4. Wang J, Wen Y, Zhang Y, Wang Z, Jiang Y, Dai C, Wu L, Leng D, He S, Bo X. An interpretable artificial intelligence framework for designing synthetic lethality-based anti-cancer combination therapies. J Adv Res 2023:S2090-1232(23)00374-0. [PMID: 38043609 DOI: 10.1016/j.jare.2023.11.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 11/27/2023] [Accepted: 11/29/2023] [Indexed: 12/05/2023] Open
Abstract
INTRODUCTION Synthetic lethality (SL) provides an opportunity to leverage different genetic interactions when designing synergistic combination therapies. To further explore SL-based combination therapies for cancer treatment, it is important to identify and mechanistically characterize more SL interactions. Artificial intelligence (AI) methods have recently been proposed for SL prediction, but the results of these models are often not interpretable such that deriving the underlying mechanism can be challenging. OBJECTIVES This study aims to develop an interpretable AI framework for SL prediction and subsequently utilize it to design SL-based synergistic combination therapies. METHODS We propose a knowledge and data dual-driven AI framework for SL prediction (KDDSL). Specifically, we use gene knowledge related to the SL mechanism to guide the construction of the model and develop a method to identify the most relevant gene knowledge for the predicted results. RESULTS Experimental and literature-based validation confirmed a good balance between predictive and interpretable ability when using KDDSL. Moreover, we demonstrated that KDDSL could help to discover promising drug combinations and clarify associated biological processes, such as the combination of MDM2 and CDK9 inhibitors, which exhibited significant anti-cancer effects in vitro and in vivo. CONCLUSION These data underscore the potential of KDDSL to guide SL-based combination therapy design. There is a need for biomedicine-focused AI strategies to combine rational biological knowledge with developed models.
Affiliation(s)
- Jing Wang
  - School of Medicine, Tsinghua University, Beijing, 100084, China
- Yuqi Wen
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Yixin Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Zhongming Wang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Yuyang Jiang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Chong Dai
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Lianlian Wu
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Dongjin Leng
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
5. Pu M, Cheng K, Li X, Xin Y, Wei L, Jin S, Zheng W, Peng G, Tang Q, Zhou J, Zhang Y. Using graph-based model to identify cell specific synthetic lethal effects. Comput Struct Biotechnol J 2023; 21:5099-5110. [PMID: 37920819 PMCID: PMC10618116 DOI: 10.1016/j.csbj.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023] Open
Abstract
Synthetic lethal (SL) pairs are pairs of genes whose simultaneous loss-of-function results in cell death, while a damaging mutation of either gene alone does not affect the cell's survival. This makes SL pairs attractive targets for precision cancer therapies, as targeting the unimpaired gene of the SL pair can selectively kill cancer cells that already harbor the impaired gene. Limited by the difficulty of finding true SL pairs, especially on specific cell types, current computational approaches provide only limited insights because of overlooking the crucial aspects of cellular context dependency and mechanistic understanding of SL pairs. As a result, the identification of SL targets still relies on expensive, time-consuming experimental approaches. In this work, we applied cell-line specific multi-omics data to a specially designed deep learning model to predict cell-line specific SL pairs. Through incorporating multiple types of cell-specific omics data with a self-attention module, we represent gene relationships as graphs. Our approach achieves the prediction of SL pairs in a cell-specific manner and demonstrates the potential to facilitate the discovery of cell-specific SL targets for cancer therapeutics, providing a tool to unearth mechanisms underlying the origin of SL in cancer biology. The code and data of our approach can be found at https://github.com/promethiume/SLwise.
Affiliation(s)
- Kaiyang Cheng
  - StoneWise, AI, Ltd., Beijing, China
  - Nanjing University of Chinese Medicine, Shanghai, China
- Xiaorong Li
  - StoneWise, AI, Ltd., Beijing, China
  - Minzu University of China, Beijing, China
- Sutong Jin
  - StoneWise, AI, Ltd., Beijing, China
  - Harbin Institute of Technology, Weihai, China
- Qihong Tang
  - StoneWise, AI, Ltd., Beijing, China
  - Guilin University of Electronic Science and Technology, Guangxi, China
6. Lu X, Chen G, Li J, Hu X, Sun F. MAGCN: A Multiple Attention Graph Convolution Networks for Predicting Synthetic Lethality. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2681-2689. [PMID: 36374879 DOI: 10.1109/tcbb.2022.3221736] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Synthetic lethality (SL) is a promising strategy for cancer therapy and drug discovery. Computational approaches to identifying synthetic lethal genes have become an effective complement to wet-lab experiments, which are time-consuming and costly. Graph convolutional networks (GCNs) have been applied to this prediction task because they are good at capturing neighborhood dependencies in a graph. However, a mechanism for aggregating complementary neighborhood information from multiple heterogeneous graphs is still lacking. Here, we propose Multiple Attention Graph Convolution Networks for predicting synthetic lethality (MAGCN). First, we obtain functional similarity features and topological structure features of genes from different data sources, such as Gene Ontology data and protein-protein interactions. Then, a graph convolutional network is used to accumulate knowledge from neighbor nodes according to synthetic lethal associations. Meanwhile, we propose a multiple-graph attention model and construct a multiple-graph attention network that learns the contribution of each graph and generates an embedded representation by aggregating these graphs. Finally, the generated feature matrix is decoded to predict potential synthetic lethal interactions. Experimental results show that MAGCN is superior to other baseline methods. A case study demonstrates the ability of MAGCN to predict human SL gene pairs.
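The aggregation step described in this abstract (weighting several per-graph gene embeddings by learned contributions, then decoding the fused matrix into pairwise scores) can be illustrated with a minimal sketch. This is not the MAGCN implementation: the function names, the fixed attention scores (learned in the real model), the toy embeddings, and the inner-product decoder are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Turn raw per-graph scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def attention_fuse(graph_embeddings, scores):
    """Weighted sum of per-graph embeddings (each: n_genes x dim lists).

    `scores` stands in for learned attention logits (hypothetical here).
    """
    weights = softmax(scores)
    n_genes = len(graph_embeddings[0])
    dim = len(graph_embeddings[0][0])
    fused = [
        [sum(w * emb[g][k] for w, emb in zip(weights, graph_embeddings))
         for k in range(dim)]
        for g in range(n_genes)
    ]
    return fused, weights

def decode(fused):
    """Assumed inner-product decoder: score(i, j) = sigmoid(<z_i, z_j>)."""
    return [[sigmoid(sum(a * b for a, b in zip(u, v))) for v in fused]
            for u in fused]

# Toy embeddings for two genes from two sources (illustrative values only).
go_emb = [[1.0, 0.0], [0.0, 1.0]]
ppi_emb = [[0.0, 1.0], [1.0, 0.0]]
fused, weights = attention_fuse([go_emb, ppi_emb], scores=[0.0, 0.0])
probs = decode(fused)
```

With equal scores the two graphs contribute equally; raising one score shifts the fused embedding toward that graph.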
7. Zhang K, Wu M, Liu Y, Feng Y, Zheng J. KR4SL: knowledge graph reasoning for explainable prediction of synthetic lethality. Bioinformatics 2023; 39:i158-i167. [PMID: 37387166 PMCID: PMC10311291 DOI: 10.1093/bioinformatics/btad261] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION Synthetic lethality (SL) is a promising strategy for anticancer therapy, as inhibiting SL partners of genes with cancer-specific mutations can selectively kill the cancer cells without harming the normal cells. Wet-lab techniques for SL screening have issues like high cost and off-target effects. Computational methods can help address these issues. Previous machine learning methods leverage known SL pairs, and the use of knowledge graphs (KGs) can significantly enhance the prediction performance. However, the subgraph structures of KG have not been fully explored. Besides, most machine learning methods lack interpretability, which is an obstacle for wide applications of machine learning to SL identification. RESULTS We present a model named KR4SL to predict SL partners for a given primary gene. It captures the structural semantics of a KG by efficiently constructing and learning from relational digraphs in the KG. To encode the semantic information of the relational digraphs, we fuse textual semantics of entities into propagated messages and enhance the sequential semantics of paths using a recurrent neural network. Moreover, we design an attentive aggregator to identify critical subgraph structures that contribute the most to the SL prediction as explanations. Extensive experiments under different settings show that KR4SL significantly outperforms all the baselines. The explanatory subgraphs for the predicted gene pairs can unveil prediction process and mechanisms underlying synthetic lethality. The improved predictive power and interpretability indicate that deep learning is practically useful for SL-based cancer drug target discovery. AVAILABILITY AND IMPLEMENTATION The source code is freely available at https://github.com/JieZheng-ShanghaiTech/KR4SL.
Affiliation(s)
- Ke Zhang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Yong Liu
  - Nanyang Technological University, Singapore 639798, Singapore
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Lingang Laboratory, Shanghai 201602, China
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, Shanghai 201210, China
8. Zhu Y, Zhou Y, Liu Y, Wang X, Li J. SLGNN: synthetic lethality prediction in human cancers based on factor-aware knowledge graph neural network. Bioinformatics 2023; 39:6988048. [PMID: 36645245 PMCID: PMC9907046 DOI: 10.1093/bioinformatics/btad015] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2022] [Revised: 11/29/2022] [Accepted: 01/13/2023] [Indexed: 01/17/2023] Open
Abstract
MOTIVATION Synthetic lethality (SL) is a form of genetic interaction that can selectively kill cancer cells without damaging normal cells. Exploiting this mechanism is gaining popularity in the field of targeted cancer therapy and anticancer drug development. Due to the limitations of identifying SL interactions from laboratory experiments, an increasing number of research groups are devising computational prediction methods to guide the discovery of potential SL pairs. Although existing methods have attempted to capture the underlying mechanisms of SL interactions, methods that have a deeper understanding of and attempt to explain SL mechanisms still need to be developed. RESULTS In this work, we propose a novel SL prediction method, SLGNN. This method is based on the following assumption: SL interactions are caused by different molecular events or biological processes, which we define as SL-related factors that lead to SL interactions. SLGNN, apart from identifying SL interaction pairs, also models the preferences of genes for different SL-related factors, making the results more interpretable for biologists and clinicians. SLGNN consists of three steps: first, we model the combinations of relationships in the gene-related knowledge graph as the SL-related factors. Next, we derive initial embeddings of genes through an explicit message aggregation process of the knowledge graph. Finally, we derive the final gene embeddings through an SL graph, constructed using known SL gene pairs, utilizing factor-based message aggregation. At this stage, a supervised end-to-end training model is used for SL interaction prediction. Based on experimental results, the proposed SLGNN model outperforms all current state-of-the-art SL prediction methods and provides better interpretability. AVAILABILITY AND IMPLEMENTATION SLGNN is freely available at https://github.com/zy972014452/SLGNN.
Affiliation(s)
- Yan Zhu
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Yuhuan Zhou
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Yang Liu
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Xuan Wang
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Junyi Li
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
9. Fan K, Tang S, Gökbağ B, Cheng L, Li L. Multi-view graph convolutional network for cancer cell-specific synthetic lethality prediction. Front Genet 2023; 13:1103092. [PMID: 36699450 PMCID: PMC9868610 DOI: 10.3389/fgene.2022.1103092] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/19/2022] [Accepted: 12/22/2022] [Indexed: 01/11/2023] Open
Abstract
Synthetic lethal (SL) genetic interactions have been regarded as a promising focus for investigating potential targeted therapeutics to tackle cancer. However, the costly investment of time and labor associated with wet-lab experimental screenings to discover potential SL relationships motivates the development of computational methods. Although graph neural network (GNN) models have performed well in the prediction of SL gene pairs, existing GNN-based models are not designed for predicting cancer cell-specific SL interactions that are more relevant to experimental validation in vitro. Besides, neither have existing methods fully utilized diverse graph representations of biological features to improve prediction performance. In this work, we propose MVGCN-iSL, a novel multi-view graph convolutional network (GCN) model to predict cancer cell-specific SL gene pairs, by incorporating five biological graph features and multi-omics data. Max pooling operation is applied to integrate five graph-specific representations obtained from GCN models. Afterwards, a deep neural network (DNN) model serves as the prediction module to predict the SL interactions in individual cancer cells (iSL). Extensive experiments have validated the model's successful integration of the multiple graph features and state-of-the-art performance in the prediction of potential SL gene pairs as well as generalization ability to novel genes.
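The integration step named in this abstract, max pooling over five graph-specific representations before a DNN prediction module, can be sketched as follows. This is a minimal illustration under assumptions, not the MVGCN-iSL code: the function names are hypothetical, the embeddings are random placeholders rather than GCN outputs, and concatenating the two pooled gene vectors as the DNN input is an assumed design.

```python
import random

def max_pool_views(views):
    """Element-wise max across graph views.

    `views` is a list of embeddings, one per graph; each embedding is a
    list of per-gene vectors with identical shape across views.
    """
    n_genes = len(views[0])
    dim = len(views[0][0])
    return [
        [max(view[g][k] for view in views) for k in range(dim)]
        for g in range(n_genes)
    ]

def pair_input(pooled, gene_i, gene_j):
    """Assumed input for a downstream predictor: the two gene vectors concatenated."""
    return pooled[gene_i] + pooled[gene_j]

# Five placeholder views of 4 genes with 3-dimensional embeddings.
random.seed(0)
views = [
    [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(4)]
    for _ in range(5)
]
pooled = max_pool_views(views)
x_pair = pair_input(pooled, 0, 2)
```

Max pooling keeps, for each embedding dimension, the strongest signal among the views, so the pooled vector has the same size as a single view regardless of how many graphs are combined.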
Affiliation(s)
- Kunjie Fan
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Shan Tang
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Lijun Cheng
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Lang Li
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
  - *Correspondence: Lang Li
10. Tang S, Gökbağ B, Fan K, Shao S, Huo Y, Wu X, Cheng L, Li L. Synthetic lethal gene pairs: Experimental approaches and predictive models. Front Genet 2022; 13:961611. [PMID: 36531238 PMCID: PMC9751344 DOI: 10.3389/fgene.2022.961611] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/05/2022] [Accepted: 11/07/2022] [Indexed: 03/27/2024] Open
Abstract
Synthetic lethality (SL) refers to a genetic interaction in which the simultaneous perturbation of two genes leads to cell or organism death, whereas viability is maintained when only one of the pair is altered. The experimental exploration of these pairs and predictive modeling in computational biology contribute to our understanding of cancer biology and the development of cancer therapies. We extensively reviewed experimental technologies, public data sources, and predictive models in the study of synthetic lethal gene pairs and herein detail biological assumptions, experimental data, statistical models, and computational schemes of various predictive models, speculate regarding their influence on individual sample- and population-based synthetic lethal interactions, discuss the pros and cons of existing SL data and models, and highlight potential research directions in SL discovery.
Affiliation(s)
- Shan Tang
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Kunjie Fan
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Shuai Shao
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Yang Huo
  - Indiana University, Bloomington, IN, United States
- Xue Wu
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
11. Wang S, Feng Y, Liu X, Liu Y, Wu M, Zheng J. NSF4SL: negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers. Bioinformatics 2022; 38:ii13-ii19. [PMID: 36124790 DOI: 10.1093/bioinformatics/btac462] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION Detecting synthetic lethality (SL) is a promising strategy for identifying anti-cancer drug targets. Targeting SL partners of a primary gene mutated in cancer is selectively lethal to cancer cells. Due to high cost of wet-lab experiments and availability of gold standard SL data, supervised machine learning for SL prediction has been popular. However, most of the methods are based on binary classification and thus limited by the lack of reliable negative data. Contrastive learning can train models without any negative sample and is thus promising for finding novel SLs. RESULTS We propose NSF4SL, a negative-sample-free SL prediction model based on a contrastive learning framework. It captures the characteristics of positive SL samples by using two branches of neural networks that interact with each other to learn SL-related gene representations. Moreover, a feature-wise data augmentation strategy is used to mitigate the sparsity of SL data. NSF4SL significantly outperforms all baselines which require negative samples, even in challenging experimental settings. To the best of our knowledge, this is the first time that SL prediction is formulated as a gene ranking problem, which is more practical than the current formulation as binary classification. NSF4SL is the first contrastive learning method for SL prediction and its success points to a new direction of machine-learning methods for identifying novel SLs. AVAILABILITY AND IMPLEMENTATION Our source code is available at https://github.com/JieZheng-ShanghaiTech/NSF4SL. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
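To make the idea of training on positive SL pairs only, and then ranking candidate partners, more concrete, here is a minimal sketch. The abstract does not give the loss, so the cosine-based objective below (as used in bootstrap-style, negative-free contrastive methods) is an assumption, as are the function names, the toy vectors, and the placeholder gene labels.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def positive_pair_loss(online_pred, target_repr):
    """Assumed negative-free loss: 0 when the two branches agree, 4 when opposite.

    `online_pred` is one branch's prediction for gene A; `target_repr` is the
    other branch's representation of its known SL partner B.
    """
    return 2.0 - 2.0 * cosine(online_pred, target_repr)

def rank_partners(primary_pred, candidates):
    """Rank candidate partner genes by similarity to the primary gene's prediction."""
    return sorted(
        candidates,
        key=lambda name: cosine(primary_pred, candidates[name]),
        reverse=True,
    )

# Placeholder representations; the gene labels are hypothetical.
primary = [1.0, 0.0]
candidates = {
    "gene_X": [0.9, 0.1],
    "gene_Y": [0.0, 1.0],
    "gene_Z": [-1.0, 0.0],
}
ranking = rank_partners(primary, candidates)
```

Because the loss only pulls known positive pairs together, no gene pair ever has to be labeled as a reliable negative, and prediction reduces to sorting candidates for a given primary gene.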
Affiliation(s)
- Shike Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yong Liu
  - Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore 639798, Singapore
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
12. Liu X, Yu J, Tao S, Yang B, Wang S, Wang L, Bai F, Zheng J. PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2022; 38:ii106-ii112. [PMID: 36124788 DOI: 10.1093/bioinformatics/btac476] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION Synthetic lethality (SL) is a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect the cell viability. It can effectively expand the range of anti-cancer therapeutic targets. SL interactions are identified mainly by experimental screening and computational prediction. Recent machine-learning methods mostly learn the representation of each gene individually, ignoring the representation of the pairwise interaction between two genes. In addition, the mechanisms of SL, the key to translating SL into cancer therapeutics, are often unclear. RESULTS To fill the gaps, we propose a pairwise interaction learning-based graph neural network (GNN) named PiLSL to learn the representation of pairwise interaction between two genes for SL prediction. First, we construct an enclosing graph for each pair of genes from a knowledge graph. Secondly, we design an attentive embedding propagation layer in a GNN to discriminate the importance among the edges in the enclosing graph and to learn the latent features of the pairwise interaction from the weighted enclosing graph. Finally, we further fuse the latent features with explicit features extracted from multi-omics data to obtain powerful gene representations for SL prediction. Extensive experimental results demonstrate that PiLSL outperforms the best baseline by a large margin and generalizes well under three realistic scenarios. Besides, PiLSL provides an explanation of SL mechanisms via the weighted paths in the enclosing graphs by attention mechanism. AVAILABILITY AND IMPLEMENTATION Our source code is available at https://github.com/JieZheng-ShanghaiTech/PiLSL.
Affiliation(s)
- Xin Liu
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Jiale Yu
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Siyu Tao
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Beiyuan Yang
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Shike Wang
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Lin Wang
- School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China; Shanghai Institute for Advanced Immunochemical Studies, Shanghai Tech University, Shanghai 201210, China
- Fang Bai
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China; School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China; Shanghai Institute for Advanced Immunochemical Studies, Shanghai Tech University, Shanghai 201210, China
- Jie Zheng
- School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China; Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
|
13
|
Seale C, Tepeli Y, Gonçalves JP. Overcoming selection bias in synthetic lethality prediction. Bioinformatics 2022; 38:4360-4368. [PMID: 35876858 PMCID: PMC9477536 DOI: 10.1093/bioinformatics/btac523] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/19/2021] [Revised: 07/13/2022] [Accepted: 07/22/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data. RESULTS We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples. AVAILABILITY AND IMPLEMENTATION https://github.com/joanagoncalveslab/sbsl. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Colm Seale
- Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
- Holland Proton Therapy Center (HollandPTC), Delft 2600 AC, The Netherlands
- Yasin Tepeli
- Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
- Joana P Gonçalves
- Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
|
14
|
Wang J, Wu M, Huang X, Wang L, Zhang S, Liu H, Zheng J. SynLethDB 2.0: a web-based knowledge graph database on synthetic lethality for novel anticancer drug discovery. Database (Oxford) 2022; 2022:6585691. [PMID: 35562840 PMCID: PMC9216587 DOI: 10.1093/database/baac030] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/18/2022] [Revised: 04/04/2022] [Accepted: 04/24/2022] [Indexed: 11/30/2022]
Abstract
Two genes are synthetic lethal if mutations in both genes result in impaired cell viability, while mutation of either gene does not affect the cell survival. The potential usage of synthetic lethality (SL) in anticancer therapeutics has attracted many researchers to identify synthetic lethal gene pairs. To include newly identified SLs and more related knowledge, we present a new version of the SynLethDB database to facilitate the discovery of clinically relevant SLs. We extended the first version of SynLethDB database significantly by including new SLs identified through Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) screening, a knowledge graph about human SLs, a new web interface, etc. Over 16 000 new SLs and 26 types of other relationships have been added, encompassing relationships among 14 100 genes, 53 cancers, 1898 drugs, etc. Moreover, a brand-new web interface has been developed to include modules such as SL query by disease or compound, SL partner gene set enrichment analysis and knowledge graph browsing through a dynamic graph viewer. The data can be downloaded directly from the website or through the RESTful Application Programming Interfaces (APIs). Database URL: https://synlethdb.sist.shanghaitech.edu.cn/v2.
Affiliation(s)
- Jie Wang
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Pudong, Shanghai 201210, China
- Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, Singapore 138632, Singapore
- Xuhui Huang
- School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore 117417, Singapore
- Li Wang
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Pudong, Shanghai 201210, China
- Sophia Zhang
- College of Agriculture and Life Sciences, Cornell University, 260 Roberts Hall, Ithaca, NY 14853, USA
- Hui Liu
- School of Computer Science and Technology, Nanjing Tech University, 30 Puzhu Road, Nanjing 211816, China
- Jie Zheng
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Pudong, Shanghai 201210, China; Shanghai Engineering Research Center of Intelligent Vision and Imaging, 393 Middle Huaxia Road, Pudong, Shanghai 201210, China
|
15
|
Long Y, Wu M, Liu Y, Fang Y, Kwoh CK, Chen J, Luo J, Li X. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics 2022; 38:2254-2262. [PMID: 35171981 DOI: 10.1093/bioinformatics/btac100] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/20/2021] [Revised: 01/15/2022] [Accepted: 02/14/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION Graphs or networks are widely utilized to model the interactions between different entities (e.g. proteins, drugs, etc.) for biomedical applications. Predicting potential interactions/links in biomedical networks is important for understanding the pathological mechanisms of various complex human diseases, as well as screening compound targets for drug discovery. Graph neural networks (GNNs) have been utilized for link prediction in various biomedical networks, which rely on the node features extracted from different data sources, e.g. sequence, structure and network data. However, it is challenging to effectively integrate these data sources and automatically extract features for different link prediction tasks. RESULTS In this article, we propose a novel Pre-Training Graph Neural Networks-based framework named PT-GNN to integrate different data sources for link prediction in biomedical networks. First, we design expressive deep learning methods [e.g. convolutional neural network and graph convolutional network (GCN)] to learn features for individual nodes from sequence and structure data. Second, we further propose a GCN-based encoder to effectively refine the node features by modelling the dependencies among nodes in the network. Third, the node features are pre-trained based on graph reconstruction tasks. The pre-trained features can be used for model initialization in downstream tasks. Extensive experiments have been conducted on two critical link prediction tasks, i.e. synthetic lethality (SL) prediction and drug-target interaction (DTI) prediction. Experimental results demonstrate PT-GNN outperforms the state-of-the-art methods for SL prediction and DTI prediction. In addition, the pre-trained features benefit improving the performance and reduce the training time of existing models. AVAILABILITY AND IMPLEMENTATION Python codes and dataset are available at: https://github.com/longyahui/PT-GNN. 
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yahui Long
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research, Singapore, Singapore
- Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
- Yong Liu
- Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Singapore, Singapore
- Yuan Fang
- School of Information Systems, Singapore Management University, 178902 Singapore, Singapore
- Chee Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Jinmiao Chen
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research, Singapore, Singapore
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiaoli Li
- Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
|
16
|
Wang J, Zhang Q, Han J, Zhao Y, Zhao C, Yan B, Dai C, Wu L, Wen Y, Zhang Y, Leng D, Wang Z, Yang X, He S, Bo X. Computational methods, databases and tools for synthetic lethality prediction. Brief Bioinform 2022; 23:6555403. [PMID: 35352098 PMCID: PMC9116379 DOI: 10.1093/bib/bbac106] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/09/2021] [Revised: 02/15/2022] [Accepted: 03/02/2022] [Indexed: 12/17/2022]
Abstract
Synthetic lethality (SL) occurs between two genes when the inactivation of either gene alone has no effect on cell survival but the inactivation of both genes results in cell death. SL-based therapy has become one of the most promising targeted cancer therapies in the last decade as PARP inhibitors achieve great success in the clinic. The key point to exploiting SL-based cancer therapy is the identification of robust SL pairs. Although many wet-lab-based methods have been developed to screen SL pairs, known SL pairs are less than 0.1% of all potential pairs due to large number of human gene combinations. Computational prediction methods complement wet-lab-based methods to effectively reduce the search space of SL pairs. In this paper, we review the recent applications of computational methods and commonly used databases for SL prediction. First, we introduce the concept of SL and its screening methods. Second, various SL-related data resources are summarized. Then, computational methods including statistical-based methods, network-based methods, classical machine learning methods and deep learning methods for SL prediction are summarized. In particular, we elaborate on the negative sampling methods applied in these models. Next, representative tools for SL prediction are introduced. Finally, the challenges and future work for SL prediction are discussed.
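The negative sampling step mentioned in this abstract can be illustrated with a small sketch. It assumes the simplest strategy, uniform random sampling of gene pairs that are not among the known SL pairs; the review may well discuss more refined schemes. The function name `sample_negative_pairs` and the toy gene list and labels are hypothetical and are not taken from any curated dataset.

```python
import random
from itertools import combinations


def sample_negative_pairs(genes, positive_pairs, n_samples, seed=0):
    """Uniformly sample unordered gene pairs that are not known SL pairs.

    Assumption: unlabeled pairs are treated as negatives, which may
    mislabel SL pairs that simply have not been discovered yet.
    """
    # Store pairs as frozensets so (a, b) and (b, a) count as the same pair.
    known = {frozenset(p) for p in positive_pairs}
    candidates = [
        pair for pair in combinations(sorted(set(genes)), 2)
        if frozenset(pair) not in known
    ]
    if n_samples > len(candidates):
        raise ValueError("not enough unlabeled pairs to sample from")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(candidates, n_samples)


# Toy example; the positive labels here are illustrative only.
genes = ["BRCA1", "PARP1", "KRAS", "TP53", "ATM"]
positives = [("BRCA1", "PARP1"), ("TP53", "ATM")]
negatives = sample_negative_pairs(genes, positives, n_samples=2)
```

Enumerating every candidate pair is quadratic in the number of genes, which is fine for a toy list; at genome scale, rejection sampling of random pairs would likely be the more practical choice.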
Affiliation(s)
- Jing Wang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Qinglong Zhang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Junshan Han
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yanpeng Zhao
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Caiyun Zhao
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Bowei Yan
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Chong Dai
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Lianlian Wu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yuqi Wen
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yixin Zhang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Dongjin Leng
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Zhongming Wang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaoxi Yang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Song He
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
|
17
|
Ou-Yang L, Lu F, Zhang ZC, Wu M. Matrix factorization for biomedical link prediction and scRNA-seq data imputation: an empirical survey. Brief Bioinform 2021; 23:6447434. [PMID: 34864871 DOI: 10.1093/bib/bbab479] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/09/2021] [Revised: 09/25/2021] [Accepted: 10/18/2021] [Indexed: 02/02/2023]
Abstract
Advances in high-throughput experimental technologies promote the accumulation of vast number of biomedical data. Biomedical link prediction and single-cell RNA-sequencing (scRNA-seq) data imputation are two essential tasks in biomedical data analyses, which can facilitate various downstream studies and gain insights into the mechanisms of complex diseases. Both tasks can be transformed into matrix completion problems. For a variety of matrix completion tasks, matrix factorization has shown promising performance. However, the sparseness and high dimensionality of biomedical networks and scRNA-seq data have raised new challenges. To resolve these issues, various matrix factorization methods have emerged recently. In this paper, we present a comprehensive review on such matrix factorization methods and their usage in biomedical link prediction and scRNA-seq data imputation. Moreover, we select representative matrix factorization methods and conduct a systematic empirical comparison on 15 real data sets to evaluate their performance under different scenarios. By summarizing the experimental results, we provide general guidelines for selecting matrix factorization methods for different biomedical matrix completion tasks and point out some future directions to further improve the performance for biomedical link prediction and scRNA-seq data imputation.
Affiliation(s)
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, China
- Fan Lu
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Zi-Chao Zhang
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Min Wu
- Institute for Infocomm Research (I2R), A*STAR, 138632, Singapore
|
18
|
Wang S, Xu F, Li Y, Wang J, Zhang K, Liu Y, Wu M, Zheng J. KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2021; 37:i418-i425. [PMID: 34252965 PMCID: PMC8336442 DOI: 10.1093/bioinformatics/btab271] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 11/14/2022]
Abstract
Motivation: Synthetic lethality (SL) is a promising gold mine for the discovery of anti-cancer drug targets. Wet-lab screening of SL pairs is afflicted with high cost, batch-effect, and off-target problems. Current computational methods for SL prediction include gene knock-out simulation, knowledge-based data mining and machine learning methods. Most of the existing methods tend to assume that SL pairs are independent of each other, without taking into account the shared biological mechanisms underlying the SL pairs. Although several methods have incorporated genomic and proteomic data to aid SL prediction, these methods involve manual feature engineering that heavily relies on domain knowledge. Results: Here, we propose a novel graph neural network (GNN)-based model, named KG4SL, by incorporating knowledge graph (KG) message-passing into SL prediction. The KG was constructed using 11 kinds of entities including genes, compounds, diseases, biological processes and 24 kinds of relationships that could be pertinent to SL. The integration of KG can help harness the independence issue and circumvent manual feature engineering by conducting message-passing on the KG. Our model outperformed all the state-of-the-art baselines in area under the curve, area under precision-recall curve and F1. Extensive experiments, including the comparison of our model with an unsupervised TransE model, a vanilla graph convolutional network model, and their combination, demonstrated the significant impact of incorporating KG into GNN for SL prediction. Availability and implementation: KG4SL is freely available at https://github.com/JieZheng-ShanghaiTech/KG4SL. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shike Wang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Fan Xu
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yunyang Li
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jie Wang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Ke Zhang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Yong Liu
- Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore 639798, Singapore
- Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Jie Zheng
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, 201210, China
|
19
|
Pei F, Shi Q, Zhang H, Bahar I. Predicting Protein-Protein Interactions Using Symmetric Logistic Matrix Factorization. J Chem Inf Model 2021; 61:1670-1682. [PMID: 33831302 DOI: 10.1021/acs.jcim.1c00173] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022]
Abstract
Accurate assessment of protein-protein interactions (PPIs) is critical to deciphering disease mechanisms and developing novel drugs, and with rapidly growing PPI data, the need for more efficient predictive methods is emerging. We propose here a symmetric logistic matrix factorization (symLMF)-based approach to predict PPIs, especially useful for large PPI networks. Benchmarked against two widely used datasets (Saccharomyces cerevisiae and Homo sapiens benchmarks) and their extended versions, the symLMF-based method proves to outperform most of the state-of-the-art data-driven methods applied to human PPIs, and it shows a performance comparable to those of deep learning methods despite its conceptual and technical simplicity and efficiency. Tests performed on humans, yeast, and tissue (brain and liver)- and disease (neurodegenerative and metabolic disorders)-specific datasets further demonstrate the high capability to capture the hidden interactions. Notably, many "de novo predictions" made by symLMF are verified to exist in PPI databases other than those used for training/testing the method, indicating that the method could be of broad utility as a simple, yet efficient and accurate, tool applicable to PPI datasets.
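The general idea behind a symmetric logistic factorization can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' symLMF implementation: each node gets a single embedding, the interaction probability is modelled as sigmoid(u_i · u_j), and the embeddings are fitted by plain batch gradient ascent on the log-likelihood without regularization or sample weighting. All function names, hyperparameters and the four-node toy graph are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_symmetric_lmf(n_nodes, edges, dim=2, lr=0.1, epochs=500, seed=0):
    """Fit one embedding per node so that P(i~j) = sigmoid(u_i . u_j)."""
    rng = random.Random(seed)
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_nodes)]
    edge_set = {frozenset(e) for e in edges}
    for _ in range(epochs):
        # Accumulate the full-batch gradient of the log-likelihood.
        grad = [[0.0] * dim for _ in range(n_nodes)]
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                y = 1.0 if frozenset((i, j)) in edge_set else 0.0
                err = y - sigmoid(dot(emb[i], emb[j]))
                for k in range(dim):
                    grad[i][k] += err * emb[j][k]
                    grad[j][k] += err * emb[i][k]
        # Gradient ascent step applied to all embeddings at once.
        for i in range(n_nodes):
            for k in range(dim):
                emb[i][k] += lr * grad[i][k]
    return emb


def score(emb, i, j):
    """Predicted interaction probability; symmetric in i and j."""
    return sigmoid(dot(emb[i], emb[j]))


# Toy graph: nodes 0-1 interact, nodes 2-3 interact, all other pairs do not.
edges = [(0, 1), (2, 3)]
emb = fit_symmetric_lmf(4, edges)
```

Because both sides of each pair share one embedding table, the predicted score is symmetric by construction, which suits undirected interactions such as PPIs.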
Affiliation(s)
- Qingya Shi
- School of Medicine, Tsinghua University, Beijing 100084, China
|
20
|
Long Y, Wu M, Liu Y, Zheng J, Kwoh CK, Luo J, Li X. Graph contextualized attention network for predicting synthetic lethality in human cancers. Bioinformatics 2021; 37:2432-2440. [PMID: 33609108 DOI: 10.1093/bioinformatics/btab110] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/09/2020] [Revised: 02/09/2021] [Accepted: 02/16/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION Synthetic Lethality (SL) plays an increasingly critical role in the targeted anticancer therapeutics. In addition, identifying SL interactions can create opportunities to selectively kill cancer cells without harming normal cells. Given the high cost of wet-lab experiments, in silico prediction of SL interactions as an alternative can be a rapid and cost-effective way to guide the experimental screening of candidate SL pairs. Several matrix factorization-based methods have recently been proposed for human SL prediction. However, they are limited in capturing the dependencies of neighbors. In addition, it is also highly challenging to make accurate predictions for new genes without any known SL partners. RESULTS In this work, we propose a novel graph contextualized attention network named GCATSL to learn gene representations for SL prediction. First, we leverage different data sources to construct multiple feature graphs for genes, which serve as the feature inputs for our GCATSL method. Second, for each feature graph, we design node-level attention mechanism to effectively capture the importance of local and global neighbors and learn local and global representations for the nodes, respectively. We further exploit multi-layer perceptron (MLP) to aggregate the original features with the local and global representations and then derive the feature-specific representations. Third, to derive the final representations, we design feature-level attention to integrate feature-specific representations by taking the importance of different feature graphs into account. Extensive experimental results on three datasets under different settings demonstrated that our GCATSL model outperforms 14 state-of-the-art methods consistently. In addition, case studies further validated the effectiveness of our proposed model in identifying novel SL pairs. 
AVAILABILITY Python codes and dataset are freely available on GitHub (https://github.com/longyahui/GCATSL) and Zenodo (https://zenodo.org/record/4522679) under the MIT license.
Affiliation(s)
- Yahui Long
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China; School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore
- Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), 138632, Singapore
- Yong Liu
- Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, 639798, Singapore
- Jie Zheng
- School of Information Science and Technology, ShanghaiTech University, Shanghai, 201210, China
- Chee Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China
- Xiaoli Li
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), 138632, Singapore
|
21
|
DNA polymerase ι compensates for Fanconi anemia pathway deficiency by countering DNA replication stress. Proc Natl Acad Sci U S A 2020; 117:33436-33445. [PMID: 33376220 DOI: 10.1073/pnas.2008821117] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
Fanconi anemia (FA) is caused by defects in cellular responses to DNA crosslinking damage and replication stress. Given the constant occurrence of endogenous DNA damage and replication fork stress, it is unclear why complete deletion of FA genes does not have a major impact on cell proliferation and germ-line FA patients are able to progress through development well into their adulthood. To identify potential cellular mechanisms that compensate for the FA deficiency, we performed dropout screens in FA mutant cells with a whole genome guide RNA library. This uncovered a comprehensive genome-wide profile of FA pathway synthetic lethality, including POLI and CDK4. As little is known of the cellular function of DNA polymerase iota (Pol ι), we focused on its role in the loss-of-function FA knockout mutants. Loss of both FA pathway function and Pol ι leads to synthetic defects in cell proliferation and cell survival, and an increase in DNA damage accumulation. Furthermore, FA-deficient cells depend on the function of Pol ι to resume replication upon replication fork stalling. Our results reveal a critical role for Pol ι in DNA repair and replication fork restart and suggest Pol ι as a target for therapeutic intervention in malignancies carrying an FA gene mutation.
|
22
|
G2G: A web-server for the prediction of human synthetic lethal interactions. Comput Struct Biotechnol J 2020; 18:1028-1031. [PMID: 32419903 PMCID: PMC7215103 DOI: 10.1016/j.csbj.2020.04.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/20/2020] [Revised: 04/18/2020] [Accepted: 04/19/2020] [Indexed: 12/04/2022]
Abstract
Genetic interactions (GIs) are fundamental to our understanding of biological processes in the cell. While GIs have been systematically mapped in yeast, there is scarce information about them in humans. Recently, we have suggested a state-of-the-art hierarchical method that leverages gene ontology information for predicting GIs in yeast. Here, we adapt this method and apply it for the first time to predict GIs in human. We introduce a web service called G2G for this task that is available at http://bnet.cs.tau.ac.il/g2g/.
|
23
|
Wan F, Li S, Tian T, Lei Y, Zhao D, Zeng J. EXP2SL: A Machine Learning Framework for Cell-Line-Specific Synthetic Lethality Prediction. Front Pharmacol 2020; 11:112. [PMID: 32184722 PMCID: PMC7058988 DOI: 10.3389/fphar.2020.00112] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/27/2019] [Accepted: 01/28/2020] [Indexed: 12/13/2022]
Abstract
Synthetic lethality (SL), an important type of genetic interaction, can provide useful insight into the target identification process for the development of anticancer therapeutics. Although several well-established SL gene pairs have been verified to be conserved in humans, most SL interactions remain cell-line specific. Here, we demonstrated that the cell-line-specific gene expression profiles derived from the shRNA perturbation experiments performed in the LINCS L1000 project can provide useful features for predicting SL interactions in human. In this paper, we developed a semi-supervised neural network-based method called EXP2SL to accurately identify SL interactions from the L1000 gene expression profiles. Through a systematic evaluation on the SL datasets of three different cell lines, we demonstrated that our model achieved better performance than the baseline methods and verified the effectiveness of using the L1000 gene expression features and the semi-supervised training technique in SL prediction.
Affiliation(s)
- Fangping Wan
- Institute of Interdisciplinary Information Science, Tsinghua University, Beijing, China
- Shuya Li
- Institute of Interdisciplinary Information Science, Tsinghua University, Beijing, China
- Tingzhong Tian
- Institute of Interdisciplinary Information Science, Tsinghua University, Beijing, China
- Yipin Lei
- Machine Learning Department, Silexon AI Technology Co. Ltd., Nanjing, China
- Dan Zhao
- Institute of Interdisciplinary Information Science, Tsinghua University, Beijing, China
- Jianyang Zeng
- Institute of Interdisciplinary Information Science, Tsinghua University, Beijing, China
|
24
|
Huang J, Wu M, Lu F, Ou-Yang L, Zhu Z. Predicting synthetic lethal interactions in human cancers using graph regularized self-representative matrix factorization. BMC Bioinformatics 2019; 20:657. [PMID: 31870274 PMCID: PMC6929405 DOI: 10.1186/s12859-019-3197-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 11/03/2019] [Accepted: 11/05/2019] [Indexed: 12/20/2022]
Abstract
BACKGROUND Synthetic lethality has attracted a lot of attentions in cancer therapeutics due to its utility in identifying new anticancer drug targets. Identifying synthetic lethal (SL) interactions is the key step towards the exploration of synthetic lethality in cancer treatment. However, biological experiments are faced with many challenges when identifying synthetic lethal interactions. Thus, it is necessary to develop computational methods which could serve as useful complements to biological experiments. RESULTS In this paper, we propose a novel graph regularized self-representative matrix factorization (GRSMF) algorithm for synthetic lethal interaction prediction. GRSMF first learns the self-representations from the known SL interactions and further integrates the functional similarities among genes derived from Gene Ontology (GO). It can then effectively predict potential SL interactions by leveraging the information provided by known SL interactions and functional annotations of genes. Extensive experiments on the synthetic lethal interaction data downloaded from SynLethDB database demonstrate the superiority of our GRSMF in predicting potential synthetic lethal interactions, compared with other competing methods. Moreover, case studies of novel interactions are conducted in this paper for further evaluating the effectiveness of GRSMF in synthetic lethal interaction prediction. CONCLUSIONS In this paper, we demonstrate that by adaptively exploiting the self-representation of original SL interaction data, and utilizing functional similarities among genes to enhance the learning of self-representation matrix, our GRSMF could predict potential SL interactions more accurately than other state-of-the-art SL interaction prediction methods.
Affiliation(s)
- Jiang Huang
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China
- Min Wu
- Institute for Infocomm Research (I2R), A*STAR, 1 Fusionopolis Way, Singapore, Singapore
- Fan Lu
- Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China
- Zexuan Zhu
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China
|