1
Ye BJ, Li DF, Li XY, Hao JL, Liu DJ, Yu H, Zhang CD. Methylation synthetic lethality: Exploiting selective drug targets for cancer therapy. Cancer Lett 2024; 597:217010. [PMID: 38849016 DOI: 10.1016/j.canlet.2024.217010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Revised: 05/26/2024] [Accepted: 05/30/2024] [Indexed: 06/09/2024]
Abstract
In cancer, synthetic lethality refers to the combination of one inactivated gene and drug-induced inhibition of another, which together kill cancer cells; this effect is absent in normal cells, so only cancer cells are targeted. Recent intensive epigenetic research has revealed that aberrant epigenetic changes are more frequently observed than gene mutations in certain cancers. Numerous studies have reported methylation synthetic lethal combinations involving DNA damage repair genes, metabolic pathway genes, and paralogs with significant results in cellular models, some of which have already entered clinical trials with promising results. This review systematically introduces the advantages of methylation synthetic lethality and describes the lethal mechanisms of methylation synthetic lethal combinations that have recently demonstrated success in cellular models. Furthermore, we discuss the future opportunities and challenges of methylation synthetic lethality in targeted anticancer therapies.
Affiliation(s)
- Bing-Jie Ye
  - Clinical Medicine, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
- Di-Fei Li
  - Clinical Medicine, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
- Xin-Yun Li
  - Clinical Medicine, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
- Jia-Lin Hao
  - Central Laboratory, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
- Di-Jie Liu
  - Central Laboratory, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
- Hang Yu
  - Department of Surgical Oncology, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
- Chun-Dong Zhang
  - Central Laboratory, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
  - Department of Surgical Oncology, The Fourth Affiliated Hospital of China Medical University, Shenyang 110032, China
2
Wu G, Zaker A, Ebrahimi A, Tripathi S, Mer AS. Text-mining-based feature selection for anticancer drug response prediction. Bioinformatics Advances 2024; 4:vbae047. [PMID: 38606185 PMCID: PMC11009020 DOI: 10.1093/bioadv/vbae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 03/09/2024] [Accepted: 03/22/2024] [Indexed: 04/13/2024]
Abstract
Motivation: Predicting anticancer treatment response from baseline genomic data is a critical obstacle in personalized medicine. Machine learning methods are commonly used for predicting drug response from gene expression data. In the process of constructing these machine learning models, one of the most significant challenges is identifying appropriate features among a massive number of genes. Results: In this study, we utilize features (genes) extracted by text-mining the scientific literature. Using two independent cancer pharmacogenomic datasets, we demonstrate that text-mining-based features outperform traditional feature selection techniques in machine learning tasks. In addition, our analysis reveals that text-mining feature-based machine learning models trained on in vitro data also perform well when predicting the response of in vivo cancer models. Our results demonstrate that text-mining-based feature selection is an easy-to-implement approach that is suitable for building machine learning models for anticancer drug response prediction. Availability and implementation: https://github.com/merlab/text_features.
Affiliation(s)
- Grace Wu
  - Division of Engineering Science, University of Toronto, Toronto, M5S2E4, Canada
- Arvin Zaker
  - Department of Biochemistry, Microbiology & Immunology, University of Ottawa, Ottawa, K1H8M5, Canada
  - Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, K1H8M5, Canada
- Amirhosein Ebrahimi
  - Department of Biochemistry, Microbiology & Immunology, University of Ottawa, Ottawa, K1H8M5, Canada
- Shivanshi Tripathi
  - Department of Biochemistry, Microbiology & Immunology, University of Ottawa, Ottawa, K1H8M5, Canada
  - Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, K1H8M5, Canada
- Arvind Singh Mer
  - Department of Biochemistry, Microbiology & Immunology, University of Ottawa, Ottawa, K1H8M5, Canada
  - Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, K1H8M5, Canada
  - School of Electrical Engineering & Computer Science, University of Ottawa, Ottawa, K1N6N5, Canada
3
Tepeli YI, Seale C, Gonçalves JP. ELISL: early-late integrated synthetic lethality prediction in cancer. Bioinformatics (Oxford, England) 2024; 40:btad764. [PMID: 38113447 DOI: 10.1093/bioinformatics/btad764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 11/06/2023] [Accepted: 12/18/2023] [Indexed: 12/21/2023]
Abstract
MOTIVATION: Anti-cancer therapies based on synthetic lethality (SL) exploit tumour vulnerabilities for treatment with reduced side effects, by targeting a gene that is jointly essential with another whose function is lost. Computational prediction is key to expedite SL screening, yet existing methods are vulnerable to prevalent selection bias in SL data and reliant on cancer or tissue type-specific omics, which can be scarce. Notably, sequence similarity remains underexplored as a proxy for related gene function and joint essentiality. RESULTS: We propose ELISL, Early-Late Integrated SL prediction with forest ensembles, using context-free protein sequence embeddings and context-specific omics from cell lines and tissue. Across eight cancer types, ELISL showed superior robustness to selection bias and recovery of known SL genes, as well as promising cross-cancer predictions. Co-occurring mutations in a BRCA gene and ELISL-predicted pairs from the HH, FGF, WNT, or NEIL gene families were associated with longer patient survival times, revealing therapeutic potential. AVAILABILITY AND IMPLEMENTATION: Data: 10.6084/m9.figshare.23607558 & Code: github.com/joanagoncalveslab/ELISL.
Affiliation(s)
- Yasin I Tepeli
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
- Colm Seale
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
  - Holland Proton Therapy Center (HollandPTC), Delft, The Netherlands
- Joana P Gonçalves
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
4
Cai C, Radhakrishnan A, Uhler C. Synthetic Lethality Screening with Recursive Feature Machines. bioRxiv: The Preprint Server for Biology 2023:2023.12.03.569803. [PMID: 38106093 PMCID: PMC10723282 DOI: 10.1101/2023.12.03.569803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Synthetic lethality refers to a genetic interaction where the simultaneous perturbation of gene pairs leads to cell death. Synthetically lethal gene pairs (SL pairs) provide a potential avenue for selectively targeting cancer cells based on genetic vulnerabilities. The rise of large-scale gene perturbation screens such as the Cancer Dependency Map (DepMap) offers the opportunity to identify SL pairs automatically using machine learning. We build on a recently developed class of feature learning kernel machines known as Recursive Feature Machines (RFMs) to develop a pipeline for identifying SL pairs based on CRISPR viability data from DepMap. In particular, we first train RFMs to predict viability scores for a given CRISPR gene knockout from cell line embeddings consisting of gene expression and mutation features. After training, RFMs use a statistical operator known as the average gradient outer product to provide a weight for each feature indicating its importance in predicting cellular viability. We subsequently apply correlation-based filters to re-weight RFM feature importances and identify those features that are most indicative of low cellular viability. Our resulting pipeline is computationally efficient, taking under 3 minutes to analyze all 17,453 knockouts from DepMap for candidate SL pairs. We show that our pipeline more accurately recovers experimentally verified SL pairs than prior approaches. Moreover, our pipeline finds new candidate SL pairs, thereby opening novel avenues for identifying genetic vulnerabilities in cancer.
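As an illustration of the average gradient outer product mentioned in this abstract, the following is a minimal sketch and not the authors' pipeline. It assumes the standard definition, the mean over samples of the outer product of the predictor's gradient with itself, and reads feature importances off the diagonal. The predictor `toy_viability`, the sample values, and the finite-difference gradient are hypothetical stand-ins; a real RFM would be a trained kernel machine.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at point x."""
    grad = []
    for j in range(len(x)):
        up = list(x)
        down = list(x)
        up[j] += eps
        down[j] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def agop(f, samples):
    """Average gradient outer product: mean of grad(x) grad(x)^T over samples."""
    d = len(samples[0])
    matrix = [[0.0] * d for _ in range(d)]
    for x in samples:
        g = numerical_gradient(f, x)
        for a in range(d):
            for b in range(d):
                matrix[a][b] += g[a] * g[b] / len(samples)
    return matrix


def toy_viability(x):
    # Hypothetical predictor: strong dependence on feature 0,
    # weak dependence on feature 1, none on feature 2.
    return 3.0 * x[0] + 0.5 * x[1]


# Hypothetical cell-line feature vectors (illustrative values only).
samples = [[0.1, 0.2, 0.3], [1.0, -1.0, 0.5], [0.0, 2.0, -1.0]]

matrix = agop(toy_viability, samples)
# Diagonal entries serve as per-feature importance weights.
importance = [matrix[j][j] for j in range(len(matrix))]
# Feature indices ordered from most to least important.
ranked = sorted(range(len(importance)), key=lambda j: -importance[j])
```

In this toy case the ranking places feature 0 first and feature 2 last, mirroring how strongly the predictor depends on each input. The correlation-based re-weighting step described in the abstract is not reproduced here.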
Affiliation(s)
- Cathy Cai
  - Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard
  - Laboratory of Information and Decision Systems, Massachusetts Institute of Technology
- Adityanarayanan Radhakrishnan
  - Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard
  - School of Engineering and Applied Sciences, Harvard University
- Caroline Uhler
  - Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard
  - Laboratory of Information and Decision Systems, Massachusetts Institute of Technology
5
Wang J, Wen Y, Zhang Y, Wang Z, Jiang Y, Dai C, Wu L, Leng D, He S, Bo X. An interpretable artificial intelligence framework for designing synthetic lethality-based anti-cancer combination therapies. J Adv Res 2023:S2090-1232(23)00374-0. [PMID: 38043609 DOI: 10.1016/j.jare.2023.11.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 11/27/2023] [Accepted: 11/29/2023] [Indexed: 12/05/2023] Open
Abstract
INTRODUCTION: Synthetic lethality (SL) provides an opportunity to leverage different genetic interactions when designing synergistic combination therapies. To further explore SL-based combination therapies for cancer treatment, it is important to identify and mechanistically characterize more SL interactions. Artificial intelligence (AI) methods have recently been proposed for SL prediction, but the results of these models are often not interpretable such that deriving the underlying mechanism can be challenging. OBJECTIVES: This study aims to develop an interpretable AI framework for SL prediction and subsequently utilize it to design SL-based synergistic combination therapies. METHODS: We propose a knowledge and data dual-driven AI framework for SL prediction (KDDSL). Specifically, we use gene knowledge related to the SL mechanism to guide the construction of the model and develop a method to identify the most relevant gene knowledge for the predicted results. RESULTS: Experimental and literature-based validation confirmed a good balance between predictive and interpretable ability when using KDDSL. Moreover, we demonstrated that KDDSL could help to discover promising drug combinations and clarify associated biological processes, such as the combination of MDM2 and CDK9 inhibitors, which exhibited significant anti-cancer effects in vitro and in vivo. CONCLUSION: These data underscore the potential of KDDSL to guide SL-based combination therapy design. There is a need for biomedicine-focused AI strategies to combine rational biological knowledge with developed models.
Affiliation(s)
- Jing Wang
  - School of Medicine, Tsinghua University, Beijing, 100084, China
- Yuqi Wen
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Yixin Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Zhongming Wang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Yuyang Jiang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Chong Dai
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Lianlian Wu
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Dongjin Leng
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
6
Pu M, Cheng K, Li X, Xin Y, Wei L, Jin S, Zheng W, Peng G, Tang Q, Zhou J, Zhang Y. Using graph-based model to identify cell specific synthetic lethal effects. Comput Struct Biotechnol J 2023; 21:5099-5110. [PMID: 37920819 PMCID: PMC10618116 DOI: 10.1016/j.csbj.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023] Open
Abstract
Synthetic lethal (SL) pairs are pairs of genes whose simultaneous loss-of-function results in cell death, while a damaging mutation of either gene alone does not affect the cell's survival. This makes SL pairs attractive targets for precision cancer therapies, as targeting the unimpaired gene of the SL pair can selectively kill cancer cells that already harbor the impaired gene. Because true SL pairs are difficult to find, especially in specific cell types, current computational approaches provide only limited insights: they overlook the crucial aspects of cellular context dependency and mechanistic understanding of SL pairs. As a result, the identification of SL targets still relies on expensive, time-consuming experimental approaches. In this work, we applied cell-line specific multi-omics data to a specially designed deep learning model to predict cell-line specific SL pairs. By incorporating multiple types of cell-specific omics data with a self-attention module, we represent gene relationships as graphs. Our approach predicts SL pairs in a cell-specific manner and demonstrates the potential to facilitate the discovery of cell-specific SL targets for cancer therapeutics, providing a tool to unearth mechanisms underlying the origin of SL in cancer biology. The code and data of our approach can be found at https://github.com/promethiume/SLwise.
Affiliation(s)
- Kaiyang Cheng
  - StoneWise, AI, Ltd., Beijing, China
  - Nanjing University of Chinese Medicine, Shanghai, China
- Xiaorong Li
  - StoneWise, AI, Ltd., Beijing, China
  - Minzu University of China, Beijing, China
- Sutong Jin
  - StoneWise, AI, Ltd., Beijing, China
  - Harbin Institute of Technology, Weihai, China
- Qihong Tang
  - StoneWise, AI, Ltd., Beijing, China
  - Guilin University of Electronic Science and Technology, Guangxi, China
7
Tang S, Gökbağ B, Fan K, Shao S, Huo Y, Wu X, Cheng L, Li L. Synthetic lethal gene pairs: Experimental approaches and predictive models. Front Genet 2022; 13:961611. [PMID: 36531238 PMCID: PMC9751344 DOI: 10.3389/fgene.2022.961611] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/05/2022] [Accepted: 11/07/2022] [Indexed: 03/27/2024] Open
Abstract
Synthetic lethality (SL) refers to a genetic interaction in which the simultaneous perturbation of two genes leads to cell or organism death, whereas viability is maintained when only one of the pair is altered. The experimental exploration of these pairs and predictive modeling in computational biology contribute to our understanding of cancer biology and the development of cancer therapies. We extensively reviewed experimental technologies, public data sources, and predictive models in the study of synthetic lethal gene pairs and herein detail biological assumptions, experimental data, statistical models, and computational schemes of various predictive models, speculate regarding their influence on individual sample- and population-based synthetic lethal interactions, discuss the pros and cons of existing SL data and models, and highlight potential research directions in SL discovery.
Affiliation(s)
- Shan Tang
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Kunjie Fan
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Shuai Shao
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Yang Huo
  - Indiana University, Bloomington, IN, United States
- Xue Wu
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
8
Wang S, Feng Y, Liu X, Liu Y, Wu M, Zheng J. NSF4SL: negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers. Bioinformatics 2022; 38:ii13-ii19. [PMID: 36124790 DOI: 10.1093/bioinformatics/btac462] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Detecting synthetic lethality (SL) is a promising strategy for identifying anti-cancer drug targets. Targeting SL partners of a primary gene mutated in cancer is selectively lethal to cancer cells. Owing to the high cost of wet-lab experiments and the availability of gold-standard SL data, supervised machine learning for SL prediction has been popular. However, most of the methods are based on binary classification and thus limited by the lack of reliable negative data. Contrastive learning can train models without any negative samples and is thus promising for finding novel SLs. RESULTS: We propose NSF4SL, a negative-sample-free SL prediction model based on a contrastive learning framework. It captures the characteristics of positive SL samples by using two branches of neural networks that interact with each other to learn SL-related gene representations. Moreover, a feature-wise data augmentation strategy is used to mitigate the sparsity of SL data. NSF4SL significantly outperforms all baselines that require negative samples, even in challenging experimental settings. To the best of our knowledge, this is the first time that SL prediction is formulated as a gene ranking problem, which is more practical than the current formulation as binary classification. NSF4SL is the first contrastive learning method for SL prediction and its success points to a new direction of machine-learning methods for identifying novel SLs. AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/JieZheng-ShanghaiTech/NSF4SL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shike Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yong Liu
  - Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore 639798, Singapore
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
9
Liu X, Yu J, Tao S, Yang B, Wang S, Wang L, Bai F, Zheng J. PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2022; 38:ii106-ii112. [PMID: 36124788 DOI: 10.1093/bioinformatics/btac476] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Synthetic lethality (SL) is a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect the cell viability. It can effectively expand the range of anti-cancer therapeutic targets. SL interactions are identified mainly by experimental screening and computational prediction. Recent machine-learning methods mostly learn the representation of each gene individually, ignoring the representation of the pairwise interaction between two genes. In addition, the mechanisms of SL, the key to translating SL into cancer therapeutics, are often unclear. RESULTS: To fill the gaps, we propose a pairwise interaction learning-based graph neural network (GNN) named PiLSL to learn the representation of pairwise interaction between two genes for SL prediction. First, we construct an enclosing graph for each pair of genes from a knowledge graph. Secondly, we design an attentive embedding propagation layer in a GNN to discriminate the importance among the edges in the enclosing graph and to learn the latent features of the pairwise interaction from the weighted enclosing graph. Finally, we further fuse the latent features with explicit features extracted from multi-omics data to obtain powerful gene representations for SL prediction. Extensive experimental results demonstrate that PiLSL outperforms the best baseline by a large margin and generalizes well under three realistic scenarios. Besides, PiLSL provides an explanation of SL mechanisms via the weighted paths in the enclosing graphs by attention mechanism. AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/JieZheng-ShanghaiTech/PiLSL.
Affiliation(s)
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jiale Yu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Siyu Tao
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Beiyuan Yang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Shike Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Lin Wang
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai 201210, China
- Fang Bai
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai 201210, China
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
10
Seale C, Tepeli Y, Gonçalves JP. Overcoming selection bias in synthetic lethality prediction. Bioinformatics 2022; 38:4360-4368. [PMID: 35876858 PMCID: PMC9477536 DOI: 10.1093/bioinformatics/btac523] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/19/2021] [Revised: 07/13/2022] [Accepted: 07/22/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data. RESULTS: We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples. AVAILABILITY AND IMPLEMENTATION: https://github.com/joanagoncalveslab/sbsl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Colm Seale
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
  - Holland Proton Therapy Center (HollandPTC), Delft 2600 AC, The Netherlands
- Yasin Tepeli
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
- Joana P Gonçalves
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
11
Wang J, Zhang Q, Han J, Zhao Y, Zhao C, Yan B, Dai C, Wu L, Wen Y, Zhang Y, Leng D, Wang Z, Yang X, He S, Bo X. Computational methods, databases and tools for synthetic lethality prediction. Brief Bioinform 2022; 23:6555403. [PMID: 35352098 PMCID: PMC9116379 DOI: 10.1093/bib/bbac106] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/09/2021] [Revised: 02/15/2022] [Accepted: 03/02/2022] [Indexed: 12/17/2022] Open
Abstract
Synthetic lethality (SL) occurs between two genes when the inactivation of either gene alone has no effect on cell survival but the inactivation of both genes results in cell death. SL-based therapy has become one of the most promising targeted cancer therapies in the last decade, as PARP inhibitors have achieved great success in the clinic. The key to exploiting SL-based cancer therapy is the identification of robust SL pairs. Although many wet-lab-based methods have been developed to screen SL pairs, known SL pairs account for less than 0.1% of all potential pairs owing to the large number of human gene combinations. Computational prediction methods complement wet-lab-based methods to effectively reduce the search space of SL pairs. In this paper, we review the recent applications of computational methods and commonly used databases for SL prediction. First, we introduce the concept of SL and its screening methods. Second, various SL-related data resources are summarized. Then, computational methods including statistical-based methods, network-based methods, classical machine learning methods and deep learning methods for SL prediction are summarized. In particular, we elaborate on the negative sampling methods applied in these models. Next, representative tools for SL prediction are introduced. Finally, the challenges and future work for SL prediction are discussed.
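To give a sense of the search space behind the 0.1% figure in this abstract, the short sketch below counts unordered gene pairs. The gene count of 20,000 is an assumption made here for illustration (a commonly quoted round number for human protein-coding genes); it does not come from the abstract, and the resulting numbers are only an order-of-magnitude guide.

```python
from math import comb

# Assumption for illustration only: a round figure for the number of human genes.
ASSUMED_GENE_COUNT = 20_000


def unordered_pairs(n_genes):
    """Number of distinct unordered gene pairs, n * (n - 1) / 2."""
    return comb(n_genes, 2)


total_pairs = unordered_pairs(ASSUMED_GENE_COUNT)

# 0.1% of the candidate pairs, the upper bound on known SL pairs
# implied by the abstract under the assumed gene count.
known_ceiling = total_pairs // 1000

print(f"candidate pairs: {total_pairs:,}")
print(f"0.1% of candidate pairs: {known_ceiling:,}")
```

Under this assumption there are roughly 200 million candidate pairs, so exhaustive wet-lab screening of every pair is impractical, which is the motivation the abstract gives for computational prediction.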
Affiliation(s)
- Jing Wang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Qinglong Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Junshan Han
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yanpeng Zhao
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Caiyun Zhao
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Bowei Yan
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Chong Dai
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Lianlian Wu
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yuqi Wen
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yixin Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Dongjin Leng
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Zhongming Wang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaoxi Yang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
12
Caudai C, Galizia A, Geraci F, Le Pera L, Morea V, Salerno E, Via A, Colombo T. AI applications in functional genomics. Comput Struct Biotechnol J 2021; 19:5762-5790. [PMID: 34765093 PMCID: PMC8566780 DOI: 10.1016/j.csbj.2021.10.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/16/2021] [Revised: 10/05/2021] [Accepted: 10/05/2021] [Indexed: 12/13/2022] Open
Abstract
We review the current applications of artificial intelligence (AI) in functional genomics. The recent explosion of AI follows the remarkable achievements made possible by "deep learning", along with a burst of "big data" that can meet its hunger. Biology is about to overtake astronomy as the paradigmatic producer of big data. This has been made possible by huge advancements in the field of high-throughput technologies, applied to determine how the individual components of a biological system work together to accomplish different processes. The disciplines contributing to this bulk of data are collectively known as functional genomics. They comprise studies of: i) the information contained in the DNA (genomics); ii) the modifications that DNA can reversibly undergo (epigenomics); iii) the RNA transcripts originated by a genome (transcriptomics); iv) the ensemble of chemical modifications decorating different types of RNA transcripts (epitranscriptomics); v) the products of protein-coding transcripts (proteomics); and vi) the small molecules produced from cell metabolism (metabolomics) present in an organism or system at a given time, in physiological or pathological conditions. After reviewing the main applications of AI in functional genomics, we discuss important accompanying issues, including ethical, legal and economic issues and the importance of explainability.
Affiliation(s)
- Claudia Caudai
- CNR, Institute of Information Science and Technologies “A. Faedo” (ISTI), Pisa, Italy
| | - Antonella Galizia
- CNR, Institute of Applied Mathematics and Information Technologies (IMATI), Genoa, Italy
| | - Filippo Geraci
- CNR, Institute for Informatics and Telematics (IIT), Pisa, Italy
| | - Loredana Le Pera
- CNR, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Bari, Italy
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
| | - Veronica Morea
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
| | - Emanuele Salerno
- CNR, Institute of Information Science and Technologies “A. Faedo” (ISTI), Pisa, Italy
| | - Allegra Via
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
| | - Teresa Colombo
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
|
13
|
Wang S, Xu F, Li Y, Wang J, Zhang K, Liu Y, Wu M, Zheng J. KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2021; 37:i418-i425. [PMID: 34252965 PMCID: PMC8336442 DOI: 10.1093/bioinformatics/btab271] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 11/14/2022] Open
Abstract
Motivation: Synthetic lethality (SL) is a promising gold mine for the discovery of anti-cancer drug targets. Wet-lab screening of SL pairs is afflicted with high cost, batch-effect, and off-target problems. Current computational methods for SL prediction include gene knock-out simulation, knowledge-based data mining and machine learning methods. Most of the existing methods tend to assume that SL pairs are independent of each other, without taking into account the shared biological mechanisms underlying the SL pairs. Although several methods have incorporated genomic and proteomic data to aid SL prediction, these methods involve manual feature engineering that heavily relies on domain knowledge. Results: Here, we propose a novel graph neural network (GNN)-based model, named KG4SL, by incorporating knowledge graph (KG) message-passing into SL prediction. The KG was constructed using 11 kinds of entities including genes, compounds, diseases, biological processes and 24 kinds of relationships that could be pertinent to SL. The integration of KG can help harness the independence issue and circumvent manual feature engineering by conducting message-passing on the KG. Our model outperformed all the state-of-the-art baselines in area under the curve, area under precision-recall curve and F1. Extensive experiments, including the comparison of our model with an unsupervised TransE model, a vanilla graph convolutional network model, and their combination, demonstrated the significant impact of incorporating KG into GNN for SL prediction. Availability and implementation: KG4SL is freely available at https://github.com/JieZheng-ShanghaiTech/KG4SL. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shike Wang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Fan Xu
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Yunyang Li
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Jie Wang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Ke Zhang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
| | - Yong Liu
- Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore 639798, Singapore
| | - Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
| | - Jie Zheng
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
|