1. Feng Y, Long Y, Wang H, Ouyang Y, Li Q, Wu M, Zheng J. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat Commun 2024; 15:9058. [PMID: 39428397 PMCID: PMC11491473 DOI: 10.1038/s41467-024-52900-7] [Received: 11/27/2023] [Accepted: 09/23/2024] [Indexed: 10/22/2024]
Abstract
Synthetic lethality (SL) is a gold mine of anticancer drug targets, exposing cancer-specific dependencies of cellular survival. To complement resource-intensive experimental screening, many machine learning methods for SL prediction have emerged recently. However, a comprehensive benchmarking is lacking. This study systematically benchmarks 12 recent machine learning methods for SL prediction, assessing their performance across diverse data splitting scenarios, negative sample ratios, and negative sampling techniques, on both classification and ranking tasks. We observe that all the methods can perform significantly better by improving data quality, e.g., excluding computationally derived SLs from training and sampling negative labels based on gene expression. Among the methods, SLMGAE performs the best. Furthermore, the methods have limitations in realistic scenarios such as cold-start independent tests and context-specific SLs. These results, together with source code and datasets made freely available, provide guidance for selecting suitable methods and developing more powerful techniques for SL virtual screening.
Affiliation(s)
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
  - Lingang Laboratory, Shanghai, China
- Yahui Long
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- He Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Yang Ouyang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Quan Li
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China
2. Fan K, Gökbağ B, Tang S, Li S, Huang Y, Wang L, Cheng L, Li L. Synthetic lethal connectivity and graph transformer improve synthetic lethality prediction. Brief Bioinform 2024; 25:bbae425. [PMID: 39210507 PMCID: PMC11361842 DOI: 10.1093/bib/bbae425] [Received: 04/29/2024] [Revised: 06/14/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024]
Abstract
Synthetic lethality (SL) has shown great promise for the discovery of novel targets in cancer. CRISPR double-knockout (CDKO) technologies can only screen several hundred genes and their combinations, but not genome-wide. Therefore, good SL prediction models are highly needed for genes and gene pairs selection in CDKO experiments. However, lack of scalable SL properties prevents generalizability of SL interactions to out-of-sample data, thereby hindering modeling efforts. In this paper, we recognize that SL connectivity is a scalable and generalizable SL property. We develop a novel two-step multilayer encoder for individual sample-specific SL prediction model (MLEC-iSL), which predicts SL connectivity first and SL interactions subsequently. MLEC-iSL has three encoders, namely, gene, graph, and transformer encoders. MLEC-iSL achieves high SL prediction performance in K562 (AUPR, 0.73; AUC, 0.72) and Jurkat (AUPR, 0.73; AUC, 0.71) cells, while no existing methods exceed 0.62 AUPR and AUC. The prediction performance of MLEC-iSL is validated in a CDKO experiment in 22Rv1 cells, yielding a 46.8% SL rate among 987 selected gene pairs. The screen also reveals SL dependency between apoptosis and mitosis cell death pathways.
Affiliation(s)
- Kunjie Fan
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Shan Tang
  - Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12th Ave, Columbus, OH 43210, United States
- Shangjia Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Yirui Huang
  - Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12th Ave, Columbus, OH 43210, United States
- Lingling Wang
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
  - Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12th Ave, Columbus, OH 43210, United States
3. Gogoshin G, Rodin AS. Graph Neural Networks in Cancer and Oncology Research: Emerging and Future Trends. Cancers (Basel) 2023; 15:5858. [PMID: 38136405 PMCID: PMC10742144 DOI: 10.3390/cancers15245858] [Received: 10/23/2023] [Revised: 12/09/2023] [Accepted: 12/14/2023] [Indexed: 12/24/2023]
Abstract
Next-generation cancer and oncology research needs to take full advantage of the multimodal structured, or graph, information, with the graph data types ranging from molecular structures to spatially resolved imaging and digital pathology, biological networks, and knowledge graphs. Graph Neural Networks (GNNs) efficiently combine the graph structure representations with the high predictive performance of deep learning, especially on large multimodal datasets. In this review article, we survey the landscape of recent (2020-present) GNN applications in the context of cancer and oncology research, and delineate six currently predominant research areas. We then identify the most promising directions for future research. We compare GNNs with graphical models and "non-structured" deep learning, and devise guidelines for cancer and oncology researchers or physician-scientists, asking the question of whether they should adopt the GNN methodology in their research pipelines.
Affiliation(s)
- Grigoriy Gogoshin
  - Department of Computational and Quantitative Medicine, Beckman Research Institute, and Diabetes and Metabolism Research Institute, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010, USA
- Andrei S. Rodin
  - Department of Computational and Quantitative Medicine, Beckman Research Institute, and Diabetes and Metabolism Research Institute, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010, USA
4. Pu M, Cheng K, Li X, Xin Y, Wei L, Jin S, Zheng W, Peng G, Tang Q, Zhou J, Zhang Y. Using graph-based model to identify cell specific synthetic lethal effects. Comput Struct Biotechnol J 2023; 21:5099-5110. [PMID: 37920819 PMCID: PMC10618116 DOI: 10.1016/j.csbj.2023.10.011] [Received: 07/05/2023] [Revised: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023]
Abstract
Synthetic lethal (SL) pairs are pairs of genes whose simultaneous loss-of-function results in cell death, while a damaging mutation of either gene alone does not affect the cell's survival. This makes SL pairs attractive targets for precision cancer therapies, as targeting the unimpaired gene of the SL pair can selectively kill cancer cells that already harbor the impaired gene. Limited by the difficulty of finding true SL pairs, especially on specific cell types, current computational approaches provide only limited insights because of overlooking the crucial aspects of cellular context dependency and mechanistic understanding of SL pairs. As a result, the identification of SL targets still relies on expensive, time-consuming experimental approaches. In this work, we applied cell-line specific multi-omics data to a specially designed deep learning model to predict cell-line specific SL pairs. Through incorporating multiple types of cell-specific omics data with a self-attention module, we represent gene relationships as graphs. Our approach achieves the prediction of SL pairs in a cell-specific manner and demonstrates the potential to facilitate the discovery of cell-specific SL targets for cancer therapeutics, providing a tool to unearth mechanisms underlying the origin of SL in cancer biology. The code and data of our approach can be found at https://github.com/promethiume/SLwise.
Affiliation(s)
- Kaiyang Cheng
  - StoneWise, AI, Ltd., Beijing, China
  - Nanjing University of Chinese Medicine, Shanghai, China
- Xiaorong Li
  - StoneWise, AI, Ltd., Beijing, China
  - Minzu University of China, Beijing, China
- Sutong Jin
  - StoneWise, AI, Ltd., Beijing, China
  - Harbin Institute of Technology, Weihai, China
- Qihong Tang
  - StoneWise, AI, Ltd., Beijing, China
  - Guilin University of Electronic Science and Technology, Guangxi, China