1
Zhang S, Li P, Wang S, Zhu J, Huang Z, Cai F, Freidel S, Ling F, Schwarz E, Chen J. BioM2: biologically informed multi-stage machine learning for phenotype prediction using omics data. Brief Bioinform 2024; 25:bbae384. [PMID: 39126426 PMCID: PMC11316398 DOI: 10.1093/bib/bbae384] [Received: 03/06/2024] [Revised: 06/15/2024] [Accepted: 07/24/2024] [Indexed: 08/12/2024] Open access
Abstract
Navigating the complex landscape of high-dimensional omics data with machine learning models presents a significant challenge. The integration of biological domain knowledge into these models has shown promise in creating more meaningful stratifications of predictor variables, leading to algorithms that are both more accurate and generalizable. However, the wider availability of machine learning tools capable of incorporating such biological knowledge remains limited. Addressing this gap, we introduce BioM2, a novel R package designed for biologically informed multistage machine learning. BioM2 uniquely leverages biological information to effectively stratify and aggregate high-dimensional biological data in the context of machine learning. Demonstrating its utility with genome-wide DNA methylation and transcriptome-wide gene expression data, BioM2 has been shown to enhance predictive performance, surpassing traditional machine learning models that operate without the integration of biological knowledge. A key feature of BioM2 is its ability to rank predictor variables within biological categories, specifically Gene Ontology pathways. This functionality not only aids in the interpretability of the results but also enables a subsequent modular network analysis of these variables, shedding light on the intricate systems-level biology underpinning the predictive outcome. We have proposed a biologically informed multistage machine learning framework termed BioM2 for phenotype prediction based on omics data. BioM2 has been incorporated into the BioM2 CRAN package (https://cran.r-project.org/web/packages/BioM2/index.html).
Affiliation(s)
- Shunjie Zhang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Pan Li
  - Center for Intelligent Medicine, Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, No. 6, 2nd Nanjiang Road, Nansha District, 511462 Guangzhou, China
- Shenghan Wang
  - Center for Intelligent Medicine, Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, No. 6, 2nd Nanjiang Road, Nansha District, 511462 Guangzhou, China
- Jijun Zhu
  - Center for Intelligent Medicine, Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, No. 6, 2nd Nanjiang Road, Nansha District, 511462 Guangzhou, China
- Zhongting Huang
  - Center for Intelligent Medicine, Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, No. 6, 2nd Nanjiang Road, Nansha District, 511462 Guangzhou, China
- Fuqiang Cai
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Sebastian Freidel
  - Hector Institute for Artificial Intelligence in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, M7, Mannheim 68161, Germany
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, J5, Mannheim 68159, Germany
- Fei Ling
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Emanuel Schwarz
  - Hector Institute for Artificial Intelligence in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, M7, Mannheim 68161, Germany
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, J5, Mannheim 68159, Germany
- Junfang Chen
  - Center for Intelligent Medicine, Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, No. 6, 2nd Nanjiang Road, Nansha District, 511462 Guangzhou, China
  - Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, China
2
Yu J, Chen N, Zheng Z, Gao M, Liang N, Wong KC. Chromothripsis detection with multiple myeloma patients based on deep graph learning. Bioinformatics 2023; 39:btad422. [PMID: 37399092 PMCID: PMC10343948 DOI: 10.1093/bioinformatics/btad422] [Received: 04/04/2023] [Revised: 06/20/2023] [Accepted: 06/30/2023] [Indexed: 07/05/2023] Open access
Abstract
MOTIVATION: Chromothripsis, associated with poor clinical outcomes, is prognostically vital in multiple myeloma. The catastrophic event is reported to be detectable prior to the progression of multiple myeloma. As a result, chromothripsis detection can contribute to risk estimation and early treatment guidelines for multiple myeloma patients. However, manual diagnosis remains the gold standard approach to detect chromothripsis events with the whole-genome sequencing technology to retrieve both copy number variation (CNV) and structural variation data. Meanwhile, CNV data are much easier to obtain than structural variation data. Hence, in order to reduce the reliance on human experts' efforts and structural variation data extraction, it is necessary to establish a reliable and accurate chromothripsis detection method based on CNV data.
RESULTS: To address those issues, we propose a method to detect chromothripsis solely based on CNV data. With the help of structure learning, the intrinsic relationship-directed acyclic graph of CNV features is inferred to derive a CNV embedding graph (i.e. CNV-DAG). Subsequently, a neural network based on Graph Transformer, local feature extraction, and non-linear feature interaction, is proposed with the embedding graph as the input to distinguish whether the chromothripsis event occurs. Ablation experiments, clustering, and feature importance analysis are also conducted to enable the proposed model to be explained by capturing mechanistic insights.
AVAILABILITY AND IMPLEMENTATION: The source code and data are freely available at https://github.com/luvyfdawnYu/CNV_chromothripsis.
Affiliation(s)
- Jixiang Yu
  - Department of Computer Science, City University of Hong Kong, Kowloon, 999077, Hong Kong
- Nanjun Chen
  - Department of Computer Science, City University of Hong Kong, Kowloon, 999077, Hong Kong
- Zetian Zheng
  - Department of Computer Science, City University of Hong Kong, Kowloon, 999077, Hong Kong
- Ming Gao
  - School of Management Science and Engineering, Dongbei University of Finance and Economics, Dalian 116025, China
- Ning Liang
  - University of Michigan, Ann Arbor, MI 48105, United States
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon, 999077, Hong Kong
  - Shenzhen Research Institute, City University of Hong Kong, Shenzhen 518057, China
  - Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon, 999077, Hong Kong
3
Chen ZX, Liang L, Huang HQ, Li JD, He RQ, Huang ZG, Song R, Chen G, Li JJ, Cai ZW, Huang JA. LPCAT1 enhances the invasion and migration in gastric cancer: Based on computational biology methods and in vitro experiments. Cancer Med 2023. [PMID: 37184260 DOI: 10.1002/cam4.5991] [Received: 10/26/2022] [Revised: 04/10/2023] [Accepted: 04/15/2023] [Indexed: 05/16/2023] Open access
Abstract
BACKGROUND AND AIM: The biological functions and clinical implications of lysophosphatidylcholine acyltransferase 1 (LPCAT1) remain unclear in gastric cancer (GC). The aim of the current study was to explore the possible clinicopathological significance of LPCAT1 and its prospective mechanism in GC tissues.
MATERIALS AND METHODS: The protein expression and mRNA levels of LPCAT1 were detected from in-house immunohistochemistry and public high-throughput RNA arrays and RNA sequencing. To gain a comprehensive understanding of the clinical value of LPCAT1 in GC, all enrolled data were integrated to calculate the expression difference and standardized mean difference (SMD). The biological mechanism of LPCAT1 in GC was confirmed by computational biology and in vitro experiments. Migration and invasion assays were also conducted to confirm the effect of LPCAT1 in GC.
RESULTS: Both protein and mRNA expression levels of LPCAT1 in GC were remarkably higher than those in noncancerous controls. Overall, the SMD of LPCAT1 mRNA was 1.11 (95% CI = 0.86-1.36) in GC, and the summarized AUC was 0.85 based on 15 datasets containing 1727 cases of GC and 940 cases of non-GC controls. Moreover, LPCAT1 could accelerate the invasion and migration of GC by boosting the neutrophil degranulation pathway and disturbing the immune microenvironment.
CONCLUSION: An increased level of LPCAT1 may promote the progression of GC.
Affiliation(s)
- Zu-Xuan Chen
  - Department of Medical Oncology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Liang Liang
  - Department of General Surgery, The Second Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- He-Qing Huang
  - Department of Radiotherapy, The First Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Jian-Di Li
  - Department of Pathology, The First Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Rong-Quan He
  - Department of Medical Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Zhi-Guang Huang
  - Department of Pathology, The First Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Rui Song
  - Department of Gastroenterology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Gang Chen
  - Department of Pathology, The First Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Jian-Jun Li
  - Department of General Surgery, The Second Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Zheng-Wen Cai
  - Department of Medical Oncology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
- Jie-An Huang
  - Department of Gastroenterology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, People's Republic of China
4
Zhang Y, Tang Y, Sun Z, Jia J, Fang Y, Wan X, Fang D. Tn5 tagments and transposes oligos to single-stranded DNA for strand-specific RNA sequencing. Genome Res 2023; 33:412-426. [PMID: 36958795 PMCID: PMC10078286 DOI: 10.1101/gr.277213.122] [Received: 08/18/2022] [Accepted: 02/01/2023] [Indexed: 03/25/2023]
Abstract
Tn5 transposon tagments double-stranded DNA and RNA/DNA hybrids to generate nucleic acids that are ready to be amplified for high-throughput sequencing. The nucleic acid substrates for the Tn5 transposon must be explored to increase the applications of Tn5. Here, we found that the Tn5 transposon can transpose oligos into the 5' end of single-stranded DNA longer than 140 nucleotides. Based on this property of Tn5, we developed a tagmentation-based and ligation-enabled single-stranded DNA sequencing method called TABLE-seq. Through a series of reaction temperature, time, and enzyme concentration tests, we applied TABLE-seq to strand-specific RNA sequencing, starting with as little as 30 pg of total RNA. Moreover, compared with traditional dUTP-based strand-specific RNA sequencing, this method detects more genes, has a higher strand specificity, and shows more evenly distributed reads across genes. Together, our results provide insights into the properties of Tn5 transposons and expand the applications of Tn5 in cutting-edge sequencing techniques.
Affiliation(s)
- Yanjun Zhang
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Yin Tang
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Zhongxing Sun
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Junqi Jia
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Yuan Fang
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Xinyi Wan
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Dong Fang
  - Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang 310058, China
  - Department of Medical Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310009, China
5
Rahman MM, Islam MR, Rahman F, Rahaman MS, Khan MS, Abrar S, Ray TK, Uddin MB, Kali MSK, Dua K, Kamal MA, Chellappan DK. Emerging Promise of Computational Techniques in Anti-Cancer Research: At a Glance. Bioengineering (Basel) 2022; 9:335. [PMID: 35892749 PMCID: PMC9332125 DOI: 10.3390/bioengineering9080335] [Received: 05/26/2022] [Revised: 07/09/2022] [Accepted: 07/18/2022] [Indexed: 01/07/2023] Open access
Abstract
Research on the immune system and cancer has led to the development of new medicines that enable the former to attack cancer cells. Drugs that specifically target and destroy cancer cells are on the horizon; there are also drugs that use specific signals to stop cancer cells multiplying. Machine learning algorithms can significantly support and increase the rate of research on complicated diseases to help find new remedies. One area of medical study that could greatly benefit from machine learning algorithms is the exploration of cancer genomes and the discovery of the best treatment protocols for different subtypes of the disease. However, developing a new drug is time-consuming, complicated, dangerous, and costly. Traditional drug production can take up to 15 years, costing over USD 1 billion. Therefore, computer-aided drug design (CADD) has emerged as a powerful and promising technology to develop quicker, cheaper, and more efficient designs. Many new technologies and methods have been introduced to enhance drug development productivity and analytical methodologies, and they have become a crucial part of many drug discovery programs; many scanning programs, for example, use ligand screening and structural virtual screening techniques from hit detection to optimization. In this review, we examined various types of computational methods focusing on anticancer drugs. Machine-based learning in basic and translational cancer research that could reach new levels of personalized medicine marked by speedy and advanced data analysis is still beyond reach. Ending cancer as we know it means ensuring that every patient has access to safe and effective therapies. Recent developments in computational drug discovery technologies have had a large and remarkable impact on the design of anticancer drugs and have also yielded useful insights into the field of cancer therapy. 
With an emphasis on anticancer medications, we covered the various components of computer-aided drug development in this paper. Transcriptomics, toxicogenomics, functional genomics, and biological networks are only a few examples of the bioinformatics techniques used to forecast anticancer medications and treatment combinations based on multi-omics data. We believe that a general review of the databases that are now available and the computational techniques used today will be beneficial for the creation of new cancer treatment approaches.
Affiliation(s)
- Md. Mominur Rahman
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Md. Rezaul Islam
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Firoza Rahman
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Md. Saidur Rahaman
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Md. Shajib Khan
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Sayedul Abrar
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Tanmay Kumar Ray
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Mohammad Borhan Uddin
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Most. Sumaiya Khatun Kali
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
- Kamal Dua
  - Discipline of Pharmacy, Graduate School of Health, University of Technology Sydney, Sydney, NSW 2007, Australia
  - Faculty of Health, Australian Research Centre in Complementary and Integrative Medicine, University of Technology Sydney, Ultimo, NSW 2007, Australia
  - Uttaranchal Institute of Pharmaceutical Sciences, Uttaranchal University, Dehradun 248007, India
- Mohammad Amjad Kamal
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, China
  - King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Enzymoics, 7 Peterlee Place, Novel Global Community Educational Foundation, Hebersham, NSW 2770, Australia
- Dinesh Kumar Chellappan
  - Department of Life Sciences, School of Pharmacy, International Medical University, Bukit Jalil, Kuala Lumpur 57000, Malaysia