1
Shahid, Hayat M, Raza A, Akbar S, Alghamdi W, Iqbal N, Zou Q. pACPs-DNN: Predicting anticancer peptides using novel peptide transformation into evolutionary and structure matrix-based images with self-attention deep learning model. Comput Biol Chem 2025; 117:108441. [PMID: 40168838 DOI: 10.1016/j.compbiolchem.2025.108441] [Received: 02/10/2025] [Revised: 03/18/2025] [Accepted: 03/22/2025] [Indexed: 04/03/2025]
Abstract
Globally, cancer remains a major health challenge due to its high mortality rates. Traditional experimental approaches and therapies are resource-intensive and often cause significant side effects. Anticancer peptides (ACPs) have emerged as alternative therapeutic agents owing to their selectivity, safety, and potential to mitigate drug resistance. In this paper, we propose pACPs-DNN, a novel attention mechanism-based deep learning model for discriminating ACPs from non-ACPs. The pACPs-DNN model transforms input peptides into image representations using residue-wise energy contact matrix (RECM), substitution matrix representation (SMR), and position-specific scoring matrix (PSSM) embeddings, followed by local binary pattern (LBP)-based decomposition to capture enhanced structural and local semantic features. These transformations generate novel feature sets, including RECM_LBP, LBP_SMR, and LBP_PSSM. Subsequently, a two-tier feature selection approach is employed to identify a high-ranking optimal feature set, which is then used to train an attention-based deep neural network. The proposed pACPs-DNN model achieves a training accuracy of 96.91% and an AUC of 0.98. To evaluate its generalization capability, the model was validated on independent datasets, where it improved accuracy over existing models by 5% and 3.5% on the Ind-I and Ind-II datasets, respectively. The demonstrated efficacy and robustness of pACPs-DNN highlight its potential as a valuable tool for advancing drug discovery and academic research in cancer-related therapeutic development.
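The LBP step named in this abstract can be illustrated with a minimal sketch. The abstract does not state which LBP variant, neighbourhood, or comparison rule the authors use, so the code below assumes the basic 8-neighbour LBP with a `>=` comparison, applied to a small matrix that stands in for a PSSM-like image. The function names `lbp_codes` and `lbp_histogram` are hypothetical and are not taken from the paper.

```python
def lbp_codes(matrix):
    """Return basic 8-neighbour LBP codes for the interior cells of a 2-D list.

    Assumption: plain LBP, where a neighbour contributes a 1-bit when its
    value is >= the centre value. The paper's exact variant is not given.
    """
    # Clockwise neighbour offsets, starting at the top-left cell (bit 0).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(matrix), len(matrix[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = matrix[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if matrix[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(matrix, bins=256):
    """Normalised histogram of LBP codes, usable as a fixed-length feature vector."""
    hist = [0] * bins
    for row in lbp_codes(matrix):
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical 3x3 toy matrix; a real PSSM would be (peptide length) x 20.
toy = [[1, 2, 3],
       [8, 5, 4],
       [7, 6, 5]]
print(lbp_codes(toy))
```

Because the histogram length is fixed at 256 regardless of peptide length, a descriptor of this kind can turn variable-size matrices into equal-length feature vectors, which is one plausible reason to apply LBP before feature selection.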
Affiliation(s)
- Shahid
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan
- Maqsood Hayat
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan.
- Ali Raza
  - Department of Computer Science, Bahria University, Islamabad 44220, Pakistan
- Shahid Akbar
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Wajdi Alghamdi
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Nadeem Iqbal
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China.
2
Ghorbian M, Ghobaei-Arani M, Ghorbian S. Transforming breast cancer diagnosis and treatment with large language Models: A comprehensive survey. Methods 2025; 239:85-110. [PMID: 40199412 DOI: 10.1016/j.ymeth.2025.04.001] [Received: 02/14/2025] [Revised: 03/24/2025] [Accepted: 04/01/2025] [Indexed: 04/10/2025] Open
Abstract
Breast cancer (BrCa), one of the most prevalent forms of cancer in women, poses many challenges for diagnosis and treatment due to its complex biological mechanisms. Early and accurate diagnosis plays a fundamental role in improving survival rates, but the limitations of existing imaging methods and clinical data interpretation often prevent optimal results. Large Language Models (LLMs), which are built on advanced architectures such as transformers, have significantly changed data processing and medical decision-making. By analyzing a large volume of medical and clinical data, these models enable early diagnosis by identifying patterns in images and medical records and provide personalized treatment strategies by integrating genetic markers and clinical guidelines. Despite the transformative potential of these models, their use in BrCa management faces challenges such as data sensitivity, algorithm transparency, ethical considerations, and model compatibility with the details of medical applications, all of which need to be addressed to achieve reliable results. This review systematically examines the impact of LLMs on BrCa diagnosis and treatment, with the objective of analyzing the role of LLM technology in both. The findings indicate that the application of LLMs has resulted in significant improvements in various aspects of BrCa management, such as a 35% increase in the Efficiency of Diagnosis and BrCa Treatment (EDBC), a 30% enhancement in the System's Clinical Trust and Reliability (SCTR), and a 20% improvement in the quality of patient education and information (IPEI). Ultimately, this study demonstrates the importance of LLMs in advancing precision medicine for BrCa and paves the way for effective patient-centered care solutions.
Affiliation(s)
- Mohsen Ghorbian
  - Department of Computer Engineering, Qo.C., Islamic Azad University, Qom, Iran
- Saied Ghorbian
  - Department of Molecular Genetics, Ah.C., Islamic Azad University, Ahar, Iran.
3
Liang MZ, Huang XF, Zhu JC, Bao JX, Chen CL, Wang XW, Lou YW, Pan YT, Dai YW. A machine learning-based glycolysis and fatty acid metabolism-related prognostic signature is constructed and identified ACSL5 as a novel marker inhibiting the proliferation of breast cancer. Comput Biol Chem 2025; 119:108507. [PMID: 40403353 DOI: 10.1016/j.compbiolchem.2025.108507] [Received: 12/24/2024] [Revised: 04/27/2025] [Accepted: 05/09/2025] [Indexed: 05/24/2025]
Abstract
INTRODUCTION: A new perspective on cancer metabolism suggests that it is diverse and varies by context. Cancer metabolism reprogramming can create a heterogeneous microenvironment that affects immune cell infiltration and function, complicating the selection of treatment methods. However, the specifics of this relationship remain unclear in breast cancer. This research aims to explore how glycolysis and fatty acid metabolism (GF) influence the immune microenvironment and how well they predict immunotherapy response and overall survival. METHODS: We identified 602 GF-related genes for the first time. Using multiple datasets from various centers and 10 different machine learning algorithms, we developed a GF-related signature called GFSscore. RESULTS: The GFSscore served as an independent prognostic indicator and demonstrated greater robustness than other models. Its validity was confirmed across multiple databases. Breast cancer patients with a high GFSscore, indicative of a greater tendency towards glycolytic activity, experienced poorer prognosis due to immunosuppression from distinct immune evasion mechanisms. Conversely, those with a low GFSscore, more inclined towards fatty acid metabolism, had better outcomes. Additionally, the GFSscore has the potential to forecast how well a patient might respond to immunotherapy and their susceptibility to chemotherapy medications. Moreover, experiments showed that overexpression of the ACSL5 gene inhibits the proliferation of BRCA. CONCLUSIONS: The GFSscore may offer patients personalized therapy by identifying new therapeutic targets for tumors. By understanding the relationship between cancer metabolism and the immune microenvironment, we can better tailor treatments to individual patients.
Affiliation(s)
- Mei-Zhen Liang
  - Department of Thyroid and Breast Surgery, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
- Xian-Feng Huang
  - Department of Colorectal and Anal Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
- Jun-Chang Zhu
  - Department of Colorectal and Anal Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
- Jing-Xia Bao
  - Department of Breast Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
- Cheng-Liang Chen
  - Department of Thyroid and Breast Surgery, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
- Xiao-Wu Wang
  - Department of Burns and Skin Repair Surgery, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Yun-Wei Lou
  - Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
- Ya-Ting Pan
  - Yongkang First People's Hospital Medical Group, Jinhua, Zhejiang, China.
- Yin-Wei Dai
  - Department of Thyroid and Breast Surgery, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, China; Department of Obstetrics and Gynecology, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China.
4
Yu S, Zhou P. An optimized transformer model for efficient detection of thoracic diseases in chest X-rays with multi-scale feature fusion. PLoS One 2025; 20:e0323239. [PMID: 40334189 PMCID: PMC12058152 DOI: 10.1371/journal.pone.0323239] [Received: 01/22/2025] [Accepted: 04/04/2025] [Indexed: 05/09/2025] Open
Abstract
This study presents the development and application of an optimized Detection Transformer (DETR) model, known as CD-DETR, for the detection of thoracic diseases from chest X-ray (CXR) images. The CD-DETR model addresses the challenges of detecting minor pathologies in CXRs, particularly in regions with uneven medical resource distribution. In the central and western regions of China, due to a shortage of radiologists, CXRs from township hospitals are concentrated in central hospitals for diagnosis. This requires processing a large number of CXRs in a short period of time to obtain results. The model integrates a multi-scale feature fusion approach, leveraging Efficient Channel Attention (ECA-Net) and Spatial Attention Upsampling (SAU) to enhance feature representation and improve detection accuracy. It also introduces a dedicated Chest Diseases Intersection over Union (CDIoU) loss function to optimize the detection of small targets and reduce class imbalance. Experimental results on the NIH Chest X-ray dataset demonstrate that CD-DETR achieves a precision of 88.3% and recall of 86.6%, outperforming other DETR variants by an average of 5% and CNN-based models like YOLOv7 by 6-8% in these metrics, showing its potential for practical application in medical imaging diagnostics.
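The abstract names a CDIoU loss but does not define it, so the sketch below shows only the standard Intersection over Union quantity that IoU-family losses build on, together with the plain `1 - IoU` loss. It should not be read as the authors' CDIoU formulation. Boxes are assumed to be `(x1, y1, x2, y2)` tuples with `x1 < x2` and `y1 < y2`, and the function names are hypothetical.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Baseline IoU loss: 0 for a perfect match, 1 for disjoint boxes."""
    return 1.0 - box_iou(pred, target)


# Hypothetical predicted and ground-truth boxes in pixel coordinates.
print(iou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

For small targets, a shift of a few pixels changes plain IoU sharply, which is the usual motivation for modified IoU losses in small-lesion detection; how CDIoU addresses this is described in the paper itself, not here.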
Affiliation(s)
- Shasha Yu
  - Information Center, Zhongnan Hospital of Wuhan University, Wuhan, China
- Peng Zhou
  - FutureFront Interdisciplinary Research Institute, Huazhong University of Science and Technology, Wuhan, China
5
Eledkawy A, Hamza T, El-Metwally S. Towards precision oncology: a multi-level cancer classification system integrating liquid biopsy and machine learning. BioData Min 2025; 18:29. [PMID: 40217526 PMCID: PMC11987386 DOI: 10.1186/s13040-025-00439-8] [Received: 01/03/2025] [Accepted: 03/10/2025] [Indexed: 04/14/2025] Open
Abstract
BACKGROUND: Millions of people die from cancer every year. Early cancer detection is crucial for ensuring higher survival rates, as it provides an opportunity for timely medical interventions. This paper proposes a multi-level cancer classification system that uses plasma cfDNA/ctDNA mutations and protein biomarkers to identify seven distinct cancer types: colorectal, breast, upper gastrointestinal, lung, pancreas, ovarian, and liver. RESULTS: The proposed system employs a multi-stage binary classification framework where each stage is customized for a specific cancer type. A majority-vote feature selection process combines six feature selectors: Information Value, Chi-Square, Random Forest Feature Importance, Extra Tree Feature Importance, Recursive Feature Elimination, and L1 Regularization. Following feature selection, classifiers (eXtreme Gradient Boosting, Random Forest, Extra Tree, and Quadratic Discriminant Analysis) are customized for each cancer type, either individually or in an ensemble soft-voting setup, to optimize predictive accuracy. The proposed system outperformed previously published results, achieving an AUC of 98.2% and an accuracy of 96.21%. To ensure reproducibility of the results, the trained models and the dataset used in this study are made publicly available via the GitHub repository (https://github.com/SaraEl-Metwally/Towards-Precision-Oncology). CONCLUSION: The identified biomarkers enhance the interpretability of the diagnosis, facilitating more informed decision-making. The system's performance underscores its effectiveness in tissue localization, contributing to improved patient outcomes through timely medical interventions.
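The two voting steps described in this abstract can be sketched in a few lines. The abstract does not give the vote threshold or the probability cut-off, so the code assumes a strict majority of selectors (at least 4 of 6) and a 0.5 cut-off on the averaged probability. The feature names and probability values are invented for illustration, and the function names are hypothetical. The authors' actual implementation is in the repository they cite.

```python
def majority_vote_features(selections, min_votes=None):
    """Keep features chosen by a strict majority of selectors.

    selections: list of sets, one set of selected feature names per selector.
    Assumption: strict majority unless min_votes is given explicitly.
    """
    if min_votes is None:
        min_votes = len(selections) // 2 + 1
    votes = {}
    for chosen in selections:
        for name in chosen:
            votes[name] = votes.get(name, 0) + 1
    return sorted(name for name, count in votes.items() if count >= min_votes)


def soft_vote(probabilities, threshold=0.5):
    """Average positive-class probabilities from several classifiers.

    Returns (mean probability, predicted label). The 0.5 threshold is an assumption.
    """
    mean = sum(probabilities) / len(probabilities)
    return mean, mean >= threshold


# Hypothetical output of six selectors over made-up biomarker names.
selections = [
    {"marker_a", "marker_b"},
    {"marker_a", "marker_c"},
    {"marker_a", "marker_b"},
    {"marker_a"},
    {"marker_b", "marker_c"},
    {"marker_a", "marker_b"},
]
print(majority_vote_features(selections))
print(soft_vote([0.9, 0.4, 0.6, 0.7]))
```

Soft voting averages probabilities rather than counting hard labels, so a single confident classifier can outweigh several marginal ones. In a multi-stage binary framework, a pair of steps like this would be repeated once per cancer type.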
Affiliation(s)
- Amr Eledkawy
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
- Taher Hamza
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
- Sara El-Metwally
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt.
  - Biomedical Informatics Department, Faculty of Computer Science and Engineering, New Mansoura University, Gamasa, 35712, Egypt.
6
Akbar S, Raza A, Awan HH, Zou Q, Alghamdi W, Saeed A. pNPs-CapsNet: Predicting Neuropeptides Using Protein Language Models and FastText Encoding-Based Weighted Multi-View Feature Integration with Deep Capsule Neural Network. ACS Omega 2025; 10:12403-12416. [PMID: 40191328 PMCID: PMC11966582 DOI: 10.1021/acsomega.4c11449] [Received: 12/24/2024] [Revised: 02/04/2025] [Accepted: 03/07/2025] [Indexed: 04/09/2025]
Abstract
Neuropeptides (NPs) are critical signaling molecules that are essential in numerous physiological processes and possess significant therapeutic potential. Computational prediction of NPs has emerged as a promising alternative to traditional experimental methods, which are often labor-intensive, time-consuming, and expensive. Recent advancements in computational peptide models provide a cost-effective approach to identifying NPs, which are characterized by high selectivity toward target cells and minimal side effects. In this study, we propose a novel deep capsule neural network-based computational model, namely pNPs-CapsNet, to discriminate NPs from non-NPs. Input samples are numerically encoded using pretrained protein language models, including ESM, ProtBERT-BFD, and ProtT5, to extract attention mechanism-based contextual and semantic features. A differential evolution-based weighted feature integration method is used to construct a multiview vector. Additionally, a two-tier feature selection strategy, comprising MRMD and SHAP analysis, is developed to identify and select optimal features. Finally, the capsule neural network (CapsNet) is trained using the selected optimal feature set. The proposed pNPs-CapsNet model achieved a predictive accuracy of 98.10% and an AUC of 0.98. To validate the generalization capability of the pNPs-CapsNet model, it was evaluated on independent samples, yielding an accuracy of 95.21% and an AUC of 0.96. The pNPs-CapsNet model outperforms existing state-of-the-art models, with 4% and 2.5% higher predictive accuracy on the training and independent data sets, respectively. The demonstrated efficacy and consistency of pNPs-CapsNet underline its potential as a valuable and robust tool for advancing drug discovery and academic research.
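One common way to build a weighted multi-view vector is to scale each view's embedding by a per-view weight and concatenate the results. The abstract does not spell out the integration rule, so the sketch below is an assumption rather than the authors' method. It takes the weights as given and normalises them to sum to 1, whereas in the paper they would come from the differential evolution search. The function name `integrate_views` and all numbers are hypothetical.

```python
def integrate_views(views, weights):
    """Scale each view by its normalised weight and concatenate.

    views:   list of feature vectors (lists of floats), one per encoder.
    weights: one non-negative weight per view; assumed to come from an
             external optimiser such as differential evolution.
    """
    if len(views) != len(weights):
        raise ValueError("need exactly one weight per view")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = []
    for view, weight in zip(views, weights):
        scale = weight / total
        fused.extend(value * scale for value in view)
    return fused


# Hypothetical tiny embeddings standing in for two language-model views.
view_a = [1.0, 2.0]
view_b = [4.0]
print(integrate_views([view_a, view_b], [1.0, 3.0]))
```

The fused vector keeps every dimension of every view, so its length is the sum of the view lengths. A downstream step such as the two-tier selection mentioned in the abstract would then reduce it.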
Affiliation(s)
- Shahid Akbar
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan 23200, Khyber Pakhtunkhwa, Pakistan
- Ali Raza
  - Department of Computer Science, Bahria University, Islamabad 44220, Pakistan
- Hamid Hussain Awan
  - Department of Computer Science, Rawalpindi Women University, Rawalpindi 46300, Punjab, Pakistan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, PR China
- Wajdi Alghamdi
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Aamir Saeed
  - Department of Computer Science and IT, University of Engineering and Technology, Jalozai Campus, Peshawar 25000, Pakistan