1
Mir BA, Tayara H, Chong KT. SB-Net: Synergizing CNN and LSTM networks for uncovering retrosynthetic pathways in organic synthesis. Comput Biol Chem 2024; 112:108130. [PMID: 38954849 DOI: 10.1016/j.compbiolchem.2024.108130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 05/17/2024] [Accepted: 06/12/2024] [Indexed: 07/04/2024]
Abstract
Retrosynthesis is vital in synthesizing target products, guiding reaction pathway design crucial for drug and material discovery. Current models often neglect multi-scale feature extraction, limiting efficacy in leveraging molecular descriptors. Our proposed SB-Net model, a deep-learning architecture tailored for retrosynthesis prediction, addresses this gap. SB-Net combines CNN and Bi-LSTM architectures, excelling in capturing multi-scale molecular features. It integrates parallel branches for processing one-hot encoded descriptors and ECFP, merging through dense layers. Experimental results demonstrate SB-Net's superiority, achieving 73.6 % top-1 and 94.6 % top-10 accuracy on USPTO-50k data. Versatility is validated on MetaNetX, with rates of 52.8 % top-1, 74.3 % top-3, 79.8 % top-5, and 83.5 % top-10. SB-Net's success in bioretrosynthesis prediction tasks indicates its efficacy. This research advances computational chemistry, offering a robust deep-learning model for retrosynthesis prediction. With implications for drug discovery and synthesis planning, SB-Net promises innovative and efficient pathways.
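The top-k figures quoted in this abstract can be read as the share of test products whose true answer appears among the model's first k ranked candidates. The sketch below illustrates that metric only; the function name `top_k_accuracy` and the toy template labels are hypothetical and are not taken from the SB-Net paper.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Return the fraction of targets found among the first k ranked candidates."""
    if len(ranked_predictions) != len(targets):
        raise ValueError("ranked_predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    hits = sum(
        1
        for candidates, target in zip(ranked_predictions, targets)
        if target in candidates[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: one ranked candidate list per target product.
ranked = [
    ["t3", "t1", "t7"],
    ["t2", "t5", "t9"],
    ["t4", "t8", "t6"],
    ["t1", "t2", "t3"],
]
truth = ["t3", "t9", "t0", "t2"]

print(top_k_accuracy(ranked, truth, 1))  # 0.25
print(top_k_accuracy(ranked, truth, 3))  # 0.75
```

Top-k accuracy can only stay the same or grow as k increases, which matches the ordering of the top-1 through top-10 values reported above.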
Affiliation(s)
- Bilal Ahmad Mir
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea; Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea
2
Zahid MU, Nisar MD, Fazil A, Ryu J, Shah MH. Composite Ensemble Learning Framework for Passive Drone Radio Frequency Fingerprinting in Sixth-Generation Networks. Sensors (Basel) 2024; 24:5618. [PMID: 39275529 PMCID: PMC11397939 DOI: 10.3390/s24175618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 08/09/2024] [Accepted: 08/28/2024] [Indexed: 09/16/2024]
Abstract
The rapid evolution of drone technology has introduced unprecedented challenges in security, particularly concerning the threat of unconventional drone and swarm attacks. To deal with these threats, drones need to be classified by intercepting their Radio Frequency (RF) signals. With the arrival of Sixth Generation (6G) networks, sophisticated methods are required to properly categorize drone signals in order to achieve optimal resource sharing, high security levels, and mobility management. However, deep ensemble learning has not been investigated properly in the case of 6G. 6G is anticipated to incorporate drone-based BTS and cellular networks that, in one way or another, may be subjected to jamming, intentional interference, or other dangers from unauthorized UAVs. Thus, this study uses Radio Frequency Fingerprinting (RFF) of drones to detect unauthorized ones so that proper actions can be taken to protect the network's security and integrity. This paper proposes a novel method, a Composite Ensemble Learning (CEL)-based neural network, for drone signal classification. The proposed method integrates wavelet-based denoising and combines automatic and manual feature extraction techniques to foster feature diversity, robustness, and performance enhancement. Through extensive experiments conducted on open-source benchmark datasets of drones, our approach demonstrates superior classification accuracies compared to recent benchmark deep learning techniques across various Signal-to-Noise Ratios (SNRs). This novel approach holds promise for enhancing communication efficiency, security, and safety in 6G networks amidst the proliferation of drone-based applications.
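The abstract mentions wavelet-based denoising without naming the wavelet family or the thresholding rule. As a minimal sketch, assuming a single-level Haar transform with soft thresholding of the detail coefficients (both are illustrative choices, not the authors' stated configuration), the idea can be written as:

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)


def haar_forward(signal):
    """Split an even-length signal into approximation and detail coefficients."""
    approx = [(signal[i] + signal[i + 1]) * _SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) * _SCALE, (a - d) * _SCALE))
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (assumed rule)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """One-level Haar denoising: threshold the details, keep the approximation."""
    if len(signal) % 2:
        raise ValueError("signal length must be even for a single Haar level")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged; with a threshold larger than every detail coefficient, each pair of samples collapses to its mean.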
Affiliation(s)
- Muhammad Usama Zahid
  - Electrical and Computer Engineering Department, Sir Syed CASE Institute of Technology, Islamabad 04524, Pakistan
- Muhammad Danish Nisar
  - Electrical and Computer Engineering Department, Sir Syed CASE Institute of Technology, Islamabad 04524, Pakistan
- Adnan Fazil
  - Department of Avionics Engineering, Air University, E-9, Islamabad 44230, Pakistan
- Jihyoung Ryu
  - Electronics and Telecommunications Research Institute (ETRI), Gwangju 61012, Republic of Korea
- Maqsood Hussain Shah
  - SFI Insight Centre for Data Analytics and the School of Electronic Engineering, Dublin City University, D09 V209 Dublin, Ireland
3
Hassan MT, Tayara H, Chong KT. NaII-Pred: An ensemble-learning framework for the identification and interpretation of sodium ion inhibitors by fusing multiple feature representation. Comput Biol Med 2024; 178:108737. [PMID: 38879934 DOI: 10.1016/j.compbiomed.2024.108737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/21/2024] [Accepted: 06/08/2024] [Indexed: 06/18/2024]
Abstract
High-affinity ligand peptides for ion channels are essential for controlling the flow of ions across the plasma membrane. These peptides are now being investigated as possible therapeutic options for a variety of illnesses, including cancer and cardiovascular disease. Therefore, identifying and interpreting ligand peptide inhibitors that control ion flow across cells is pivotal. In this work, we developed an ensemble-based model, NaII-Pred, for the identification of sodium ion inhibitors. The ensemble model was trained, tested, and evaluated on three different datasets. The NaII-Pred method employs six different descriptors and a hybrid feature set in conjunction with five conventional machine learning classifiers to create 35 baseline models. Through an ensemble approach, the top five baseline models trained on the hybrid feature set were integrated to yield the final predictive model, NaII-Pred. Our proposed model, NaII-Pred, outperforms the baseline models and the current predictors on both datasets. We believe NaII-Pred will play a critical role in screening and identifying potential sodium ion inhibitors and will be an invaluable tool.
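The count of 35 baseline models follows from pairing seven feature sets (six descriptors plus the hybrid set) with five classifiers. The sketch below reproduces that arithmetic with placeholder names, since the abstract gives only the counts; the averaging rule in `soft_vote` is likewise an assumption, as the abstract does not say how the selected models are combined.

```python
from itertools import product
from statistics import mean

# Placeholder names: the abstract states how many there are, not what they are.
descriptors = [f"descriptor_{i}" for i in range(1, 7)]   # six descriptors
feature_sets = descriptors + ["hybrid"]                  # plus one hybrid set
classifiers = [f"classifier_{i}" for i in range(1, 6)]   # five classifiers

# Every (feature set, classifier) pair is one baseline model.
baseline_models = list(product(feature_sets, classifiers))

# Baseline models trained on the hybrid feature set.
hybrid_models = [model for model in baseline_models if model[0] == "hybrid"]


def soft_vote(probabilities):
    """Average positive-class probabilities (assumed ensembling rule)."""
    return mean(probabilities)


print(len(baseline_models))  # 35
print(len(hybrid_models))    # 5
```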
Affiliation(s)
- Mir Tanveerul Hassan
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea; Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju, 54896, South Korea
4
Gaffar S, Tayara H, Chong KT. Stack-AAgP: Computational prediction and interpretation of anti-angiogenic peptides using a meta-learning framework. Comput Biol Med 2024; 174:108438. [PMID: 38613893 DOI: 10.1016/j.compbiomed.2024.108438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 04/01/2024] [Accepted: 04/07/2024] [Indexed: 04/15/2024]
Abstract
BACKGROUND Angiogenesis plays a vital role in the pathogenesis of several human diseases, particularly in the case of solid tumors. In the realm of cancer treatment, recent investigations into peptides with anti-angiogenic properties have yielded encouraging outcomes, thereby creating a hopeful therapeutic avenue for the treatment of cancer. Therefore, correctly identifying the anti-angiogenic peptides is extremely important in comprehending their biophysical and biochemical traits, laying the groundwork for uncovering novel drugs to combat cancer. METHODS In this work, we present a novel ensemble-learning-based model, Stack-AAgP, specifically designed for the accurate identification and interpretation of anti-angiogenic peptides (AAPs). Initially, a feature representation approach is employed, generating 24 baseline models through six machine learning algorithms (random forest [RF], extra tree classifier [ETC], extreme gradient boosting [XGB], light gradient boosting machine [LGBM], CatBoost, and SVM) and four feature encodings (pseudo-amino acid composition [PAAC], amphiphilic pseudo-amino acid composition [APAAC], composition of k-spaced amino acid pairs [CKSAAP], and quasi-sequence-order [QSOrder]). Subsequently, the output (predicted probabilities) from 24 baseline models was inputted into the same six machine-learning classifiers to generate their respective meta-classifiers. Finally, the meta-classifiers were stacked together using the ensemble-learning framework to construct the final predictive model. RESULTS Findings from the independent test demonstrate that Stack-AAgP outperforms the state-of-the-art methods by a considerable margin. Systematic experiments were conducted to assess the influence of hyperparameters on the proposed model. 
Our model, Stack-AAgP, was evaluated on the independent NT15 dataset, revealing superiority over existing predictors with an accuracy improvement ranging from 5% to 7.5% and an increase in Matthews Correlation Coefficient (MCC) from 7.2% to 12.2%.
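For readers unfamiliar with the Matthews Correlation Coefficient cited in this abstract, the sketch below computes it from confusion-matrix counts using the standard definition. The function name `mcc` and the example counts are illustrative only and are not results from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention for the
    degenerate case where a whole row or column of the matrix is empty.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Made-up counts for illustration.
print(round(mcc(tp=40, tn=45, fp=5, fn=10), 3))
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and unlike accuracy it uses all four cells of the confusion matrix.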
Affiliation(s)
- Saima Gaffar
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea; Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju, 54896, South Korea
5
Feng T, Hu T, Liu W, Zhang Y. Enhancer Recognition: A Transformer Encoder-Based Method with WGAN-GP for Data Augmentation. Int J Mol Sci 2023; 24:17548. [PMID: 38139375 PMCID: PMC10743946 DOI: 10.3390/ijms242417548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Revised: 11/29/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
Enhancers are located upstream or downstream of key deoxyribonucleic acid (DNA) sequences in genes and can adjust the transcription activity of neighboring genes. Identifying enhancers and determining their functions are important for understanding gene regulatory networks and expression regulatory mechanisms. However, traditional enhancer recognition relies on manual feature engineering, which is time-consuming and labor-intensive, making it difficult to perform large-scale recognition analysis. In addition, if the original dataset is too small, there is a risk of overfitting. In recent years, emerging methods, such as deep learning, have provided new insights for enhancer identification. However, these methods also present certain challenges. Deep learning models typically require a large amount of high-quality data, and data acquisition demands considerable time and resources. To address these challenges, in this paper, we propose a data-augmentation method based on generative adversarial networks to solve the problem of small datasets. Moreover, we used regularization methods such as weight decay to improve the generalizability of the model and alleviate overfitting. The Transformer encoder was used as the main component to capture the complex relationships and dependencies in enhancer sequences. The encoding layer was designed based on the principle of k-mers to preserve more information from the original DNA sequence. Compared with existing methods, the proposed approach made significant progress in enhancing the accuracy and strength of enhancer identification and prediction, demonstrating the effectiveness of the proposed method. This paper provides valuable insights for enhancer analysis and is of great significance for understanding gene regulatory mechanisms and studying disease correlations.
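The k-mer principle mentioned in this abstract can be illustrated by splitting a DNA sequence into overlapping substrings of length k and mapping each to an integer ID that an embedding layer could consume. This is a generic sketch, not the paper's encoding layer: the choice of k = 3, the stride of 1, the lexicographic vocabulary, and the function names are all assumptions.

```python
from itertools import product


def kmer_tokens(sequence, k=3, stride=1):
    """Split a DNA sequence into overlapping k-mers."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def kmer_vocabulary(k=3, alphabet="ACGT"):
    """Map every possible k-mer over the alphabet to an integer ID."""
    return {"".join(p): idx for idx, p in enumerate(product(alphabet, repeat=k))}


def encode(sequence, k=3):
    """Convert a sequence to k-mer IDs, skipping k-mers with unknown bases."""
    vocab = kmer_vocabulary(k)
    return [vocab[token] for token in kmer_tokens(sequence, k) if token in vocab]


print(kmer_tokens("ACGTAC"))  # ['ACG', 'CGT', 'GTA', 'TAC']
print(encode("ACGTAC"))       # [6, 27, 44, 49]
```

Overlapping k-mers keep local context that single-base one-hot encoding discards, at the cost of a vocabulary that grows as 4^k.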
Affiliation(s)
- Tianyu Feng
  - College of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China
- Tao Hu
  - College of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China
- Wenyu Liu
  - College of Ecology, Lanzhou University, Lanzhou 730000, China
- Yang Zhang
  - Supercomputer Center, Lanzhou University, Lanzhou 730000, China
6
Venkatesan VK, Kuppusamy Murugesan KR, Chandrasekaran KA, Thyluru Ramakrishna M, Khan SB, Almusharraf A, Albuali A. Cancer Diagnosis through Contour Visualization of Gene Expression Leveraging Deep Learning Techniques. Diagnostics (Basel) 2023; 13:3452. [PMID: 37998588 PMCID: PMC10670706 DOI: 10.3390/diagnostics13223452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 10/30/2023] [Accepted: 11/04/2023] [Indexed: 11/25/2023]
Abstract
Prompt diagnostics and appropriate cancer therapy necessitate the use of gene expression databases. The integration of analytical methods can enhance detection precision by capturing intricate patterns and subtle connections in the data. This study proposes a diagnostic-integrated approach combining Empirical Bayes Harmonization (EBS), Jensen-Shannon Divergence (JSD), deep learning, and contour mathematics for cancer detection using gene expression data. EBS preprocesses the gene expression data, while JSD measures the distributional differences between cancerous and non-cancerous samples, providing invaluable insights into gene expression patterns. Deep learning (DL) models are employed for automatic deep feature extraction and to discern complex patterns from the data. Contour mathematics is applied to visualize decision boundaries and regions in the high-dimensional feature space. JSD imparts significant information to the deep learning model, directing it to concentrate on pertinent features associated with cancerous samples. Contour visualization elucidates the model's decision-making process, bolstering interpretability. The amalgamation of JSD, deep learning, and contour mathematics in gene expression dataset analysis diagnostics presents a promising pathway for precise cancer detection. This method taps into the prowess of deep learning for feature extraction while employing JSD to pinpoint distributional differences and contour mathematics for visual elucidation. The outcomes underscore its potential as a formidable instrument for cancer detection, furnishing crucial insights for timely diagnostics and tailor-made treatment strategies.
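To make the role of Jensen-Shannon Divergence concrete, the sketch below computes JSD between two discrete distributions using the standard definition with base-2 logarithms, so the result lies between 0 and 1. The histograms are invented for illustration; how the paper bins gene expression values is not stated in the abstract.

```python
import math


def _normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("distribution must have positive total mass")
    return [v / total for v in values]


def _kl_divergence(p, q):
    """Kullback-Leibler divergence in bits, skipping zero-probability terms of p."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two discrete distributions (0 to 1 in base 2)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    p, q = _normalize(p), _normalize(q)
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl_divergence(p, mixture) + 0.5 * _kl_divergence(q, mixture)


# Invented expression histograms for one gene in two sample groups.
cancerous = [2, 5, 20, 40]
non_cancerous = [35, 25, 5, 2]
print(jensen_shannon_divergence(cancerous, non_cancerous))
```

A gene whose two histograms give a JSD near 0 is distributed similarly in both groups, while a value near 1 marks a gene whose expression separates them strongly.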
Affiliation(s)
- Vinoth Kumar Venkatesan
  - School of Computer Science Engineering and Information Systems (SCORE), Vellore Institute of Technology, Vellore 632014, India
- Karthick Raghunath Kuppusamy Murugesan
  - Department of Computer Science and Engineering, Faculty of Engineering and Technology, JAIN (Deemed-to-be University), Bangalore 562112, India
- Mahesh Thyluru Ramakrishna
  - Department of Computer Science and Engineering, Faculty of Engineering and Technology, JAIN (Deemed-to-be University), Bangalore 562112, India
- Surbhi Bhatia Khan
  - Department of Data Science, School of Science Engineering and Environment, University of Salford, Manchester M5 4WT, UK
  - Department of Engineering and Environment, University of Religions and Denominations, Qom 37491-13357, Iran
  - Department of Electrical and Computer Engineering, Lebanese American University, Byblos P.O. Box 13-5053, Lebanon
- Ahlam Almusharraf
  - Department of Business Administration, College of Business and Administration, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Abdullah Albuali
  - Department of Computer Science, School of Computer Science and Information Technology, King Faisal University, Hofuf 11671, Saudi Arabia