1
Kosanam S, Pasupula R. Cardioprotective effects of cinnamoyl imidazole on apoptosis and oxidative stress in hypoxia/reoxygenation-induced H9C2 cell lines. Life Sci 2024; 359:123189. [PMID: 39481831 DOI: 10.1016/j.lfs.2024.123189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2024] [Revised: 10/21/2024] [Accepted: 10/27/2024] [Indexed: 11/03/2024]
Abstract
BACKGROUND This study explored the effects of cinnamoyl imidazole on alleviating oxidative stress and apoptosis in hypoxia/reoxygenation (H/R)-induced H9C2 cells, using computational analysis with in vitro validation. METHODS Computational techniques, including SwissADME and Swiss Target Prediction, were employed to predict the ADME properties and to identify targets of cinnamoyl imidazole. Differential gene expression (DEG) analysis was conducted on myocardial infarction (MI) datasets obtained from the Gene Expression Omnibus. Gene enrichment and molecular simulation studies were performed with a focus on apoptotic pathways. The computational findings were validated through in vitro experiments on H9C2 cardiomyocytes subjected to 8 h of hypoxia followed by 24 h of reoxygenation. Antioxidant enzyme levels (catalase, GST, GSH-Px, and SOD), mitochondrial membrane potential (ΔΨm), caspase-3 activity, and the expression of CASP3, MAPK8, JAK2, and BCL2L1 were assessed. RESULTS Cinnamoyl imidazole demonstrated favourable pharmacokinetic properties, characterized by high gastrointestinal absorption and low toxicity, with negative predictions for organ toxicity endpoints. Molecular docking studies revealed strong binding affinities for CASP3, MAPK8, and JAK2. In vitro results showed a significant increase in cell viability (94.7 % at 10 μM, p < 0.001) and antioxidant enzyme activity, along with a 64.3 % reduction in caspase-3 activity at 1000 μM (p < 0.01). Cinnamoyl imidazole treatment preserved mitochondrial membrane potential, downregulated pro-apoptotic genes CASP3 and MAPK8, and upregulated the anti-apoptotic gene BCL2L1. CONCLUSION Cinnamoyl imidazole effectively mitigates oxidative stress and apoptosis in H/R-induced H9C2 cells, enhancing cell viability and antioxidant defenses while maintaining mitochondrial integrity.
Affiliation(s)
- Sreya Kosanam
  - Department of Pharmacology, College of Pharmacy, Koneru Lakshmaiah Education Foundation, KL deemed to be University, Green Fields, Vaddeswaram, Andhra Pradesh, India
- Rajeshwari Pasupula
  - Department of Pharmacology, College of Pharmacy, Koneru Lakshmaiah Education Foundation, KL deemed to be University, Green Fields, Vaddeswaram, Andhra Pradesh, India.
2
Harati Kabir V, Mahdavifar Khayati R, Motie Nasrabadi A, Nabavi SM. Prediction of Expanded Disability Status Scale in patients with MS using deep learning. Comput Biol Med 2024; 182:109143. [PMID: 39270459 DOI: 10.1016/j.compbiomed.2024.109143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 08/12/2024] [Accepted: 09/08/2024] [Indexed: 09/15/2024]
Abstract
Multiple sclerosis (MS) is a chronic neurological condition that leads to significant disability in patients. Accurate prediction of disease progression, specifically the Expanded Disability Status Scale (EDSS), is crucial for personalizing treatment and improving patient outcomes. This study aims to develop a robust deep neural network framework to predict EDSS in MS patients using MRI scans. Our model demonstrates high accuracy and reliability in both lesion segmentation and disability classification tasks. For segmentation, the model achieves a Dice Coefficient of 0.87, a Jaccard Index of 0.79, sensitivity of 0.85, and specificity of 0.88. In classification, it attains an overall accuracy of 91.2 %, with a precision of 0.89, recall of 0.88, and an F1-Score of 0.885. Ablation studies highlight the significant impact of integrating T2-weighted and FLAIR images, improving accuracy from 85.7 % (T1-weighted alone) to 93.4 %. Comparative analysis with state-of-the-art methods demonstrates our model's superiority, outperforming Method A and Method B in both accuracy and F1-Score. Despite these advancements, challenges such as data quality, sample size, and computational complexity remain. Future research should focus on standardizing imaging protocols, incorporating larger and more diverse datasets, and optimizing model efficiency. Advancing deep learning architectures and utilizing multimodal data can enhance predictive power and facilitate real-time clinical applications. Our study significantly contributes to refining MS treatment strategies by providing a comprehensive evaluation of our model's performance and addressing key limitations. Accurate disability predictions enable personalized treatments, early interventions, and improved patient outcomes, thus enhancing the quality of life for individuals affected by MS.
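The segmentation and classification figures quoted in this abstract follow standard definitions. As a minimal sketch (the function names and the toy pixel sets are illustrative assumptions, not taken from the paper), the snippet below recomputes F1 from the reported precision and recall and shows how Dice and Jaccard relate for a single pair of masks.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def dice_and_jaccard(pred: set, truth: set) -> tuple:
    """Overlap scores for two sets of lesion pixel indices."""
    inter = len(pred & truth)
    union = len(pred | truth)
    total = len(pred) + len(truth)
    # Two empty masks are treated as a perfect match.
    dice = 2 * inter / total if total else 1.0
    jaccard = inter / union if union else 1.0
    return dice, jaccard


# Precision and recall as reported in the abstract.
f1 = f1_score(0.89, 0.88)

# Toy masks (hypothetical pixel indices, not study data).
dice, jaccard = dice_and_jaccard({1, 2, 3, 4}, {3, 4, 5})
```

With the reported precision and recall, `f1` rounds to 0.885, matching the abstract. For any single pair of masks, Jaccard equals Dice / (2 - Dice), so a Dice of 0.87 would correspond to a Jaccard of about 0.77; scores averaged over many scans need not satisfy the identity exactly, which may account for the reported 0.79.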
Affiliation(s)
- Seyed Massood Nabavi
  - Department of Regenerative Medicine, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Tehran, Iran
3
Teo YX, Lee RE, Nurzaman SG, Tan CP, Chan PY. Action tremor features discovery for essential tremor and Parkinson's disease with explainable multilayer BiLSTM. Comput Biol Med 2024; 180:108957. [PMID: 39098236 DOI: 10.1016/j.compbiomed.2024.108957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 07/04/2024] [Accepted: 07/26/2024] [Indexed: 08/06/2024]
Abstract
The tremors of Parkinson's disease (PD) and essential tremor (ET) are known to have overlapping characteristics that make it complicated for clinicians to distinguish them. While deep learning is robust in detecting features unnoticeable to humans, an opaque trained model is impractical in clinical scenarios, as coincidental correlations in the training data may be used by the model to make classifications, which may result in misdiagnosis. This work aims to overcome this challenge of deep learning models by introducing a multilayer BiLSTM network with explainable AI (XAI) that can better explain tremulous characteristics and quantify the respective discovered important regions in tremor differentiation. The proposed network classifies PD, ET, and normal tremors during drinking actions and derives the contribution from the tremor characteristics (i.e., time, frequency, amplitude, and actions) utilized in the classification task. The analysis shows that the XAI-BiLSTM marks the regions with high tremor amplitude as important in classification, which is verified by a high correlation between relevance distribution and tremor displacement amplitude. The XAI-BiLSTM discovered that the transition phase from arm resting to lifting (during the drinking cycle) is the most important action for classifying tremors. Additionally, the XAI-BiLSTM reveals frequency ranges that contribute to the classification of only one tremor class, which may be a potential distinctive feature for overcoming the overlapping frequencies problem. By revealing critical timing and frequency patterns unique to PD and ET tremors, the proposed XAI-BiLSTM model enables clinicians to make more informed classifications, potentially reducing misclassification rates and improving treatment outcomes.
Affiliation(s)
- Yu Xuan Teo
  - Department of Electrical & Robotics Engineering, School of Engineering, Monash University Malaysia, Malaysia.
- Rui En Lee
  - Department of Electrical & Robotics Engineering, School of Engineering, Monash University Malaysia, Malaysia.
- Surya Girinatha Nurzaman
  - Department of Mechanical Engineering, School of Engineering, Monash University Malaysia, Bandar Sunway, Malaysia.
- Chee Pin Tan
  - Department of Electrical & Robotics Engineering, School of Engineering, Monash University Malaysia, Malaysia.
- Ping Yi Chan
  - Department of Electrical & Robotics Engineering, School of Engineering, Monash University Malaysia, Malaysia.
4
Alkadri S, Del Maestro RF, Driscoll M. Unveiling surgical expertise through machine learning in a novel VR/AR spinal simulator: A multilayered approach using transfer learning and connection weights analysis. Comput Biol Med 2024; 179:108809. [PMID: 38944904 DOI: 10.1016/j.compbiomed.2024.108809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 06/10/2024] [Accepted: 06/24/2024] [Indexed: 07/02/2024]
Abstract
BACKGROUND Virtual and augmented reality surgical simulators, integrated with machine learning, are becoming essential for training psychomotor skills, and analyzing surgical performance. Despite the promise of methods like the Connection Weights Algorithm, the small sample sizes (small number of participants (N)) typical of these trials challenge the generalizability and robustness of models. Approaches like data augmentation and transfer learning from models trained on similar surgical tasks address these limitations. OBJECTIVE To demonstrate the efficacy of artificial neural network and transfer learning algorithms in evaluating virtual surgical performances, applied to a simulated oblique lateral lumbar interbody fusion technique in an augmented and virtual reality simulator. DESIGN The study developed and integrated artificial neural network algorithms within a novel simulator platform, using data from the simulated tasks to generate 276 performance metrics across motion, safety, and efficiency. Innovatively, it applies transfer learning from a pre-trained ANN model developed for a similar spinal simulator, enhancing the training process, and addressing the challenge of small datasets. SETTING Musculoskeletal Biomechanics Research Lab; Neurosurgical Simulation and Artificial Intelligence Learning Centre, McGill University, Montreal, Canada. PARTICIPANTS Twenty-seven participants divided into 3 groups: 9 post-residents, 6 senior and 12 junior residents. RESULTS Two models, a stand-alone model trained from scratch and another leveraging transfer learning, were trained on nine selected surgical metrics achieving 75 % and 87.5 % testing accuracy respectively. CONCLUSIONS This study presents a novel blueprint for addressing limited datasets in surgical simulations through the strategic use of transfer learning and data augmentation. It also evaluates and reinforces the application of the Connection Weights Algorithm from our previous publication. 
Together, these methodologies not only enhance the precision of performance classification but also advance the validation of surgical training platforms.
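The Connection Weights Algorithm named in this abstract is usually described, for a network with one hidden layer and one output, as summing the products of input-to-hidden and hidden-to-output weights for each input feature. The sketch below assumes that formulation; the weight values and the function name are hypothetical and are not taken from the study's models.

```python
def connection_weights(w_in_hidden, w_hidden_out):
    """Score each input feature of a one-hidden-layer, single-output network.

    w_in_hidden[i][j] is the weight from input i to hidden unit j.
    w_hidden_out[j] is the weight from hidden unit j to the output.
    The score of input i is the sum over j of w_in_hidden[i][j] * w_hidden_out[j].
    """
    return [
        sum(w_ij * v_j for w_ij, v_j in zip(row, w_hidden_out))
        for row in w_in_hidden
    ]


# Hypothetical weights: two input metrics, two hidden units.
w_in_hidden = [[0.5, -1.0], [2.0, 0.25]]
w_hidden_out = [1.0, 0.5]

scores = connection_weights(w_in_hidden, w_hidden_out)
```

In this toy case the first metric's two paths cancel, while the second metric has a clearly positive score; the sign indicates the direction of influence and the magnitude its relative strength.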
Affiliation(s)
- Sami Alkadri
  - Musculoskeletal Biomechanics Research Lab, Department of Mechanical Engineering, McGill University, Macdonald Engineering Building, 815 Sherbrooke St W, Montreal, H3A 2K7, QC, Canada; Neurosurgical Simulation and Artificial Intelligence Learning Centre, Department of Neurology & Neurosurgery, Montreal Neurological Institute, McGill University, 2200 Leo Pariseau, Suite, 2210, Montreal, H2X 4B3, Quebec, Canada
- Rolando F Del Maestro
  - Neurosurgical Simulation and Artificial Intelligence Learning Centre, Department of Neurology & Neurosurgery, Montreal Neurological Institute, McGill University, 2200 Leo Pariseau, Suite, 2210, Montreal, H2X 4B3, Quebec, Canada
- Mark Driscoll
  - Musculoskeletal Biomechanics Research Lab, Department of Mechanical Engineering, McGill University, Macdonald Engineering Building, 815 Sherbrooke St W, Montreal, H3A 2K7, QC, Canada; Orthopaedic Research Lab, Montreal General Hospital, 1650 Cedar Ave (LS1.409), Montreal, H3G 1A4, Quebec, Canada.
5
Singh S, Kaur N, Gehlot A. Application of artificial intelligence in drug design: A review. Comput Biol Med 2024; 179:108810. [PMID: 38991316 DOI: 10.1016/j.compbiomed.2024.108810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 05/31/2024] [Accepted: 06/24/2024] [Indexed: 07/13/2024]
Abstract
Artificial intelligence (AI) is a field of computer science that involves acquiring information, developing rule bases, and mimicking human behaviour. The fundamental concept behind AI is to create intelligent computer systems that can operate with minimal human intervention or without any intervention at all. These rule-based systems are developed using various machine learning and deep learning models, enabling them to solve complex problems. AI is integrated with these models to learn, understand, and analyse provided data. The rapid advancement of AI is reshaping numerous industries, with the pharmaceutical sector experiencing a notable transformation. AI is increasingly being employed to automate, optimize, and personalize various facets of the pharmaceutical industry, particularly in pharmacological research. Traditional drug development methods are known for being time-consuming, expensive, and inefficient, often taking around a decade and costing billions of dollars. The integration of AI techniques addresses these challenges by enabling the identification of compounds with desired properties from a vast pool of input drugs. Furthermore, it plays a crucial role in drug screening by predicting toxicity, bioactivity, ADME properties (absorption, distribution, metabolism, and excretion), physicochemical properties, and more. AI enhances the drug design process by improving the efficiency and accuracy of predicting drug behaviour, interactions, and properties. These approaches also significantly improve the precision of drug discovery processes and decrease clinical trial costs, leading to the development of more effective drugs.
Affiliation(s)
- Simrandeep Singh
  - Department of Electronics & Communication Engineering, UCRD, Chandigarh University, Gharuan, Punjab, India.
- Navjot Kaur
  - Department of Pharmacognosy, Amar Shaheed Baba Ajit Singh Jujhar Singh Memorial College of Pharmacy, Bela, Ropar, India
- Anita Gehlot
  - Uttaranchal Institute of technology, Uttaranchal University, Dehradun, India
6
Maryam, Rehman MU, Hussain I, Tayara H, Chong KT. A graph neural network approach for predicting drug susceptibility in the human microbiome. Comput Biol Med 2024; 179:108729. [PMID: 38955124 DOI: 10.1016/j.compbiomed.2024.108729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 06/04/2024] [Accepted: 06/08/2024] [Indexed: 07/04/2024]
Abstract
Recent studies have illuminated the critical role of the human microbiome in maintaining health and influencing the pharmacological responses of drugs. Clinical trials, encompassing approximately 150 drugs, have unveiled interactions with the gastrointestinal microbiome, resulting in the conversion of these drugs into inactive metabolites. It is imperative to explore the field of pharmacomicrobiomics during the early stages of drug discovery, prior to clinical trials. To achieve this, the utilization of machine learning and deep learning models is highly desirable. In this study, we have proposed graph-based neural network models, namely GCN, GAT, and GINCOV models, utilizing the SMILES dataset of drug microbiome. Our primary objective was to classify the susceptibility of drugs to depletion by gut microbiota. Our results indicate that the GINCOV surpassed the other models, achieving impressive performance metrics, with an accuracy of 93% on the test dataset. This proposed Graph Neural Network (GNN) model offers a rapid and efficient method for screening drugs susceptible to gut microbiota depletion and also encourages the improvement of patient-specific dosage responses and formulations.
Affiliation(s)
- Maryam
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Mobeen Ur Rehman
  - Khalifa University Center for Autonomous Robotic Systems (KUCARS), Khalifa University, United Arab Emirates
- Irfan Hussain
  - Khalifa University Center for Autonomous Robotic Systems (KUCARS), Khalifa University, United Arab Emirates
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea.
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea; Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju, 54896, South Korea.
7
Kataria R, Duhan N, Kaundal R. Navigating the human-monkeypox virus interactome: HuPoxNET atlas reveals functional insights. Front Microbiol 2024; 15:1399555. [PMID: 39155985 PMCID: PMC11327128 DOI: 10.3389/fmicb.2024.1399555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 07/09/2024] [Indexed: 08/20/2024] Open
Abstract
Monkeypox virus, a close relative of variola virus, has significantly increased the incidence of monkeypox disease in humans, with several clinical symptoms. The sporadic spread of the disease outbreaks has resulted in the need for a comprehensive understanding of the molecular mechanisms underlying disease infection and potential therapeutic targets. Protein-protein interactions play a crucial role in various cellular processes and regulate different immune signals during virus infection. Computational algorithms have gained high significance in the prediction of potential protein interaction pairs. Here, we developed a comprehensive database called HuPoxNET (https://kaabil.net/hupoxnet/) using the state-of-the-art MERN stack technology. The database leverages two sequence-based computational models to predict strain-specific protein-protein interactions between human and monkeypox virus proteins. Furthermore, various protein annotations of the human and viral proteins such as gene ontology, KEGG pathways, subcellular localization, protein domains, and novel drug targets identified from our study are also available on the database. HuPoxNET is a user-friendly platform for the scientific community to gain more insights into the monkeypox disease infection and aid in the development of therapeutic drugs against the disease.
Affiliation(s)
- Raghav Kataria
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Logan, UT, United States
  - Bioinformatics Facility, Center for Integrated BioSystems, Logan, UT, United States
- Naveen Duhan
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Logan, UT, United States
  - Bioinformatics Facility, Center for Integrated BioSystems, Logan, UT, United States
- Rakesh Kaundal
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Logan, UT, United States
  - Bioinformatics Facility, Center for Integrated BioSystems, Logan, UT, United States
  - Department of Computer Science, College of Science, Utah State University, Logan, UT, United States
8
Zheng X, Lamoth CJ, Timmerman H, Otten E, Reneman MF. Establishing central sensitization inventory cut-off values in Dutch-speaking patients with chronic low back pain by unsupervised machine learning. Comput Biol Med 2024; 178:108739. [PMID: 38875910 DOI: 10.1016/j.compbiomed.2024.108739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 05/29/2024] [Accepted: 06/08/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND Human Assumed Central Sensitization (HACS) is involved in the development and maintenance of chronic low back pain (CLBP). The Central Sensitization Inventory (CSI) was developed to evaluate the presence of HACS, with a cut-off value of 40/100. However, various factors including pain conditions (e.g., CLBP), contexts, and gender may influence this cut-off value. Unsupervised clustering approaches can address these complexities by considering diverse factors and exploring possible HACS-related subgroups. Therefore, this study aimed to determine the cut-off values for a Dutch-speaking population with CLBP based on unsupervised machine learning. METHODS Questionnaire data covering pain, physical, and psychological aspects were collected from patients with CLBP and age-matched healthy controls (HC). Four clustering approaches were applied to identify HACS-related subgroups based on the questionnaire data and gender. The clustering performance was assessed using internal and external indicators. Subsequently, receiver operating characteristic (ROC) analysis was conducted on the best clustering results to determine the optimal cut-off values. RESULTS The study included 63 HCs and 88 patients with CLBP. Hierarchical clustering yielded the best results, identifying three clusters: a healthy group, a CLBP group with low HACS level, and a CLBP group with high HACS level. The cut-off value for the overall group was 35 (sensitivity 0.76, specificity 0.76). CONCLUSION This study found distinct patient subgroups. An overall CSI cut-off value of 35 was suggested. This study may provide new insights into identifying HACS-related patterns and contributes to establishing accurate cut-off values.
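The abstract does not say which criterion was used to pick the cut-off from the ROC curve. A common choice is Youden's J (sensitivity + specificity - 1); the sketch below assumes that criterion and uses made-up CSI scores and labels, so neither the data nor the function name comes from the study.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A score >= cutoff is counted as positive. labels are 1 (positive) or 0.
    Ties in J keep the lowest cutoff.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative case")

    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens = tp / pos
        spec = tn / neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


# Hypothetical questionnaire scores and cluster-derived labels (toy data).
toy_scores = [20, 25, 30, 34, 36, 38, 45, 50]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, sensitivity, specificity = best_cutoff(toy_scores, toy_labels)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, can select a different threshold on the same data, so the criterion should be reported alongside the cut-off.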
Affiliation(s)
- Xiaoping Zheng
  - University of Groningen, University Medical Center Groningen, Department of Human Movement Sciences, Groningen, the Netherlands
- Claudine Jc Lamoth
  - University of Groningen, University Medical Center Groningen, Department of Human Movement Sciences, Groningen, the Netherlands
- Hans Timmerman
  - University of Groningen, University Medical Center Groningen, Department of Anesthesiology, Pain Center, Groningen, the Netherlands
- Egbert Otten
  - University of Groningen, University Medical Center Groningen, Department of Human Movement Sciences, Groningen, the Netherlands
- Michiel F Reneman
  - University of Groningen, University Medical Center Groningen, Department of Rehabilitation Medicine, Groningen, the Netherlands.
9
El-Assaad AM, Hamieh T. SARS-CoV-2: Prediction of critical ionic amino acid mutations. Comput Biol Med 2024; 178:108688. [PMID: 38870723 DOI: 10.1016/j.compbiomed.2024.108688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 05/26/2024] [Accepted: 06/01/2024] [Indexed: 06/15/2024]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes coronavirus disease 2019 (COVID-19), has been studied thoroughly, and several variants with their corresponding mutations have been identified across the world. Studies and vaccine development focus on the genetic mutations of the S protein due to its vital role in allowing the virus to attach to and fuse with the membrane of a host cell. In this perspective, we study the effects of all ionic amino acid mutations of the SARS-CoV-2 viral spike protein S1 when bound to Antibody CC12.1 within the SARS-CoV-2:CC12.1 complex model. Binding free energy calculations between SARS-CoV-2 and antibody CC12.1 are based on the Analysis of Electrostatic Similarities of Proteins (AESOP) framework, where the electrostatic potentials are calculated using the Adaptive Poisson-Boltzmann Solver (APBS). The atomic radii and charges that feed into the APBS calculations are calculated using the PDB2PQR software. Our results are the first to propose, in silico, potentially life-threatening mutations of SARS-CoV-2 beyond the present mutations found in the five common variants worldwide. We find that each of the mutations K378A, R408A, K424A, R454A, R457A, K458A, and K462A plays a significant role in the binding to Antibody CC12.1, since they are turned into strong inhibitors on both chains of the S1 protein, whereas the mutations D405A, D420A, and D427A appear to play important roles in this binding, as they are turned into mild inhibitors on both chains of the S1 protein.
Affiliation(s)
- Atlal M El-Assaad
  - Department of Electrical Engineering & Computer Science, University of Toledo (UT), Toledo OH 43606, USA; Department of Computer Science, Lebanese International University (LIU), Bekaa, Lebanon.
- Tayssir Hamieh
  - Faculty of Science and Engineering, Maastricht University, P.O. Box 616, 6200 MD Maastricht, the Netherlands; Laboratory of Materials, Catalysis, Environment and Analytical Methods (MCEMA), Faculty of Sciences, Lebanese University, Hadath, Lebanon.
10
Srisongkram T. DeepRA: A novel deep learning-read-across framework and its application in non-sugar sweeteners mutagenicity prediction. Comput Biol Med 2024; 178:108731. [PMID: 38870727 DOI: 10.1016/j.compbiomed.2024.108731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 05/07/2024] [Accepted: 06/08/2024] [Indexed: 06/15/2024]
Abstract
Non-sugar sweeteners (NSSs), or artificial sweeteners, have been used as food chemicals since World War II. NSSs, however, also raise concerns about their mutagenicity. Evaluating the mutagenic potential of NSSs is crucial for food safety; this step is needed for every new chemical registration in the food and pharmaceutical industries. A computational assessment requires less time, less money, and fewer animals than in vivo experiments; thus, this study developed a novel computational method, called DeepRA, that combines an ensemble convolutional deep neural network with read-across algorithms to classify the mutagenicity of chemicals. The mutagenicity data were obtained from the curated Ames test data set. The DeepRA model was developed using both molecular descriptors and molecular fingerprints. The resulting DeepRA model provides accurate and reliable mutagenicity classification on an independent test set. This model was then used to examine NSS-related chemicals, enabling the evaluation of mutagenicity for NSS-like substances. Finally, the model is publicly available at https://github.com/taraponglab/deepra for further use in chemical regulation and risk assessment.
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand.
11
Ghafoor H, Asim MN, Ibrahim MA, Ahmed S, Dengel A. CAPTURE: Comprehensive anti-cancer peptide predictor with a unique amino acid sequence encoder. Comput Biol Med 2024; 176:108538. [PMID: 38759585 DOI: 10.1016/j.compbiomed.2024.108538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 04/26/2024] [Accepted: 04/28/2024] [Indexed: 05/19/2024]
Abstract
The key properties of anticancer peptides (ACPs), including bioactivity, high efficacy, low toxicity, and lack of drug resistance, make them ideal candidates for cancer therapies. To explore the potential of ACPs and accelerate the development of cancer therapies, 53 artificial intelligence-supported computational predictors have been developed for ACP versus non-ACP classification, but only one predictor has been developed for annotating ACP functional types. Moreover, these predictors extract amino acid distribution patterns to transform peptide sequences into statistical vectors that are then fed to classifiers for discriminating peptide sequences and annotating peptide functional classes. Overall, these predictors fail to extract diverse types of amino acid distribution patterns from peptide sequences. This paper presents a unique CARE encoder that transforms peptide sequences into statistical vectors by extracting 4 different types of distribution patterns: correlation, distribution, composition, and transition. On public benchmark datasets, the potential of the proposed encoder is explored under two evaluation settings, intrinsic and extrinsic. Extrinsic evaluation indicates that 12 different machine learning classifiers achieve superior performance with the proposed encoder compared to 55 existing encoders. Furthermore, intrinsic evaluation reveals that, unlike existing encoders, the proposed encoder generates more discriminative clusters for the ACP and non-ACP classes. Across 8 public benchmark ACP versus non-ACP classification datasets, the CAPTURE predictor, based on the proposed encoder and an AdaBoost classifier, outperforms existing predictors by average margins of 1%, 4%, and 2% in accuracy, recall, and MCC, respectively. In a generalizability case study across 7 benchmark anti-microbial peptide classification datasets, CAPTURE surpasses existing predictors by an average AU-ROC margin of 2%.
The CAPTURE predictive pipeline, combined with the label powerset method, outperforms the state-of-the-art ACP functional-type predictor by 5%, 5%, 5%, 6%, and 3% in terms of average accuracy, subset accuracy, precision, recall, and F1, respectively. The CAPTURE web application is available at https://sds_genetic_analysis.opendfki.de/CAPTURE.
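The abstract names composition as one of the four pattern types but does not describe how the CARE encoder computes it. As a rough illustration of what a composition feature can look like (a generic amino acid frequency vector, not the authors' encoder; the function name and example peptide are hypothetical), a peptide can be mapped to the fraction of each of the 20 standard residues:

```python
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(peptide: str) -> dict:
    """Fraction of each standard amino acid in the peptide sequence."""
    seq = peptide.upper()
    if not seq:
        raise ValueError("empty peptide")
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


# Hypothetical four-residue peptide: two alanines, two lysines.
vec = composition("AAKK")
```

Every peptide yields a vector of the same length regardless of sequence length, which is what lets such features feed standard classifiers; the correlation, distribution, and transition patterns mentioned in the abstract would add order-aware information that this vector discards.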
Affiliation(s)
- Hina Ghafoor
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany.
- Muhammad Ali Ibrahim
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Andreas Dengel
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
12
Chikhale RV, Choudhary R, Malhotra J, Eldesoky GE, Mangal P, Patil PC. Identification of novel hit molecules targeting M. tuberculosis polyketide synthase 13 by combining generative AI and physics-based methods. Comput Biol Med 2024; 176:108573. [PMID: 38723396 DOI: 10.1016/j.compbiomed.2024.108573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 05/05/2024] [Accepted: 05/06/2024] [Indexed: 05/31/2024]
Abstract
In this work, we investigated the Pks13-TE domain, which plays a critical role in the viability of mycobacteria. We used a series of AI and physics-based tools to identify Pks13-TE inhibitors. Reinvent 4, pKCSM, KDeep, and SwissADME are AI/ML-based tools; AutoDock Vina, PLANTS, MDS, and MM-GBSA are physics-based methods. A combination of these methods yields powerful support in the drug discovery cycle. Known inhibitors of Pks13-TE were collected, curated, and used as input for the AI-based tools, and Mol2Mol molecular optimisation methods generated novel inhibitors. These ligands were filtered using physics-based methods such as molecular docking and molecular dynamics, with multiple tools used for consensus generation. Rigorous analysis was performed on the selected compounds to reduce the chemical space while retaining the most promising compounds. Molecular interactions, the stability of the protein-ligand complexes, and binding energies comparable to that of the native ligand were essential factors for narrowing the ligand set. The ligands that passed docking, MDS, and binding energy calculations were further tested for their ADMET properties, since these are among the essential criteria for this series of molecules. Ligands Mt1 to Mt6 were found to have excellent predicted pharmacokinetic, pharmacodynamic, and toxicity profiles and good synthesisability.
Affiliation(s)
- Rupesh V Chikhale
- Department of Pharmaceutical and Biological Chemistry, School of Pharmacy, University College London, London, UK.
| | - Rinku Choudhary
- SilicoScientia Private Limited, Nagananda Commercial Complex, No. 07/3, 15/1, 18th Main Road, Jayanagar 9th Block, Bengaluru, 5600413, India; Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth Deemed to Be University, Pune-Satara Road, Pune, India
| | - Jagriti Malhotra
- SilicoScientia Private Limited, Nagananda Commercial Complex, No. 07/3, 15/1, 18th Main Road, Jayanagar 9th Block, Bengaluru, 5600413, India; Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth Deemed to Be University, Pune-Satara Road, Pune, India
| | - Gaber E Eldesoky
- Chemistry Department, College of Science, King Saud University, Riyadh, 11451, Saudi Arabia
| | - Parth Mangal
- SilicoScientia Private Limited, Nagananda Commercial Complex, No. 07/3, 15/1, 18th Main Road, Jayanagar 9th Block, Bengaluru, 5600413, India; Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth Deemed to Be University, Pune-Satara Road, Pune, India
| | - Pritee Chunarkar Patil
- Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth Deemed to Be University, Pune-Satara Road, Pune, India
|
13
|
Le VT, Malik MS, Tseng YH, Lee YC, Huang CI, Ou YY. DeepPLM_mCNN: An approach for enhancing ion channel and ion transporter recognition by multi-window CNN based on features from pre-trained language models. Comput Biol Chem 2024; 110:108055. [PMID: 38555810 DOI: 10.1016/j.compbiolchem.2024.108055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2023] [Revised: 02/28/2024] [Accepted: 03/19/2024] [Indexed: 04/02/2024]
Abstract
Accurate classification of membrane proteins like ion channels and transporters is critical for elucidating cellular processes and drug development. We present DeepPLM_mCNN, a novel framework combining Pretrained Language Models (PLMs) and multi-window convolutional neural networks (mCNNs) for effective classification of membrane proteins into ion channels and ion transporters. Our approach extracts informative features from protein sequences by utilizing various PLMs, including TAPE, ProtT5_XL_U50, ESM-1b, ESM-2_480, and ESM-2_1280. These PLM-derived features are then input into a mCNN architecture to learn conserved motifs important for classification. When evaluated on ion transporters, our best performing model utilizing ProtT5 achieved 90% sensitivity, 95.8% specificity, and 95.4% overall accuracy. For ion channels, we obtained 88.3% sensitivity, 95.7% specificity, and 95.2% overall accuracy using ESM-1b features. Our proposed DeepPLM_mCNN framework demonstrates significant improvements over previous methods on unseen test data. This study illustrates the potential of combining PLMs and deep learning for accurate computational identification of membrane proteins from sequence data alone. Our findings have important implications for membrane protein research and drug development targeting ion channels and transporters. The data and source codes in this study are publicly available at the following link: https://github.com/s1129108/DeepPLM_mCNN.
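The sensitivity, specificity, and overall accuracy quoted in this abstract are standard confusion-matrix ratios. As a minimal sketch (the function name and the counts below are hypothetical and are not taken from the paper), they can be computed as follows:

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        # Fraction of true positives that were recovered (recall).
        "sensitivity": tp / (tp + fn),
        # Fraction of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp),
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical, imbalanced counts chosen only for illustration (not the paper's data).
metrics = classification_metrics(tp=9, fn=1, tn=180, fp=10)
```

With a negative class much larger than the positive class, as in these illustrative counts, overall accuracy tracks specificity more closely than sensitivity, which is one plausible reading of why the reported accuracy sits near the reported specificity.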
Affiliation(s)
- Van-The Le
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan
| | - Muhammad-Shahid Malik
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; Department of Computer Science and Engineering, Karakoram International University, Pakistan
| | - Yi-Hsuan Tseng
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan
| | - Yu-Cheng Lee
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan
| | - Cheng-I Huang
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan
| | - Yu-Yen Ou
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; Graduate Program in Biomedical Informatics, Yuan Ze University, Chung-Li, 32003, Taiwan.
|
14
|
da Silva LSA, Seman LO, Camponogara E, Mariani VC, Dos Santos Coelho L. Bilinear optimization of protein structure prediction: An exact approach via AB off-lattice model. Comput Biol Med 2024; 176:108558. [PMID: 38754216 DOI: 10.1016/j.compbiomed.2024.108558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/25/2024] [Accepted: 05/05/2024] [Indexed: 05/18/2024]
Abstract
Protein structure prediction (PSP) remains a central challenge in computational biology due to its inherent complexity and high dimensionality. While numerous heuristic approaches have appeared in the literature, their success varies. The AB off-lattice model, which characterizes proteins as sequences of A (hydrophobic) and B (hydrophilic) beads, presents a simplified perspective on PSP. This work presents a mathematical optimization-based methodology capitalizing on the off-lattice AB model. Dissecting the inherent non-linearities of the energy landscape of protein folding allowed for formulating the PSP as a bilinear optimization problem. This formulation was achieved by introducing auxiliary variables and constraints that encapsulate the nuanced relationship between the protein's conformational space and its energy landscape. The proposed bilinear model exhibited notable accuracy in pinpointing the global minimum energy conformations on a benchmark dataset presented by the Protein Data Bank (PDB). Compared to traditional heuristic-based methods, this bilinear approach yielded exact solutions, reducing the likelihood of local minima entrapment. This research highlights the potential of reframing the traditionally non-linear protein structure prediction problem into a bilinear optimization problem through the off-lattice AB model. Such a transformation offers a route toward methodologies that can determine the global solution, challenging current PSP paradigms. Exploration into hybrid models, merging bilinear optimization and heuristic components, might present an avenue for balancing accuracy with computational efficiency.
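For orientation, the energy function usually associated with the 2D AB off-lattice model can be sketched as below. This assumes the commonly cited Stillinger-style form (a bending term plus a Lennard-Jones-like term whose attraction depends on the bead types); the abstract does not state the exact variant or the bilinear reformulation used by the authors, so the function names and the form itself are assumptions here.

```python
import math


def coupling(a: str, b: str) -> float:
    """Species-dependent coefficient: +1 for A-A, +0.5 for B-B, -0.5 for A-B."""
    xi = 1 if a == "A" else -1
    xj = 1 if b == "A" else -1
    return (1 + xi + xj + 5 * xi * xj) / 8


def ab_energy(sequence: str, coords: list[tuple[float, float]]) -> float:
    """Energy of a 2D conformation under the assumed AB off-lattice form."""
    n = len(sequence)
    # Bending term: 1/4 * (1 - cos(theta)) at every interior bead, where theta
    # is the angle between successive bond vectors (0 for a straight chain).
    bend = 0.0
    for i in range(1, n - 1):
        ux, uy = coords[i][0] - coords[i - 1][0], coords[i][1] - coords[i - 1][1]
        vx, vy = coords[i + 1][0] - coords[i][0], coords[i + 1][1] - coords[i][1]
        cos_theta = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
        bend += 0.25 * (1 - cos_theta)
    # Non-bonded term over all pairs at least two positions apart in the chain.
    pair = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            pair += 4 * (r ** -12 - coupling(sequence[i], sequence[j]) * r ** -6)
    return bend + pair


# Straight three-bead chain with unit bond lengths (illustrative input only).
energy = ab_energy("AAA", [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
```

A heuristic method would minimize `ab_energy` over the bend angles by local search; an exact approach such as the one described in the abstract instead reformulates the non-linear terms so that a global solver can certify the minimum.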
Affiliation(s)
- Luiza Scapinello Aquino da Silva
- Electrical Engineering Graduate Program (PPGEE), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil.
| | - Laio Oriel Seman
- Department of Automation and Systems Engineering, Federal University of Santa Catarina (UFSC), Engenheiro Agronômico Andrei Cristian Ferreira, Florianópolis, 88040-900, Santa Catarina, Brazil
| | - Eduardo Camponogara
- Department of Automation and Systems Engineering, Federal University of Santa Catarina (UFSC), Engenheiro Agronômico Andrei Cristian Ferreira, Florianópolis, 88040-900, Santa Catarina, Brazil
| | - Viviana Cocco Mariani
- Electrical Engineering Graduate Program (PPGEE), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil; Mechanical Engineering Graduate Program (PGMec), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil
| | - Leandro Dos Santos Coelho
- Electrical Engineering Graduate Program (PPGEE), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil
|
15
|
Siuly S, Khare SK, Kabir E, Sadiq MT, Wang H. An efficient Parkinson's disease detection framework: Leveraging time-frequency representation and AlexNet convolutional neural network. Comput Biol Med 2024; 174:108462. [PMID: 38599069 DOI: 10.1016/j.compbiomed.2024.108462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 04/07/2024] [Accepted: 04/07/2024] [Indexed: 04/12/2024]
Abstract
Parkinson's disease (PD) is a progressive neurodegenerative disorder affecting the quality of life of over 10 million individuals worldwide. Early diagnosis is crucial for timely intervention and better patient outcomes. Electroencephalogram (EEG) signals are commonly used for early PD diagnosis due to their potential in monitoring disease progression. However, traditional EEG-based methods lack exploration of the brain regions that provide essential information about PD, and their performance falls short for real-time applications. To address these limitations, this study proposes a novel approach using a Time-Frequency Representation (TFR) based AlexNet Convolutional Neural Network (CNN) model to explore EEG channel-based analysis and identify the brain regions critical for efficiently diagnosing PD from EEG data. The Wavelet Scattering Transform (WST) is employed to capture distinct temporal and spectral characteristics, while the AlexNet CNN is utilized to detect complex spatial patterns at different scales, accurately identifying intricate EEG patterns associated with PD. The experimental results on two real-time EEG PD datasets, the San Diego dataset and the Iowa dataset, demonstrate that frontal and central brain regions, including the AF4 and AFz electrodes, provide more representative features for PD detection than other regions. The proposed architecture achieves an accuracy of 99.84% for the San Diego dataset and 95.79% for the Iowa dataset, outperforming existing EEG-based PD detection methods. The findings of this research will help to create an essential technology for efficient PD diagnosis, enhancing patient care and quality of life.
Affiliation(s)
- Siuly Siuly
- Institute for Sustainable Industries & Liveable Cities, Victoria University, Melbourne, Australia; Centre for Health Research, University of Southern Queensland, Toowoomba, Australia.
| | - Smith K Khare
- Mærsk Mc-Kinney Møller Institute, Faculty of Engineering, University of Southern Denmark, Denmark
| | - Enamul Kabir
- School of Mathematics, Physics and Computing, University of Southern Queensland, Toowoomba, Australia
| | - Muhammad Tariq Sadiq
- School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom
| | - Hua Wang
- Institute for Sustainable Industries & Liveable Cities, Victoria University, Melbourne, Australia
|
16
|
Nada H, Kim S, Lee K. PT-Finder: A multi-modal neural network approach to target identification. Comput Biol Med 2024; 174:108444. [PMID: 38636325 DOI: 10.1016/j.compbiomed.2024.108444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 04/04/2024] [Accepted: 04/07/2024] [Indexed: 04/20/2024]
Abstract
Efficient target identification for bioactive compounds, including novel synthetic analogs, is crucial for accelerating the drug discovery pipeline. However, the process of target identification presents significant challenges and is often expensive, which in turn can hinder the drug discovery efforts. To address these challenges machine learning applications have arisen as a promising approach for predicting the targets for novel chemical compounds. These methods allow the exploration of ligand-target interactions, uncovering of biochemical mechanisms, and the investigation of drug repurposing. Typically, the current target identification tools rely on assessing ligand structural similarities. Herein, a multi-modal neural network model was built using a library of proteins, their respective sequences, and active inhibitors. Subsequent validations showed the model to possess accuracy of 82 % and MPRAUC of 0.80. Leveraging the trained model, we developed PT-Finder (Protein Target Finder), a user-friendly offline application that is capable of predicting the target proteins for hundreds of compounds within a few seconds. This combination of offline operation, speed, and accuracy positions PT-Finder as a powerful tool to accelerate drug discovery workflows. PT-Finder and its source codes have been made freely accessible for download at https://github.com/PT-Finder/PT-Finder.
Affiliation(s)
- Hossam Nada
- BK21 FOUR Team and Integrated Research Institute for Drug Development, College of Pharmacy, Dongguk University-Seoul, Goyang, 10326, Republic of Korea
| | - Sungdo Kim
- BK21 FOUR Team and Integrated Research Institute for Drug Development, College of Pharmacy, Dongguk University-Seoul, Goyang, 10326, Republic of Korea
| | - Kyeong Lee
- BK21 FOUR Team and Integrated Research Institute for Drug Development, College of Pharmacy, Dongguk University-Seoul, Goyang, 10326, Republic of Korea.
|
17
|
Zhang H, Liu X, Cheng W, Wang T, Chen Y. Prediction of drug-target binding affinity based on deep learning models. Comput Biol Med 2024; 174:108435. [PMID: 38608327 DOI: 10.1016/j.compbiomed.2024.108435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/05/2024] [Accepted: 04/07/2024] [Indexed: 04/14/2024]
Abstract
The prediction of drug-target binding affinity (DTA) plays an important role in drug discovery. Computerized virtual screening techniques have been used for DTA prediction, greatly reducing the time and economic costs of drug discovery. However, these techniques have not succeeded in reversing the low success rate of new drug development. In recent years, the continuous development of deep learning (DL) technology has brought new opportunities for drug discovery through DTA prediction, shifting DTA prediction from traditional machine learning methods to DL. The DL frameworks used for DTA prediction include convolutional neural networks (CNN), graph convolutional neural networks (GCN), recurrent neural networks (RNN), and reinforcement learning (RL), among others. This review article summarizes the available literature on DTA prediction using DL models, including DTA quantification metrics and datasets, and the DL algorithms used for DTA prediction (including input representation of models, neural network frameworks, evaluation indicators, and model interpretability). In addition, the opportunities, challenges, and prospects of applying DL frameworks to DTA prediction in drug discovery are discussed.
Affiliation(s)
- Hao Zhang
- College of Science, Nanjing Agricultural University, Nanjing, 210095, China
| | - Xiaoqian Liu
- College of Science, Nanjing Agricultural University, Nanjing, 210095, China
| | - Wenya Cheng
- College of Science, Nanjing Agricultural University, Nanjing, 210095, China
| | - Tianshi Wang
- College of Science, Nanjing Agricultural University, Nanjing, 210095, China
| | - Yuanyuan Chen
- College of Science, Nanjing Agricultural University, Nanjing, 210095, China.
|
18
|
Karampuri A, Kundur S, Perugu S. Exploratory drug discovery in breast cancer patients: A multimodal deep learning approach to identify novel drug candidates targeting RTK signaling. Comput Biol Med 2024; 174:108433. [PMID: 38642491 DOI: 10.1016/j.compbiomed.2024.108433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/04/2024] [Accepted: 04/07/2024] [Indexed: 04/22/2024]
Abstract
Breast cancer, a formidable and diverse malignancy predominantly affecting women globally, poses a significant threat due to its intricate genetic variability, which makes it challenging to diagnose accurately. Various therapies such as immunotherapy, radiotherapy, and diverse chemotherapy approaches like drug repurposing and combination therapy are widely used depending on cancer subtype and metastasis severity. Our study centers on a drug discovery strategy targeting potential drug candidates specific to RTK signalling, a prominently targeted receptor class in cancer. To accomplish this, we developed a multimodal deep neural network (MM-DNN) based QSAR model that integrates omics datasets covering genomic and proteomic expression data and drug responses, and validated it rigorously. The results show an R² value of 0.917 and an RMSE value of 0.312, affirming the model's predictive capabilities. Structural analogs of drug molecules specific to RTK signalling were sourced from the PubChem database, followed by screening to eliminate dissimilar compounds. Leveraging the MM-DNN-based QSAR model, we predicted the biological activity of these molecules and subsequently clustered them into three distinct groups. Feature importance analysis was performed. Consequently, we identified prime drug candidates tailored for each potential downstream regulatory protein within the RTK signalling pathway. This method speeds up the early stages of drug development by removing inactive compounds, providing a hopeful path in combating breast cancer.
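The two regression metrics reported in this abstract have standard definitions: R² = 1 - SS_res / SS_tot and RMSE = sqrt(mean squared error). A minimal sketch follows; the function names and the toy values are hypothetical and unrelated to the study's data.

```python
import math


def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Toy activity values for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
r2_value = r_squared(observed, predicted)
rmse_value = rmse(observed, predicted)
```

RMSE is expressed in the units of the predicted activity, so a value such as 0.312 is only interpretable relative to the scale of the response variable, which the abstract does not specify.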
Affiliation(s)
- Anush Karampuri
- Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
| | - Sunitha Kundur
- Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
| | - Shyam Perugu
- Department of Biotechnology, National Institute of Technology, Warangal, 500604, India.
|
19
|
Macedo-da-Silva J, Mule SN, Rosa-Fernandes L, Palmisano G. A computational pipeline elucidating functions of conserved hypothetical Trypanosoma cruzi proteins based on public proteomic data. Advances in Protein Chemistry and Structural Biology 2024; 138:401-428. [PMID: 38220431 DOI: 10.1016/bs.apcsb.2023.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
The proteome is complex, dynamic, and functionally diverse. Functional proteomics aims to characterize the functions of proteins in biological systems. However, there is a delay in annotating the function of proteins, even in model organisms. This gap is even greater in other organisms, including Trypanosoma cruzi, the causative agent of Chagas disease, a parasitic, systemic, and sometimes fatal illness. About 99.8% of the Trypanosoma cruzi proteome is not manually annotated (unreviewed), and more than 25% of these unreviewed proteins are conserved hypothetical proteins (CHPs), calling attention to the knowledge gap on the protein content of this organism. CHPs are proteins conserved among different species of various evolutionary lineages; however, they lack functional validation. This study describes a bioinformatics pipeline applied to public proteomic data to infer possible biological functions of conserved hypothetical Trypanosoma cruzi proteins. The adopted strategy consisted of collecting proteins differentially expressed between the epimastigote and metacyclic trypomastigote stages of Trypanosoma cruzi, followed by the functional characterization of these CHPs by applying a manifold learning technique for dimension reduction and 3D structure homology analysis (Spalog). We found panels of 25 and 26 upregulated proteins in the epimastigote and metacyclic trypomastigote stages, respectively; among these, 18 CHPs (8 in the epimastigote stage and 10 in the metacyclic stage) were characterized. The data generated corroborate the literature and complement the functional analyses of differentially regulated proteins at each stage, as they attribute potential functions to CHPs, which are frequently identified in Trypanosoma cruzi proteomics studies. However, experimental validation is required to deepen our understanding of the CHPs.
Affiliation(s)
- Janaina Macedo-da-Silva
- GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil
| | - Simon Ngao Mule
- GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil
| | - Livia Rosa-Fernandes
- GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil; Centre for Motor Neuron Disease Research, Faculty of Medicine, Health & Human Sciences, Macquarie Medical School, Sydney, NSW, Australia
| | - Giuseppe Palmisano
- GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil; School of Natural Sciences, Macquarie University, Sydney, NSW, Australia.
|
20
|
Sharma K, Saini N, Hasija Y. Identifying the mitochondrial metabolism network by integration of machine learning and explainable artificial intelligence in skeletal muscle in type 2 diabetes. Mitochondrion 2024; 74:101821. [PMID: 38040172 DOI: 10.1016/j.mito.2023.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 10/04/2023] [Accepted: 11/26/2023] [Indexed: 12/03/2023]
Abstract
Imbalance in glucose metabolism and insulin resistance are two primary features of type 2 diabetes mellitus. Its etiology is linked to mitochondrial dysfunction in skeletal muscle tissue. The mitochondria are vital organelles involved in ATP synthesis and metabolism. Elucidating the biological pathways that lead to mitochondrial dysfunction in type 2 diabetes can help us understand the pathophysiology of the disease. In this study, mitochondrial gene expression data were retrieved from GSE22309, GSE25462, and GSE18732 using MitoCarta 3.0, focusing specifically on genes associated with mitochondrial function in type 2 diabetes. Feature selection was performed on the expression dataset of skeletal muscle tissue from 107 control patients and 70 type 2 diabetes patients using the XGBoost algorithm, which had the highest accuracy. The results were interpreted by examining the feature importance deduced from the model using SHAP (SHapley Additive exPlanations). Next, to comprehend the biological connections, protein-protein and mRNA-miRNA networks were studied using STRING and MIENTURNET, respectively. The analysis revealed that BDH1, YARS2, AKAP10, RARS2, and MRPS31 were potential mitochondrial target genes among the other twenty genes. These genes are mainly involved in the transport and organization of mitochondria, regulation of its membrane potential, and intrinsic apoptotic signaling. The mRNA-miRNA interaction network revealed a significant role for miR-375, miR-30a-5p, miR-16-5p, miR-129-5p, miR-1229-3p, and miR-1224-3p in the regulation of mitochondrial function, with strong associations with type 2 diabetes. These results might aid in the identification of novel therapeutic targets and biomarkers for type 2 diabetes.
Affiliation(s)
- Kritika Sharma
- CSIR-Institute of Genomics and Integrative Biology, Mall Road, New Delhi 110007, India; Department of Biotechnology, Delhi Technological University, Delhi 110042, India
| | - Neeru Saini
- CSIR-Institute of Genomics and Integrative Biology, Mall Road, New Delhi 110007, India; Academy of Scientific & Innovative Research (AcSIR), Ghaziabad 201002, India
| | - Yasha Hasija
- Department of Biotechnology, Delhi Technological University, Delhi 110042, India.
|
21
|
Fu T, Zeng S, Zheng Q, Zhu F. The Important Role of Transporter Structures in Drug Disposition, Efficacy, and Toxicity. Drug Metab Dispos 2023; 51:1316-1323. [PMID: 37295948 DOI: 10.1124/dmd.123.001275] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/27/2023] [Revised: 05/27/2023] [Accepted: 06/02/2023] [Indexed: 06/12/2023] Open
Abstract
The ATP-binding cassette (ABC) and solute carrier (SLC) transporters are critical determinants of drug disposition, clinical efficacy, and toxicity as they specifically mediate the influx and efflux of various substrates and drugs. ABC transporters can modulate the pharmacokinetics of many drugs via mediating the translocation of drugs across biologic membranes. SLC transporters are important drug targets involved in the uptake of a broad range of compounds across the membrane. However, high-resolution experimental structures have been reported for a very limited number of transporters, which limits the study of their physiologic functions. In this review, we collected structural information on ABC and SLC transporters and described the application of computational methods in structure prediction. Taking P-glycoprotein (ABCB1) and serotonin transporter (SLC6A4) as examples, we assessed the pivotal role of structure in transport mechanisms, details of ligand-receptor interactions, drug selectivity, the molecular mechanisms of drug-drug interactions, and differences caused by genetic polymorphisms. The data collected contributes toward safer and more effective pharmacological treatments. SIGNIFICANCE STATEMENT: The experimental structure of ATP-binding cassette and solute carrier transporters was collected, and the application of computational methods in structure prediction was described. P-glycoprotein and serotonin transporter were used as examples to reveal the pivotal role of structure in transport mechanisms, drug selectivity, the molecular mechanisms of drug-drug interactions, and differences caused by genetic polymorphisms.
Affiliation(s)
- Tingting Fu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China (F.Z.); School of Pharmaceutical Sciences, Jilin University, Changchun, China (T.F., Q.Z.); College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China (S.Z., F.Z.); and Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, China (F.Z.)
| | - Su Zeng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China (F.Z.); School of Pharmaceutical Sciences, Jilin University, Changchun, China (T.F., Q.Z.); College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China (S.Z., F.Z.); and Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, China (F.Z.)
| | - Qingchuan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China (F.Z.); School of Pharmaceutical Sciences, Jilin University, Changchun, China (T.F., Q.Z.); College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China (S.Z., F.Z.); and Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, China (F.Z.)
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, China (F.Z.); School of Pharmaceutical Sciences, Jilin University, Changchun, China (T.F., Q.Z.); College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China (S.Z., F.Z.); and Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, China (F.Z.)
|
22
|
Omer A. MicroRNAs as powerful tool against COVID-19: Computational perspective. WIREs Mech Dis 2023; 15:e1621. [PMID: 37345625 DOI: 10.1002/wsbm.1621] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Revised: 04/13/2023] [Accepted: 05/23/2023] [Indexed: 06/23/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus responsible for the current COVID-19 pandemic. MiRNAs, a component of RNAi technology, belong to the family of short, noncoding ssRNAs and may be crucial in the battle against this global threat, since they are involved in regulating complex biochemical pathways and may prevent viral proliferation, translation, and host expression. Complex metabolic pathways are modulated by the activity of many proteins, mRNAs, and miRNAs working together in miRNA-mediated genetic control. The amount of omics data has increased dramatically in recent years. This massive, linked, yet complex metabolic regulatory network data offers a wealth of opportunity for iterative analysis; hence, extensive, in-depth, but time-efficient screening is necessary to make new discoveries, and this is readily performed with the use of bioinformatics. We have reviewed the literature on microRNAs, bioinformatics, and COVID-19 infection to summarize (1) the function of miRNAs in combating COVID-19, (2) the use of computational methods in combating COVID-19 in certain noteworthy studies, and (3) the computational tools used by these studies against COVID-19 for several purposes. This article is categorized under: Infectious Diseases > Computational Models.
Affiliation(s)
- Ankur Omer
- Government College Silodi, MPHED, Katni, Madhya Pradesh, India
|
23
|
Mijit A, Wang X, Li Y, Xu H, Chen Y, Xue W. Mapping synthetic binding proteins epitopes on diverse protein targets by protein structure prediction and protein-protein docking. Comput Biol Med 2023; 163:107183. [PMID: 37352638 DOI: 10.1016/j.compbiomed.2023.107183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 06/12/2023] [Accepted: 06/13/2023] [Indexed: 06/25/2023]
Abstract
Synthetic binding proteins (SBPs) are a class of artificial proteins engineered from privileged protein scaffolds, which can form highly specific molecular recognition interfaces with a variety of targets. Owing to their small size, high stability, and good tissue permeability, SBPs have important applications in biomedical research and in disease diagnosis and treatment. However, knowledge of SBP epitopes on the structures of target proteins is still limited, which hinders the development of novel SBPs. In this study, based on the currently available information on SBPs and their targets, 96 pairs of interacting proteins, covering 96 representative SBPs and 80 different targets, were systematically investigated using state-of-the-art computational modeling techniques, including AlphaFold2 protein structure prediction and Rosetta protein-protein docking. As a result, 71 of the 96 pairs were successfully docked, of which 18, 33, and 20 pairs were defined as models of high, medium, and acceptable quality, respectively. In addition, the interface information was analyzed to decipher the interaction types that drive recognition between SBPs and their targets. Overall, this work provides important structural information for understanding the mechanism of action of other SBPs with the same protein scaffold, and for aiding the rational protein engineering and design of novel SBPs with biomedical applications.
Affiliation(s)
- Arzu Mijit
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
| | - Xiaona Wang
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
| | - Yanlin Li
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
| | - Hangwei Xu
- School of Medicine, Hangzhou City University, Hangzhou, 310000, China
| | - Yingjun Chen
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China.
| | - Weiwei Xue
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China.
|
24
|
Sahoo BR, Bardwell JCA. SERF, a family of tiny highly conserved, highly charged proteins with enigmatic functions. FEBS J 2023; 290:4150-4162. [PMID: 35694898 DOI: 10.1111/febs.16555] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/15/2022] [Revised: 06/07/2022] [Accepted: 06/10/2022] [Indexed: 11/27/2022]
Abstract
Amyloid formation is a misfolding process that has been linked to age-related diseases, including Alzheimer's and Huntington's. Understanding how cellular factors affect this process in vivo is vital in realizing the dream of controlling this insidious process that robs so many people of their humanity. SERF (small EDRK-rich factor) was initially isolated as a factor that accelerated polyglutamine amyloid formation in a C. elegans model. SERF knockouts inhibit amyloid formation of a number of proteins that include huntingtin, α-synuclein and β-amyloid which are associated with Huntington's, Parkinson's and Alzheimer's disease, respectively, and purified SERF protein speeds their amyloid formation in vitro. SERF proteins are highly conserved, highly charged and conformationally dynamic proteins that form a fuzzy complex with amyloid precursors. They appear to act by specifically accelerating the primary step of amyloid nucleation. Brain-specific SERF knockout mice, though viable, appear to be more prone to deposition of amyloids, and show modified fibril morphology. Whole-body knockouts are perinatally lethal due to an apparently unrelated developmental issue. Recently, it was found that SERF binds RNA and is localized to nucleic acid-rich membraneless compartments. SERF-related sequences are commonly found fused to zinc finger sequences. These results point towards a nucleic acid-binding function. How this function relates to their ability to accelerate amyloid formation is currently obscure. In this review, we discuss the possible biological functions of SERF family proteins in the context of their structural fuzziness, modulation of amyloid pathway, nucleic acid binding and their fusion to folded proteins.
Affiliation(s)
- Bikash R Sahoo
- Department of Molecular, Cellular and Developmental Biology, Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI, USA
| | - James C A Bardwell
- Department of Molecular, Cellular and Developmental Biology, Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI, USA
|
25
|
Liang S, Zhao Y, Jin J, Qiao J, Wang D, Wang Y, Wei L. Rm-LR: A long-range-based deep learning model for predicting multiple types of RNA modifications. Comput Biol Med 2023; 164:107238. [PMID: 37515874 DOI: 10.1016/j.compbiomed.2023.107238] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 05/24/2023] [Revised: 06/16/2023] [Accepted: 07/07/2023] [Indexed: 07/31/2023]
Abstract
Recent research has highlighted the pivotal role of RNA post-transcriptional modifications in the regulation of RNA expression and function. Accurate identification of RNA modification sites is important for understanding RNA function. In this study, we propose a novel RNA modification prediction method, namely Rm-LR, which leverages a long-range-based deep learning approach to accurately predict multiple types of RNA modifications using RNA sequences only. Rm-LR incorporates two large-scale RNA language pre-trained models to capture discriminative sequential information and learn important local features, which are subsequently integrated through a bilinear attention network. Rm-LR supports a total of ten RNA modification types (m6A, m1A, m5C, m5U, m6Am, Ψ, Am, Cm, Gm, and Um) and significantly outperforms the state-of-the-art methods in terms of predictive capability on benchmark datasets. Experimental results show the effectiveness and superiority of Rm-LR in the prediction of various RNA modifications, demonstrating the strong adaptability and robustness of our proposed model. We demonstrate that RNA language pre-trained models can learn dense biological sequential representations from a large-scale long-range RNA corpus while also enhancing the interpretability of the models. This work contributes to the development of accurate and reliable computational models for RNA modification prediction, providing insights into the complex landscape of RNA modifications.
Affiliation(s)
- Sirui Liang
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Yanxi Zhao
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Junru Jin
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Jianbo Qiao
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Ding Wang
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Yu Wang
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Leyi Wei
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China.
|
26
|
Spiers AJ, Dorfmueller HC, Jerdan R, McGregor J, Nicoll A, Steel K, Cameron S. Bioinformatics characterization of BcsA-like orphan proteins suggest they form a novel family of pseudomonad cyclic-β-glucan synthases. PLoS One 2023; 18:e0286540. [PMID: 37267309 PMCID: PMC10237404 DOI: 10.1371/journal.pone.0286540] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Accepted: 05/18/2023] [Indexed: 06/04/2023] Open
Abstract
Bacteria produce a variety of polysaccharides with functional roles in cell surface coating, surface and host interactions, and biofilms. We have identified an 'Orphan' bacterial cellulose synthase catalytic subunit (BcsA)-like protein found in four model pseudomonads, P. aeruginosa PA01, P. fluorescens SBW25, P. putida KT2440 and P. syringae pv. tomato DC3000. Pairwise alignments indicated that the Orphan and BcsA proteins shared less than 41% sequence identity, suggesting they may not have the same structural folds or function. We identified 112 Orphans among soil and plant-associated pseudomonads as well as in phytopathogenic and human opportunistic pathogenic strains. The wide distribution of these highly conserved proteins suggests they form a novel family of synthases producing a different polysaccharide. In silico analysis, including sequence comparisons, secondary structure and topology predictions, and protein structural modelling, revealed a two-domain transmembrane ovoid-like structure for the Orphan protein with a periplasmic glycosyl hydrolase family GH17 domain linked via a transmembrane region to a cytoplasmic glycosyltransferase family GT2 domain. We suggest the GT2 domain synthesises β-(1,3)-glucan that is transferred to the GH17 domain where it is cleaved and cyclised to produce cyclic-β-(1,3)-glucan (CβG). Our structural models are consistent with enzymatic characterisation and recent molecular simulations of the PaPA01 and PpKT2440 GH17 domains. They also provide a functional explanation linking PaPAK and PaPA14 Orphan (also known as NdvB) transposon mutants with CβG production and biofilm-associated antibiotic resistance. Importantly, cyclic glucans are also involved in osmoregulation, plant infection and induced systemic suppression, and our findings suggest this novel family of CβG synthases may provide a similar range of adaptive responses for pseudomonads.
Affiliation(s)
- Andrew J. Spiers
- School of Applied Sciences, Abertay University, Dundee, United Kingdom
| | - Helge C. Dorfmueller
- Division of Molecular Microbiology, School of Life Sciences, University of Dundee, Dundee, United Kingdom
| | - Robyn Jerdan
- School of Applied Sciences, Abertay University, Dundee, United Kingdom
| | - Jessica McGregor
- Nuffield Research Placement Students, School of Applied Sciences, Abertay University, Dundee, United Kingdom
| | - Abbie Nicoll
- Nuffield Research Placement Students, School of Applied Sciences, Abertay University, Dundee, United Kingdom
| | - Kenzie Steel
- Nuffield Research Placement Students, School of Applied Sciences, Abertay University, Dundee, United Kingdom
| | - Scott Cameron
- School of Applied Sciences, Abertay University, Dundee, United Kingdom
|
27
|
Luo X, Wang Y, Zou Q, Xu L. Recall DNA methylation levels at low coverage sites using a CNN model in WGBS. PLoS Comput Biol 2023; 19:e1011205. [PMID: 37315069 DOI: 10.1371/journal.pcbi.1011205] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/30/2022] [Accepted: 05/22/2023] [Indexed: 06/16/2023] Open
Abstract
DNA methylation is an important regulator of gene transcription. WGBS is the gold-standard approach for base-pair resolution quantification of DNA methylation, but it requires high sequencing depth. Many CpG sites have insufficient coverage in WGBS data, resulting in inaccurate DNA methylation levels at individual sites. Many state-of-the-art computational methods have been proposed to predict the missing values. However, many of them require either other omics datasets or cross-sample data, and most predict only the state of DNA methylation. In this study, we propose RcWGBS, which can impute the missing (or low-coverage) values from the DNA methylation levels on the adjacent sides. Deep learning techniques were employed for accurate prediction. The WGBS datasets of H1-hESC and GM12878 were down-sampled. The average difference between the DNA methylation level at 12× depth predicted by RcWGBS and that at >50× depth is less than 0.03 and 0.01 in the H1-hESC and GM12878 cells, respectively. RcWGBS performed better than METHimpute even when the sequencing depth was as low as 12×. Our work should help in processing methylation data of low sequencing depth, allowing researchers to save sequencing costs and improve data utilization through computational methods.
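The evaluation described in this abstract compares per-site methylation levels estimated at low depth with a high-depth reference. A minimal sketch of that comparison follows. It assumes the common definition of a site's methylation level as methylated reads divided by total reads, and it assumes that "average difference" means mean absolute difference; the function names and the toy read counts are hypothetical and are not taken from RcWGBS.

```python
from typing import Sequence


def methylation_level(methylated_reads: int, total_reads: int) -> float:
    """Fraction of reads supporting methylation at a single CpG site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= methylated_reads <= total_reads:
        raise ValueError("methylated_reads must lie in [0, total_reads]")
    return methylated_reads / total_reads


def mean_absolute_difference(
    predicted: Sequence[float], reference: Sequence[float]
) -> float:
    """Average per-site |predicted - reference| (assumed evaluation metric)."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical (methylated, total) read counts for three CpG sites.
low_depth_counts = [(9, 12), (3, 12), (12, 12)]    # about 12x coverage
high_depth_counts = [(40, 50), (10, 50), (50, 50)]  # about 50x coverage

low = [methylation_level(m, t) for m, t in low_depth_counts]
high = [methylation_level(m, t) for m, t in high_depth_counts]
print(mean_absolute_difference(low, high))
```

At 12× coverage a raw per-site level can only take values in steps of 1/12, which is one reason a low-depth estimate may deviate from the high-depth value even without sampling noise.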
Affiliation(s)
- Ximei Luo
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Yansu Wang
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong, China
|
28
|
Naorem LD, Sharma N, Raghava GPS. A web server for predicting and scanning of IL-5 inducing peptides using alignment-free and alignment-based method. Comput Biol Med 2023; 158:106864. [PMID: 37058758 DOI: 10.1016/j.compbiomed.2023.106864] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/01/2022] [Revised: 03/06/2023] [Accepted: 03/30/2023] [Indexed: 04/16/2023]
Abstract
Interleukin-5 (IL-5) is an attractive therapeutic target due to its pivotal role in several eosinophil-mediated diseases. The aim of this study is to develop a model for predicting IL-5 inducing antigenic regions in a protein with high precision. All models in this study were trained, tested and validated on 1907 experimentally validated IL-5 inducing peptides and 7759 non-IL-5 inducing peptides obtained from IEDB. Our primary analysis indicates that IL-5 inducing peptides are dominated by certain residues such as Ile, Asn, and Tyr. It was also observed that binders of a wide range of HLA alleles can induce IL-5. Initially, alignment-based methods were developed using similarity and motif search. These alignment-based methods provide high precision but poor coverage. To overcome this limitation, we explored alignment-free methods, which are mainly machine learning-based models. Firstly, models were developed using binary profiles, and an eXtreme Gradient Boosting-based model achieved a maximum AUC of 0.59. Secondly, composition-based models were developed, and our dipeptide-based random forest model achieved a maximum AUC of 0.74. Thirdly, a random forest model developed using 250 selected dipeptides achieved an AUC of 0.75 and an MCC of 0.29 on the validation dataset, the best among the alignment-free models. To improve the performance, we developed an ensemble or hybrid method that combines alignment-based and alignment-free methods. Our hybrid method achieved an AUC of 0.94 with an MCC of 0.60 on a validation/independent dataset. The best hybrid model developed in this study has been incorporated into a user-friendly web server and a standalone package named 'IL5pred' (https://webs.iiitd.edu.in/raghava/il5pred/).
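The abstract reports the Matthews correlation coefficient (MCC) alongside AUC, which is useful because the dataset is imbalanced (1907 positives against 7759 negatives). The sketch below uses the standard MCC formula computed from confusion-matrix counts; the function name and the degenerate "predict everything negative" classifier are illustrative assumptions and are not part of IL5pred.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC from confusion-matrix counts; 0.0 if undefined."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical classifier that labels every peptide as non-IL-5 inducing,
# evaluated on a set with the class sizes quoted in the abstract.
tp, fn = 0, 1907
tn, fp = 7759, 0

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"accuracy = {accuracy:.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Here accuracy is roughly 0.80 even though the classifier never identifies a positive peptide, while the MCC is 0, which is why MCC is a more informative summary on imbalanced data.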
Affiliation(s)
- Leimarembi Devi Naorem
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| | - Neelam Sharma
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| | - Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
|
29
|
Zhu F, Deng L, Dai Y, Zhang G, Meng F, Luo C, Hu G, Liang Z. PPICT: an integrated deep neural network for predicting inter-protein PTM cross-talk. Brief Bioinform 2023; 24:7035113. [PMID: 36781207 DOI: 10.1093/bib/bbad052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 01/11/2023] [Accepted: 01/26/2023] [Indexed: 02/15/2023] Open
Abstract
Post-translational modifications (PTMs) fine-tune various signaling pathways not only by the modification of a single residue, but also by the interplay of different modifications on residue pairs within or between proteins, defined as PTM cross-talk. However, little attention has been given to the PTM dynamics underlying cross-talk residue pairs or to the structural information underlying the protein-protein interaction (PPI) graph, which limits progress in PTM functional research. Here we propose a novel integrated deep neural network, PPICT (Predictor for PTM Inter-protein Cross-Talk), which predicts PTM cross-talk by combining protein sequence-structure-dynamics information with structural information for the PPI graph. We find that cross-talk events preferentially occur among residues with high co-evolution and high potential in allosteric regulation. To make full use of the complex associations between protein evolutionary and biophysical features, and protein pair features, a heterogeneous feature combination net is introduced in the final prediction of PPICT. The comprehensive test results show that the proposed PPICT method significantly improves the prediction performance with an AUC value of 0.869, outperforming the existing state-of-the-art methods. Additionally, the PPICT method can capture the potential PTM cross-talks involved in the functional regulatory PTMs on modifying enzymes and their catalyzed PTM substrates. Therefore, PPICT represents an effective tool for identifying PTM cross-talk between proteins at the proteome level and highlights hints of cross-talk between different signaling pathways introduced by PTMs.
Affiliation(s)
- Fei Zhu
- School of Computer Science and Technology, Soochow University, 215006, Suzhou, China
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| | - Lei Deng
- School of Computer Science and Technology, Soochow University, 215006, Suzhou, China
| | - Yuhao Dai
- School of Computer Science and Technology, Soochow University, 215006, Suzhou, China
| | - Guangyu Zhang
- School of Computer Science and Technology, Soochow University, 215006, Suzhou, China
| | - Fanwang Meng
- Department of Chemistry and Chemical Biology, McMaster University, L8S 4L8, Ontario, Canada
| | - Cheng Luo
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 201203, Shanghai, China
| | - Guang Hu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| | - Zhongjie Liang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 201203, Shanghai, China
- Key Laboratory of Systems Biomedicine (Ministry of Education), Center for Systems Biomedicine, Shanghai Jiao Tong University, 200240, Shanghai, China
|
30
|
Luo H, Shan W, Chen C, Ding P, Luo L. Improving language model of human genome for DNA-protein binding prediction based on task-specific pre-training. Interdiscip Sci 2023; 15:32-43. [PMID: 36136096 DOI: 10.1007/s12539-022-00537-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 08/30/2022] [Accepted: 09/07/2022] [Indexed: 11/27/2022]
Abstract
DNA-protein binding plays a pivotal role in regulating gene expression and evolution, and the computational identification of DNA-protein binding has drawn increasing attention in bioinformatics. Recently, variants of BERT have also been used to capture the semantic information of DNA sequences for predicting DNA-protein binding. In this study, we leverage a task-specific pre-training strategy on BERT using large-scale multi-source DNA-protein binding data and present TFBert. TFBert treats DNA sequences as natural sentences and k-mer nucleotides as words. It can effectively extract upstream and downstream nucleotide context information by pre-training on the 690 unlabeled ChIP-seq datasets. Experiments show that the pre-trained model can achieve promising performance on every single dataset in the 690 ChIP-seq datasets after simple fine-tuning, especially on small datasets. The average AUC is 94.7%, outperforming existing popular methods. In conclusion, this study provides a variant of BERT based on pre-training that achieves state-of-the-art results in predicting DNA-protein binding. We believe that TFBert can provide insights into other biological sequence classification problems.
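The idea of treating a DNA sequence as a sentence whose words are k-mers can be illustrated with a short tokenizer. This is only a sketch: the abstract does not state the value of k or whether the k-mers overlap, so the default k = 3 and the stride of one nucleotide are assumptions, and the function name is hypothetical rather than part of TFBert.

```python
def kmer_tokenize(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a DNA sequence into k-mer 'words' (overlapping when stride < k)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    # The last valid start index is len(seq) - k, so short inputs yield [].
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


# A DNA 'sentence' is the space-joined k-mers, ready for a word-level
# tokenizer such as the one used by BERT-style models.
tokens = kmer_tokenize("acgtac", k=3)
sentence = " ".join(tokens)
print(tokens)    # ['ACG', 'CGT', 'GTA', 'TAC']
print(sentence)  # ACG CGT GTA TAC
```

With a stride of one, each nucleotide appears in up to k consecutive words, so neighbouring tokens share context; a stride equal to k would give non-overlapping words instead.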
Affiliation(s)
- Hanyu Luo
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, People's Republic of China
| | - Wenyu Shan
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, People's Republic of China
| | - Cheng Chen
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, People's Republic of China
| | - Pingjian Ding
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, People's Republic of China
| | - Lingyun Luo
- School of Computer Science, University of South China, Hengyang, Hunan, 421001, People's Republic of China; Hunan Medical Big Data International Science and Technology Innovation Cooperation Base, Hengyang, Hunan, 421001, People's Republic of China.
|
31
|
Gene function and cell surface protein association analysis based on single-cell multiomics data. Comput Biol Med 2023; 157:106733. [PMID: 36924730 DOI: 10.1016/j.compbiomed.2023.106733] [Citation(s) in RCA: 70] [Impact Index Per Article: 70.0] [Received: 01/19/2023] [Revised: 02/08/2023] [Accepted: 02/28/2023] [Indexed: 03/05/2023]
Abstract
Single-cell transcriptomics provides researchers with a powerful tool to resolve the transcriptome heterogeneity of individual cells. However, this method falls short in revealing cellular heterogeneity at the protein level. Previous single-cell multiomics studies have focused on data integration rather than exploiting the full potential of multiomics data. Here we introduce a new analysis framework, gene function and protein association (GFPA), that mines reliable associations between gene function and cell surface protein from single-cell multimodal data. Applying GFPA to human peripheral blood mononuclear cells (PBMCs), we observe an association of epithelial mesenchymal transition (EMT) with the CD99 protein in CD4 T cells, which is consistent with previous findings. Our results show that GFPA is reliable across multiple cell subtypes and PBMC samples. The GFPA python packages and detailed tutorials are freely available at https://github.com/studentiz/GFPA.
|
32
|
Zheng Y, Young ND, Song J, Chang BC, Gasser RB. An informatic workflow for the enhanced annotation of excretory/secretory proteins of Haemonchus contortus. Comput Struct Biotechnol J 2023; 21:2696-2704. [PMID: 37143762 PMCID: PMC10151223 DOI: 10.1016/j.csbj.2023.03.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 03/16/2023] [Accepted: 03/16/2023] [Indexed: 03/19/2023] Open
Abstract
Major advances in genomic and associated technologies have demanded reliable bioinformatic tools and workflows for the annotation of genes and their products via comparative analyses using well-curated reference data sets, accessible in public repositories. However, the accurate in silico annotation of molecules (proteins) encoded in organisms (e.g., multicellular parasites) which are evolutionarily distant from those for which these extensive reference data sets are available, including invertebrate model organisms (e.g., Caenorhabditis elegans - free-living nematode, and Drosophila melanogaster - the vinegar fly) and vertebrate species (e.g., Homo sapiens and Mus musculus), remains a major challenge. Here, we constructed an informatic workflow for the enhanced annotation of biologically-important, excretory/secretory (ES) proteins ("secretome") encoded in the genome of a parasitic roundworm, called Haemonchus contortus (commonly known as the barber's pole worm). We critically evaluated the performance of five distinct methods, refined some of them, and then combined the use of all five methods to comprehensively annotate ES proteins, according to gene ontology, biological pathways and/or metabolic (enzymatic) processes. Then, using optimised parameter settings, we applied this workflow to comprehensively annotate 2591 of all 3353 proteins (77.3%) in the secretome of H. contortus. This result is a substantial improvement (10-25%) over previous annotations using individual, "off-the-shelf" algorithms and default settings, indicating the ready applicability of the present, refined workflow to gene/protein sequence data sets from a wide range of organisms in the Tree-of-Life.
|
33
|
He H, Duo H, Hao Y, Zhang X, Zhou X, Zeng Y, Li Y, Li B. Computational drug repurposing by exploiting large-scale gene expression data: Strategy, methods and applications. Comput Biol Med 2023; 155:106671. [PMID: 36805225 DOI: 10.1016/j.compbiomed.2023.106671] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/25/2022] [Revised: 02/05/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023]
Abstract
De novo drug development is an extremely complex, time-consuming and costly task. Urgent needs for therapies for various diseases have greatly accelerated the search for more effective drug development methods. Fortunately, drug repurposing provides a new and effective perspective on disease treatment. Rapidly growing large-scale transcriptome data paint a detailed picture of gene expression during disease onset and have thus received wide attention in the field of computational drug repurposing. However, how to efficiently mine transcriptome data and identify new indications for old drugs remains a critical challenge. This review discusses the irreplaceable role of transcriptome data in computational drug repurposing and summarizes some representative databases, tools and strategies. More importantly, it proposes a practical guideline by establishing the correspondence between three gene expression data types and five strategies, which should help researchers adopt appropriate strategies to mine large-scale transcriptome data in depth and discover more effective therapies.
Affiliation(s)
- Hao He
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai, 200032, PR China
| | - Hongrui Duo
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Xiaoxi Zhang
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Xinyi Zhou
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Yujie Zeng
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Yinghong Li
- The Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, PR China
| | - Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China.
|
34
|
Yan TC, Yue ZX, Xu HQ, Liu YH, Hong YF, Chen GX, Tao L, Xie T. A systematic review of state-of-the-art strategies for machine learning-based protein function prediction. Comput Biol Med 2023; 154:106446. [PMID: 36680931 DOI: 10.1016/j.compbiomed.2022.106446] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 12/07/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
New drug discovery is inseparable from the discovery of drug targets, and the vast majority of the known targets are proteins. At the same time, proteins are essential structural and functional elements of living cells necessary for the maintenance of all forms of life. Therefore, protein functions have become the focus of many pharmacological and biological studies. Traditional experimental techniques are no longer adequate for rapidly growing annotation of protein sequences, and approaches to protein function prediction using computational methods have emerged and flourished. A significant trend has been to use machine learning to achieve this goal. In this review, approaches to protein function prediction based on the sequence, structure, protein-protein interaction (PPI) networks, and fusion of multi-information sources are discussed. The current status of research on protein function prediction using machine learning is considered, and existing challenges and prominent breakthroughs are discussed to provide ideas and methods for future studies.
Affiliation(s)
- Tian-Ci Yan
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
| | - Zi-Xuan Yue
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
| | - Hong-Quan Xu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
| | - Yu-Hong Liu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
| | - Yan-Feng Hong
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
| | - Gong-Xing Chen
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
| | - Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
| | - Tian Xie
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
|
35
|
Saxena S, Jena B, Mohapatra B, Gupta N, Kalra M, Scartozzi M, Saba L, Suri JS. Fused deep learning paradigm for the prediction of o6-methylguanine-DNA methyltransferase genotype in glioblastoma patients: A neuro-oncological investigation. Comput Biol Med 2023; 153:106492. [PMID: 36621191 DOI: 10.1016/j.compbiomed.2022.106492] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 10/23/2022] [Revised: 11/29/2022] [Accepted: 12/27/2022] [Indexed: 01/06/2023]
Abstract
BACKGROUND The O6-methylguanine-DNA methyltransferase (MGMT) is a deoxyribonucleic acid (DNA) repair enzyme that has been established as an essential clinical brain tumor biomarker for Glioblastoma Multiforme (GBM). Knowing the MGMT methylation status from multi-parametric MRI (mp-MRI) helps neuro-oncologists to analyze GBM and plan its treatment. METHOD Hand-crafted radiomics feature extraction from GBM subregions, namely edema (ED), tumor core (TC), and enhancing tumor (ET), was investigated in a machine learning (ML) framework using support vector machine (SVM), K-Nearest Neighbours (KNN), random forest (RF), LightGBM, and extreme gradient boosting (XGB). For tissue-level analysis of the promoter genes in GBM, we used the deep residual neural network (ResNet-18) with 3D architecture, followed by an EfficientNet-based investigation of the B0 and B1 variants. Lastly, we analyzed the fused deep learning (FDL) framework that combines the ML and DL frameworks. RESULT Structural mp-MRI consisting of T1, T2, FLAIR, and T1GD was used, with 400 and 185 patients in the discovery and replication cohorts, respectively. Using the CV protocol in the ResNet-3D framework, MGMT methylation status prediction from mp-MRI gave an AUC of 0.753 (p < 0.0001) and 0.72 (p < 0.0001) for the discovery and replication cohorts, respectively. We showed that the FDL is ∼7% superior to solo DL and ∼15% superior to solo ML. CONCLUSION The proposed study aims to provide solutions for building an efficient predictive model of MGMT for GBM patients using deep radiomics features obtained from mp-MRI with the end-to-end ResNet-18 3D and FDL imaging signatures.
Affiliation(s)
- Sanjay Saxena: Department of Computer Science & Engineering, International Institute of Information Technology, Bhubaneswar, Odisha, India
- Biswajit Jena: Department of Computer Science & Engineering, Institute of Technical Education and Research, SOA Deemed to be University, Bhubaneswar, India
- Bibhabasu Mohapatra: Department of Computer Science & Engineering, International Institute of Information Technology, Bhubaneswar, Odisha, India
- Neha Gupta: Bharati Vidyapeeth's College of Engineering, Paschim Vihar, New Delhi, India
- Manudeep Kalra: Department of Radiology, Massachusetts General Hospital, Boston, MA, 02114, USA
- Mario Scartozzi: Department of Radiology, A.O.U, di Cagliari-Polo di Monserrato s.s, 09124, Cagliari, Italy
- Luca Saba: Department of Radiology, A.O.U, di Cagliari-Polo di Monserrato s.s, 09124, Cagliari, Italy
- Jasjit S Suri: Stroke Monitoring and Diagnostic Division, AtheroPoint™ LLC, Roseville, CA, USA; Knowledge Engineering Centre, Global Biomedical Technologies, Inc, Roseville, CA, USA.
|
36
|
Luo Y, Wang P, Mou M, Zheng H, Hong J, Tao L, Zhu F. A novel strategy for designing the magic shotguns for distantly related target pairs. Brief Bioinform 2023; 24:6984790. [PMID: 36631399 DOI: 10.1093/bib/bbac621] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/02/2022] [Revised: 11/09/2022] [Accepted: 12/17/2022] [Indexed: 01/13/2023] Open
Abstract
Due to its promising capacity to improve drug efficacy, polypharmacology has emerged as a new theme in drug discovery for complex diseases. In the discovery of novel multi-target drugs (MTDs), in silico strategies have become essential because of their high throughput and low cost. However, current research mostly targets typical, closely related target pairs. Because of the intricate pathogenesis networks of complex diseases, many distantly related targets are found to play a crucial role in synergistic treatment. Therefore, an innovative method to develop drugs that can simultaneously act on distantly related target pairs is of utmost importance. At the same time, reducing the false discovery rate in the design of MTDs remains a daunting technological difficulty. In this research, effective small molecule clustering in the positive dataset, together with a putative negative dataset generation strategy, was adopted in the process of model construction. Through comprehensive assessment on 10 target pairs with hierarchical similarity levels, the proposed strategy was shown to reduce the false discovery rate successfully. Constructed model types with much smaller numbers of inhibitor molecules gained considerable yields and showed better false-hit controllability than before. To further evaluate the generalization ability, an in-depth assessment of high-throughput virtual screening on the ChEMBL database was conducted. As a result, this novel strategy could hierarchically improve the enrichment factors for each target pair (especially for distantly related or unrelated target pairs), corresponding to target pair similarity levels.
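The abstract evaluates virtual screening with enrichment factors and the false discovery rate. The sketch below shows the standard textbook definitions of both quantities, assuming a screening list already sorted from best to worst score; the function names and the toy ranking are hypothetical, and the paper's own evaluation protocol may differ in detail.

```python
def enrichment_factor(ranked_is_active, top_fraction):
    """Enrichment factor: hit rate in the top fraction divided by the overall hit rate.

    ranked_is_active: list of 0/1 flags, ordered from best to worst predicted score.
    top_fraction: fraction of the ranked list that is inspected, in (0, 1].
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    n_top = max(1, int(round(n_total * top_fraction)))
    hits_top = sum(ranked_is_active[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def false_discovery_rate(true_positives, false_positives):
    """FDR: share of predicted hits that are actually inactive."""
    predicted = true_positives + false_positives
    return 0.0 if predicted == 0 else false_positives / predicted


# Toy ranking (hypothetical): 3 actives among 10 compounds, 2 of them in the top 20%.
toy_ranking = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
toy_ef = enrichment_factor(toy_ranking, 0.2)
```

An enrichment factor above 1 means the top of the list contains actives at a higher rate than random selection would.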
Affiliation(s)
- Yongchao Luo: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Panpan Wang: College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
- Minjie Mou: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Hanqi Zheng: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jiajun Hong: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Lin Tao: Key Laboratory of Elemene Class Anti-Cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou 310036, China
- Feng Zhu: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
|
37
|
Yue ZX, Yan TC, Xu HQ, Liu YH, Hong YF, Chen GX, Xie T, Tao L. A systematic review on the state-of-the-art strategies for protein representation. Comput Biol Med 2023; 152:106440. [PMID: 36543002 DOI: 10.1016/j.compbiomed.2022.106440] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 12/08/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022]
Abstract
The study of drug-target protein interaction is a key step in drug research. In recent years, machine learning techniques have become attractive for research, including drug research, due to their automated nature, predictive power, and expected efficiency. Protein representation is a key step in the machine learning-based study of drug-target protein interaction and plays a fundamental role in achieving accurate results. With the progress of machine learning, protein representation methods have gradually attracted attention and have consequently developed rapidly. Therefore, in this review, we systematically classify current protein representation methods, comprehensively review them, and discuss the latest advances of interest. According to their information extraction methods and information sources, these representation methods are generally divided into structure-based and sequence-based representation methods. Each primary class can be further divided into specific subcategories. The particular representation methods covered include both traditional and the latest approaches. This review contains a comprehensive assessment of the various methods, which researchers can use as a reference for their specific protein-related research requirements, including drug research.
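To make the idea of a sequence-based representation concrete, the sketch below shows two of the simplest traditional encodings: a per-residue one-hot matrix and a fixed-length amino acid composition vector. These are generic illustrations chosen here, not methods singled out by the review; the function names and the alphabet ordering are assumptions.

```python
# The 20 standard amino acids in alphabetical one-letter order (ordering is an arbitrary choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a list of 20-dimensional one-hot rows.

    Residues outside the 20-letter alphabet (e.g. 'X') become all-zero rows.
    """
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in _INDEX:
            row[_INDEX[residue]] = 1
        rows.append(row)
    return rows


def composition(sequence):
    """Encode a sequence as the relative frequency of each standard amino acid.

    The result has a fixed length of 20 regardless of sequence length.
    """
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


# Toy peptide (hypothetical): the one-hot matrix has one row per residue.
toy_matrix = one_hot("ACDA")
toy_vector = composition("ACDA")
```

The one-hot form keeps residue order but varies in length with the sequence, while the composition vector discards order in exchange for a fixed size; this trade-off is one reason many representation families exist.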
Affiliation(s)
- Zi-Xuan Yue: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian-Ci Yan: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian Xie: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
- Lin Tao: Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
|
38
|
Niu K, Guo Z, Peng X, Pei S. P-ResUnet: Segmentation of brain tissue with Purified Residual Unet. Comput Biol Med 2022; 151:106294. [PMID: 36435055 DOI: 10.1016/j.compbiomed.2022.106294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 10/14/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
Precise segmentation and quantification of brain tissue in magnetic resonance imaging aids in the diagnosis of neurological diseases such as epilepsy, Alzheimer's, and multiple sclerosis. Recently, UNet-like architectures have been widely used for medical image segmentation, achieving promising performance by using skip connections to fuse low-level and high-level information. However, in the process of integrating the low-level and high-level information, non-object information (noise) is also added, which reduces the accuracy of medical image segmentation. The same problem exists in the residual unit: since the output and input of the residual unit are fused, the non-object information (noise) in the input of the residual unit is carried into the fused result. To address this challenging problem, in this paper we propose a Purified Residual U-net for the segmentation of brain tissue. This model encodes the image to obtain deep semantic information, purifies the information of the low-level features and the residual unit, and finally produces the result through a decoder. We use the Dilated Pyramid Separate Block (DPSB) as the first block to purify the features for each layer in the encoder except the first layer, which expands the receptive field of the convolution kernel with only a few parameters added. In the first layer, we have explored the best performance achieved with DPB. We find that the initial image contains the most non-object information (noise), so exchanging information to the maximum degree there benefits accuracy. We have conducted experiments with the widely used IBSR-18 dataset composed of T1-weighted MRI volumes from 18 subjects. The results show that, compared with some of the cutting-edge methods, our method enhances segmentation performance, with the mean Dice score reaching 91.093% and the mean Hausdorff distance decreasing to 3.2606.
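Two quantities in this abstract have simple closed forms that a short sketch can illustrate: the Dice score used for evaluation, and the way dilation widens a convolution kernel's spatial extent without adding weights. The code below uses the standard definitions; the function names and toy masks are hypothetical, and nothing here reproduces the DPSB block itself, whose internals the abstract does not describe.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    size_sum = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Convention chosen here: two empty masks agree perfectly.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


def dilated_kernel_extent(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1).

    The number of weights stays at kernel_size; only the spacing between taps grows.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Toy masks (hypothetical): one overlapping voxel, sizes 2 and 1.
toy_dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
# A 3-tap kernel with dilation 2 spans 5 positions while keeping 3 weights.
toy_extent = dilated_kernel_extent(3, 2)
```

The second function is the reason a dilated block can enlarge the receptive field "with only a few parameters added", as the abstract puts it.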
Affiliation(s)
- Ke Niu: Beijing Information Science and Technology University, Beijing, China.
- Zhongmin Guo: Beijing Information Science and Technology University, Beijing, China.
- Xueping Peng: Australian Artificial Intelligence Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Australia.
- Su Pei: Beijing Information Science and Technology University, Beijing, China.
|
39
|
Singh D, Roy J. A large-scale benchmark study of tools for the classification of protein-coding and non-coding RNAs. Nucleic Acids Res 2022; 50:12094-12111. [PMID: 36420898 PMCID: PMC9757047 DOI: 10.1093/nar/gkac1092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 10/22/2022] [Accepted: 10/28/2022] [Indexed: 11/27/2022] Open
Abstract
Identification of protein-coding and non-coding transcripts is paramount for understanding their biological roles. Computational approaches have been addressing this task for over a decade; however, generalized and high-performance models are still unreliable. This benchmark study assessed the performance of 24 tools producing >55 models on the datasets covering a wide range of species. We have collected 135 small and large transcriptomic datasets from existing studies for comparison and identified the potential bottlenecks hampering the performance of current tools. The key insights of this study include lack of standardized training sets, reliance on homogeneous training data, gradual changes in annotated data, lack of augmentation with homology searches, the presence of false positives and negatives in datasets and the lower performance of end-to-end deep learning models. We also derived a new dataset, RNAChallenge, from the benchmark considering hard instances that may include potential false alarms. The best and least well performing models under- and overfit the dataset, respectively, thereby serving a dual purpose. For computational approaches, it will be valuable to develop accurate and unbiased models. The identification of false alarms will be of interest for genome annotators, and experimental study of hard RNAs will help to untangle the complexity of the RNA world.
Affiliation(s)
- Dalwinder Singh: To whom correspondence should be addressed. Tel: +91 172 5221206;
- Joy Roy: Correspondence may also be addressed to Joy Roy.
|
40
|
Zhang H, Wang Y, Pan Z, Sun X, Mou M, Zhang B, Li Z, Li H, Zhu F. ncRNAInter: a novel strategy based on graph neural network to discover interactions between lncRNA and miRNA. Brief Bioinform 2022; 23:6747810. [PMID: 36198065 DOI: 10.1093/bib/bbac411] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/07/2022] [Revised: 08/04/2022] [Accepted: 08/23/2022] [Indexed: 12/14/2022] Open
Abstract
In recent years, many studies have illustrated the significant role that non-coding RNA (ncRNA) plays in biological activities, in which lncRNA, miRNA and especially their interactions have been shown to affect many biological processes. Some in silico methods have been proposed and applied to identify novel lncRNA-miRNA interactions (LMIs), but there are still imperfections in their RNA representation and information extraction approaches, which implies that there is still room to improve their performance. Meanwhile, only a few of them are accessible at present, which limits their practical applications. The construction of a new tool for LMI prediction is thus imperative for a better understanding of the relevant biological mechanisms. This study proposed a novel method, ncRNAInter, for LMI prediction. A comprehensive strategy for RNA representation and an optimized graph neural network deep learning algorithm were utilized in this study. ncRNAInter was robust and achieved a 26.7% higher Matthews correlation coefficient than existing reputable methods for human LMI prediction. In addition, ncRNAInter proved its universal applicability in dealing with LMIs from various species and successfully identified novel LMIs associated with various diseases, which further verified its effectiveness and usability. All source code and datasets are freely available at https://github.com/idrblab/ncRNAInter.
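The headline result above is stated in terms of the Matthews correlation coefficient (MCC). As a reminder of how that metric is defined, here is a minimal sketch computing it from confusion-matrix counts; the function name and the example counts are hypothetical and unrelated to the ncRNAInter results.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect).
    Returns 0.0 when any marginal is empty, a common convention for the undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denominator == 0 else numerator / denominator


# Hypothetical counts for a balanced interaction / non-interaction test set.
toy_mcc = matthews_corrcoef(tp=45, tn=40, fp=10, fn=5)
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it a common choice when positive and negative pairs are imbalanced.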
Affiliation(s)
- Hanyu Zhang: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Yunxia Wang: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xiuna Sun: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Minjie Mou: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Bing Zhang: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Zhaorong Li: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Honglin Li: School of Computer Science and Technology, East China Normal University, Shanghai 200062, China; Shanghai Key Laboratory of New Drug Design, East China University of Science and Technology, Shanghai 200237, China
- Feng Zhu: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
41
|
Yang Q, Li B, Wang P, Xie J, Feng Y, Liu Z, Zhu F. LargeMetabo: an out-of-the-box tool for processing and analyzing large-scale metabolomic data. Brief Bioinform 2022; 23:6768054. [PMID: 36274234 DOI: 10.1093/bib/bbac455] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 06/04/2022] [Revised: 09/06/2022] [Accepted: 09/24/2022] [Indexed: 12/14/2022] Open
Abstract
Large-scale metabolomics is a powerful technique that has attracted widespread attention in biomedical studies focused on identifying biomarkers and interpreting the mechanisms of complex diseases. Despite a rapid increase in the number of large-scale metabolomic studies, the analysis of metabolomic data remains a key challenge. Specifically, diverse unwanted variations and batch effects in processing many samples have a substantial impact on identifying true biological markers, and it is a daunting challenge to annotate a plethora of peaks as metabolites in untargeted mass spectrometry-based metabolomics. Therefore, the development of an out-of-the-box tool is urgently needed to realize data integration and to accurately annotate metabolites with enhanced functions. In this study, the LargeMetabo package based on R code was developed for processing and analyzing large-scale metabolomic data. This package is unique because it is capable of (1) integrating multiple analytical experiments to effectively boost the power of statistical analysis; (2) selecting the appropriate biomarker identification method by intelligent assessment for large-scale metabolic data and (3) providing metabolite annotation and enrichment analysis based on an enhanced metabolite database. The LargeMetabo package can facilitate flexibility and reproducibility in large-scale metabolomics. The package is freely available from https://github.com/LargeMetabo/LargeMetabo.
Affiliation(s)
- Qingxia Yang: Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China; College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Bo Li: College of Life Sciences, Chongqing Normal University, Chongqing, Chongqing 401331, China
- Panpan Wang: College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
- Jicheng Xie: Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- Yuhao Feng: Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- Ziqiang Liu: Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- Feng Zhu: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
42
|
Liao J, Chen H, Wei L, Wei L. GSAML-DTA: An interpretable drug-target binding affinity prediction model based on graph neural networks with self-attention mechanism and mutual information. Comput Biol Med 2022; 150:106145. [PMID: 37859276 DOI: 10.1016/j.compbiomed.2022.106145] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/10/2022] [Revised: 08/23/2022] [Accepted: 09/24/2022] [Indexed: 11/03/2022]
Abstract
Identifying drug-target affinity (DTA) has great practical importance in the process of designing efficacious drugs for known diseases. Recently, numerous deep learning-based computational methods have been developed to predict drug-target affinity and have achieved impressive performance. However, most of them construct the molecule (drug or target) encoder without considering the weights of the features of each node (atom or residue). Besides, they generally combine drug and target representations directly, which may retain task-irrelevant information. In this study, we develop GSAML-DTA, an interpretable deep learning framework for DTA prediction. GSAML-DTA integrates a self-attention mechanism and graph neural networks (GNNs) to build representations of drugs and target proteins from structural information. In addition, mutual information is introduced to filter out redundant information and retain relevant information in the combined representations of drugs and targets. Extensive experimental results demonstrate that GSAML-DTA outperforms state-of-the-art methods for DTA prediction on two benchmark datasets. Furthermore, GSAML-DTA has the interpretation ability to analyze binding atoms and residues, which may be conducive to chemical biology studies from data. Overall, GSAML-DTA can serve as a powerful and interpretable tool suitable for DTA modelling.
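The abstract's point about weighting each node (atom or residue) can be illustrated with attention-weighted pooling: each node gets a score, the scores are normalized with a softmax, and the graph-level vector is the weighted sum of node features. This is a deliberately simplified stand-in, not the GSAML-DTA architecture; the scoring vector, function names, and toy features are all assumptions made for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(node_features, scoring_vector):
    """Pool node features into one vector using softmax-normalized attention weights.

    node_features: list of equal-length feature lists, one per node (atom or residue).
    scoring_vector: hypothetical learned vector; a node's score is its dot product with it.
    Returns (weights, pooled_vector); the weights show which nodes dominate the result.
    """
    scores = [sum(f * w for f, w in zip(node, scoring_vector)) for node in node_features]
    weights = softmax(scores)
    dim = len(node_features[0])
    pooled = [sum(w * node[d] for w, node in zip(weights, node_features)) for d in range(dim)]
    return weights, pooled


# Toy graph (hypothetical): three nodes with two features each.
toy_weights, toy_pooled = attention_pool(
    [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]],
    [0.5, 0.5],
)
```

Because the weights are explicit and sum to one, they can be inspected per node, which is the general mechanism behind attention-based claims of interpretability.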
Affiliation(s)
- Jiaqi Liao: School of Software, Shandong University, Jinan, China
- Haoyang Chen: School of Software, Shandong University, Jinan, China
- Lesong Wei: Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan.
- Leyi Wei: School of Software, Shandong University, Jinan, China.
|
43
|
Mondal A, Shrivastava VK. A novel Parametric Flatten-p Mish activation function based deep CNN model for brain tumor classification. Comput Biol Med 2022; 150:106183. [PMID: 37859281 DOI: 10.1016/j.compbiomed.2022.106183] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/30/2021] [Revised: 08/22/2022] [Accepted: 09/14/2022] [Indexed: 11/03/2022]
Abstract
Brain tumors are among the deadliest of all cancers. Influenced by the recent developments of convolutional neural networks (CNNs) in medical imaging, we have developed a CNN-based model called BMRI-Net for brain tumor classification. As the activation function is one of the important modules of a CNN, we have proposed a novel parametric activation function named Parametric Flatten-p Mish (PFpM) to improve performance. PFpM can tackle significant disadvantages of pre-existing activation functions, such as neuron death and the bias shift effect. The parametric approach of PFpM also offers the model some extra flexibility to learn complex patterns more accurately from the data. To validate our proposed methodology, we have used two brain tumor datasets, namely Figshare and Br35H. We have compared the performance of our model with state-of-the-art deep CNN models such as DenseNet201, InceptionV3, MobileNetV2, ResNet50 and VGG19. Further, the performance of PFpM has been compared with various activation functions such as ReLU, Leaky ReLU, GELU, Swish and Mish. We have performed record-wise and subject-wise (patient-level) experiments for the Figshare dataset, whereas only record-wise experiments have been performed for the Br35H dataset because subject-wise information is unavailable. Further, the model has been validated using hold-out and 5-fold cross-validation techniques. On the Figshare dataset, our model has achieved 99.57% overall accuracy with hold-out validation and 98.45% overall accuracy with 5-fold cross-validation for the record-wise data split. For the subject-wise data split, the model has achieved 97.91% overall accuracy with hold-out validation and 97.26% overall accuracy with 5-fold cross-validation. Similarly, for the Br35H dataset, our model has attained 99% overall accuracy with hold-out validation and 98.33% overall accuracy with 5-fold cross-validation using the record-wise data split. Hence, our findings can introduce a secondary procedure in the clinical diagnosis of brain tumors.
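The abstract does not give the formula for PFpM, so it is not reproduced here. As background, the sketch below implements the standard Mish activation that PFpM is named after, mish(x) = x · tanh(softplus(x)), next to ReLU for comparison. The function names are chosen for illustration only.

```python
import math


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Standard Mish activation: x * tanh(softplus(x)).

    Smooth everywhere and slightly negative for negative inputs, so units
    still receive gradient where ReLU would output exactly zero.
    """
    return x * math.tanh(softplus(x))


def relu(x):
    """ReLU for comparison: zero output (and zero gradient) for negative inputs."""
    return max(0.0, x)


# Compare the two activations on a few sample inputs.
samples = [-2.0, -1.0, 0.0, 1.0, 2.0]
comparison = [(x, relu(x), mish(x)) for x in samples]
```

The non-zero response of Mish on negative inputs is the usual argument for why Mish-style functions mitigate the "neuron death" problem that the abstract mentions.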
Affiliation(s)
- Ayan Mondal: School of Electronics Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, India.
- Vimal K Shrivastava: School of Electronics Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, India.
|
44
|
Sun X, Zhang Y, Li H, Zhou Y, Shi S, Chen Z, He X, Zhang H, Li F, Yin J, Mou M, Wang Y, Qiu Y, Zhu F. DRESIS: the first comprehensive landscape of drug resistance information. Nucleic Acids Res 2022; 51:D1263-D1275. [PMID: 36243960 PMCID: PMC9825618 DOI: 10.1093/nar/gkac812] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 08/09/2022] [Revised: 08/22/2022] [Accepted: 10/11/2022] [Indexed: 01/30/2023] Open
Abstract
Widespread drug resistance has become the key issue in global healthcare. Extensive efforts have been made to reveal not only diverse diseases experiencing drug resistance, but also the six distinct types of molecular mechanisms underlying this resistance. A database that describes a comprehensive list of diseases with drug resistance (not just cancers/infections) and all types of resistance mechanisms is now urgently needed. However, no such database has been available to date. In this study, a comprehensive database describing drug resistance information named 'DRESIS' was therefore developed. It was introduced to (i) systematically provide, for the first time, all existing types of molecular mechanisms underlying drug resistance, (ii) extensively cover the widest range of diseases among all existing databases and (iii) explicitly describe the clinically/experimentally verified resistance data for the largest number of drugs. Since drug resistance has become an ever-increasing clinical issue, DRESIS is expected to have great implications for future new drug discovery and clinical treatment optimization. It is now publicly accessible without any login requirement at: https://idrblab.org/dresis/.
Affiliation(s)
- Shuiyang Shi: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Zhen Chen: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Xin He: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China; Zhejiang University–University of Edinburgh Institute, Zhejiang University, Haining 314499, China
- Hanyu Zhang: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Fengcheng Li: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jiayi Yin: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yunzhu Wang: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yunqing Qiu: The First Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Feng Zhu: To whom correspondence should be addressed.
|
45
|
Li F, Yin J, Lu M, Mou M, Li Z, Zeng Z, Tan Y, Wang S, Chu X, Dai H, Hou T, Zeng S, Chen Y, Zhu F. DrugMAP: molecular atlas and pharma-information of all drugs. Nucleic Acids Res 2022; 51:D1288-D1299. [PMID: 36243961 PMCID: PMC9825453 DOI: 10.1093/nar/gkac813] [Citation(s) in RCA: 49] [Impact Index Per Article: 24.5] [Received: 08/01/2022] [Revised: 08/30/2022] [Accepted: 10/12/2022] [Indexed: 02/06/2023] Open
Abstract
The efficacy and safety of drugs are widely known to be determined by their interactions with multiple molecules of pharmacological importance, and it is therefore essential to systematically depict the molecular atlas and pharma-information of studied drugs. However, our understanding of such information is neither comprehensive nor precise, which necessitates the construction of a new database providing a network containing a large number of drugs and their interacting molecules. Here, a new database describing the molecular atlas and pharma-information of drugs (DrugMAP) was therefore constructed. It provides a comprehensive list of interacting molecules for >30 000 drugs/drug candidates, gives the differential expression patterns for >5000 interacting molecules among different disease sites, ADME (absorption, distribution, metabolism and excretion)-relevant organs and physiological tissues, and weaves a comprehensive and precise network containing >200 000 interactions among drugs and molecules. With the great efforts made to clarify the complex mechanism underlying drug pharmacokinetics and pharmacodynamics and rapidly emerging interests in artificial intelligence (AI)-based network analyses, DrugMAP is expected to become an indispensable supplement to existing databases to facilitate drug discovery. It is now fully and freely accessible at: https://idrblab.org/drugmap/.
Affiliation(s)
- Mingkun Lu: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Zhaorong Li: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba–Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Zhenyu Zeng: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba–Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Ying Tan: State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Shanshan Wang: Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Xinyi Chu: Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
- Haibin Dai: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Tingjun Hou: College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Su Zeng: Correspondence may also be addressed to Su Zeng.
- Yuzong Chen: Correspondence may also be addressed to Yuzong Chen.
- Feng Zhu: To whom correspondence should be addressed.
|
46
|
Liu S, Chen L, Zhang Y, Zhou Y, He Y, Chen Z, Qi S, Zhu J, Chen X, Zhang H, Luo Y, Qiu Y, Tao L, Zhu F. M6AREG: m6A-centered regulation of disease development and drug response. Nucleic Acids Res 2022; 51:D1333-D1344. [PMID: 36134713 PMCID: PMC9825441 DOI: 10.1093/nar/gkac801] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/09/2022] [Revised: 08/27/2022] [Accepted: 09/06/2022] [Indexed: 01/30/2023] Open
Abstract
As the most prevalent internal modification in eukaryotic RNAs, N6-methyladenosine (m6A) has been discovered to play an essential role in cellular proliferation, metabolic homeostasis, embryonic development, etc. With the rapid accumulation of research interest in m6A, its crucial roles in the regulation of disease development and drug response are gaining increasing attention. Thus, a database offering such valuable data on m6A-centered regulation is greatly needed; however, no such database is yet available. Herein, a new database named 'M6AREG' is developed to (i) systematically cover, for the first time, data on the effects of m6A-centered regulation on both disease development and drug response, (ii) explicitly describe the molecular mechanism underlying each type of regulation and (iii) fully reference the collected data by cross-linking to existing databases. Since the accumulated data are valuable for researchers in diverse disciplines (such as pathology and pathophysiology, clinical laboratory diagnostics, medicinal biochemistry and drug design), M6AREG is expected to have many implications for the future conduct of m6A-based regulation studies. It is currently accessible by all users at: https://idrblab.org/m6areg/.
Affiliation(s)
- Shuiping Liu
- Correspondence may also be addressed to Shuiping Liu.
| | | | | | | | - Ying He
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Zhen Chen
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China,Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Shasha Qi
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Jinyu Zhu
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Xudong Chen
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Hao Zhang
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Yongchao Luo
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China,Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Yunqing Qiu
- State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, 310000, China
| | - Lin Tao
- Correspondence may also be addressed to Lin Tao.
| | - Feng Zhu
- To whom correspondence should be addressed. Tel: +86 189 8946 6518; Fax: +86 571 8820 8444;
| |
|
47
|
Iqbal N, Kumar P. Integrated COVID-19 Predictor: Differential expression analysis to reveal potential biomarkers and prediction of coronavirus using RNA-Seq profile data. Comput Biol Med 2022; 147:105684. [PMID: 35687925 PMCID: PMC9162937 DOI: 10.1016/j.compbiomed.2022.105684] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 02/05/2022] [Revised: 05/27/2022] [Accepted: 05/30/2022] [Indexed: 02/01/2023]
Abstract
Background The world has been battling the ongoing COVID-19 pandemic, caused by the SARS-CoV-2 virus, for the last two years. Viral disease prediction has long been a matter of interest in virology and the study of disease transmission. Objective In this study, we aimed to implement genome association studies using RNA-Seq data of COVID-19, reveal highly expressed gene biomarkers, and predict COVID-19 with a machine learning model to help combat this pandemic. Method We collected RNA-Seq gene count data for both healthy (Control) and non-healthy (Treated) COVID-19 cases. In this experiment, a sequence of bioinformatics strategies and statistical techniques, such as fold-change and adjusted p-value, was applied to identify differentially expressed genes (DEGs). We filtered biomarker sets of high DEGs, moderate DEGs, and low DEGs using the DESeq2, Limma Trend, and Limma Voom methods based on intersection and union operations, and applied machine learning techniques to predict COVID-19. Result Through experimental analysis, 67 potential biomarkers were extracted, comprising 49 up-regulated and 18 down-regulated genes, using statistical techniques and a set-theory consensus strategy. We trained the machine learning models on 12 different biomarker sets and found that the SVM model performed better than the other classifiers, with 99.07% classification accuracy for moderate DEGs. Conclusion Our study revealed that the differentially expressed genes of the moderate DEGs biomarker set, identified with |log2FC| ≥ 2 and adjusted p-value < 0.05, work well as input features for a machine learning model that uses a kernel-based SVM technique to predict COVID-19.
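The threshold filter and set-theory consensus described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names, the toy values, and the helper `filter_degs` are hypothetical, and the per-method results are assumed to be already available as (log2 fold change, adjusted p-value) pairs rather than computed by DESeq2 or limma.

```python
from typing import Dict, Set, Tuple

# Each method's output is assumed to map gene -> (log2 fold change, adjusted p-value).
Results = Dict[str, Tuple[float, float]]


def filter_degs(results: Results, lfc_cutoff: float = 2.0,
                padj_cutoff: float = 0.05) -> Set[str]:
    """Keep genes with |log2FC| >= lfc_cutoff and adjusted p-value < padj_cutoff."""
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if abs(log2fc) >= lfc_cutoff and padj < padj_cutoff
    }


# Hypothetical toy results standing in for three differential-expression methods.
deseq2 = {"GENE_A": (2.5, 0.001), "GENE_B": (-2.1, 0.01), "GENE_C": (1.0, 0.001)}
limma_trend = {"GENE_A": (2.2, 0.02), "GENE_B": (-1.8, 0.01), "GENE_D": (3.0, 0.2)}
limma_voom = {"GENE_A": (2.4, 0.004), "GENE_B": (-2.3, 0.03), "GENE_C": (2.2, 0.04)}

deg_sets = [filter_degs(r) for r in (deseq2, limma_trend, limma_voom)]

# Intersection: genes called by every method (a strict consensus).
consensus = set.intersection(*deg_sets)
# Union: genes called by at least one method (a permissive set).
any_method = set.union(*deg_sets)

print(sorted(consensus))
print(sorted(any_method))
```

The intersection gives a smaller, stricter biomarker set and the union a larger, more permissive one; which of these corresponds to the "high", "moderate", or "low" DEG sets in the paper is not stated in the abstract.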
|
48
|
Huang Z, Tang S, Chen Z, Wang G, Shen H, Zhou Y, Wang H, Fan W, Liang D, Hu Y, Hu Z. TG-Net: Combining transformer and GAN for nasopharyngeal carcinoma tumor segmentation based on total-body uEXPLORER PET/CT scanner. Comput Biol Med 2022; 148:105869. [PMID: 35905660 DOI: 10.1016/j.compbiomed.2022.105869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Revised: 06/20/2022] [Accepted: 07/09/2022] [Indexed: 11/17/2022]
Abstract
Nasopharyngeal carcinoma (NPC) is a malignant tumor, and the main treatment is radiotherapy. Accurate delineation of the target tumor is essential for radiotherapy of NPC. NPC tumors are small in size and vary widely in shape and structure, making it a time-consuming and laborious task for even experienced radiologists to manually outline tumors. However, the segmentation performance of current deep learning models is not satisfactory, mainly manifested by poor segmentation boundaries. To solve this problem, this paper proposes a segmentation method for nasopharyngeal carcinoma based on dynamic PET-CT image data, whose input data include CT, PET, and parametric images (Ki images). This method uses a generative adversarial network with a modified UNet integrated with a Transformer as the generator (TG-Net) to achieve automatic segmentation of NPC on combined CT-PET-Ki images. In the coding stage, TG-Net uses moving windows to replace traditional pooling operations to obtain patches of different sizes, which can reduce information loss in the coding process. Moreover, the introduction of Transformer can make the network learn more representative features and improve the discriminant ability of the model, especially for tumor boundaries. Finally, the results of fivefold cross validation with an average Dice similarity coefficient score of 0.9135 show that our method has good segmentation performance. Comparative experiments also show that our network structure is superior to the most advanced methods in the segmentation of NPC. In addition, this work is the first to use Ki images to assist tumor segmentation. We also demonstrated the usefulness of adding Ki images to aid in tumor segmentation.
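For readers unfamiliar with the Dice similarity coefficient reported above, the following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name `dice_coefficient` and the toy masks are illustrative assumptions; the paper's own evaluation code is not shown in the abstract.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity between two flattened binary masks (1 = tumor, 0 = background)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as tumor in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not a rule).
        return 1.0
    return 2.0 * overlap / total


# Hypothetical toy masks: 3 predicted voxels, 3 true voxels, 2 shared.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 1, 1, 0, 0, 1]
score = dice_coefficient(pred_mask, true_mask)
print(round(score, 4))
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they share no tumor voxels.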
Affiliation(s)
- Zhengyong Huang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Si Tang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China; Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
| | - Zixiang Chen
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Guoshuai Wang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Hao Shen
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Yun Zhou
- Central Research Institute, United Imaging Healthcare Group, Shanghai, 201807, China
| | - Haining Wang
- Central Research Institute, United Imaging Healthcare Group, Shanghai, 201807, China
| | - Wei Fan
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China; Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
| | - Dong Liang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Yingying Hu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China; Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China.
| | - Zhanli Hu
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| |
|
49
|
Liu S, Wang A, Deng X, Yang C. MGNN: A multiscale grouped convolutional neural network for efficient atrial fibrillation detection. Comput Biol Med 2022; 148:105863. [PMID: 35849950 DOI: 10.1016/j.compbiomed.2022.105863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Revised: 06/29/2022] [Accepted: 07/03/2022] [Indexed: 11/03/2022]
Abstract
The reliable detection of atrial fibrillation (AF) is of great significance for monitoring disease progression and developing tailored care paths. In this work, we proposed a novel and robust method based on deep learning for the accurate detection of AF. Using RR interval sequences, a multiscale grouped convolutional neural network (MGNN) combined with self-attention was designed for automatic feature extraction, and AF and non-AF classification. An average accuracy of 97.07% was obtained in the 5-fold cross-validation. The generalization ability of the proposed MGNN was further independently tested on four other unseen datasets, and the accuracy was 92.23%, 96.86%, 94.23% and 95.91%. Moreover, comparison of the network structures indicated that the MGNN had not only better detection performance but also lower computational complexity. In conclusion, the proposed model is shown to be an efficient AF detector that has great potential for use in clinical auxiliary diagnosis and long-term home monitoring based on wearable devices.
Affiliation(s)
- Sen Liu
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, PR China
| | - Aiguo Wang
- Department of Cardiology, Xinghua City People's Hospital, Jiangsu, 225700, PR China
| | - Xintao Deng
- Department of Cardiology, Xinghua City People's Hospital, Jiangsu, 225700, PR China.
| | - Cuiwei Yang
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, PR China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, 200093, PR China.
| |
|
50
|
Zhang S, Sun X, Mou M, Amahong K, Sun H, Zhang W, Shi S, Li Z, Gao J, Zhu F. REGLIV: Molecular regulation data of diverse living systems facilitating current multiomics research. Comput Biol Med 2022; 148:105825. [PMID: 35872412 DOI: 10.1016/j.compbiomed.2022.105825] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/18/2022] [Revised: 06/29/2022] [Accepted: 07/03/2022] [Indexed: 12/24/2022]
Abstract
Multiomics is a powerful technique in molecular biology that facilitates the identification of new associations among different molecules (genes, proteins and metabolites). It has attracted tremendous research interest from scientists worldwide and has led to explosive growth in the number of published studies. Most of these studies are based on the regulation data provided in available databases. Therefore, it is essential to have molecular regulation data that are strictly validated in the living systems of various cell lines and in vivo models. However, no database has yet been developed to provide comprehensive molecular regulation information validated by living systems. Herein, a new database, Molecular Regulation Data of Living System Facilitating Multiomics Study (REGLIV), is introduced to describe various types of molecular regulation tested in living systems. (1) A total of 2996 regulations describe the changes in 1109 metabolites triggered by alterations in 284 genes or proteins, and (2) 1179 regulations describe the variations in 926 proteins induced by 125 endogenous metabolites. Overall, REGLIV is unique in (a) providing molecular regulations with a clearly defined regulatory direction rather than simple correlation, (b) focusing on molecular regulations that are validated in a living system, not simply in an in vitro test, and (c) describing the disease-, tissue-, and species-specific properties underlying each regulation. Therefore, REGLIV has important implications for the future practice of not only multiomics, but also other fields relevant to molecular regulation. REGLIV is freely accessible at: https://idrblab.org/regliv/.
Affiliation(s)
- Song Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Xiuna Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Minjie Mou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Kuerbannisha Amahong
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Huaicheng Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Wei Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Shuiyang Shi
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Zhaorong Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
| | - Jianqing Gao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China; Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China.
| |
|