1
Pratiwi NKC, Tayara H, Chong KT. An Ensemble Classifiers for Improved Prediction of Native-Non-Native Protein-Protein Interaction. Int J Mol Sci 2024; 25:5957. [PMID: 38892144 PMCID: PMC11172808 DOI: 10.3390/ijms25115957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 05/27/2024] [Accepted: 05/27/2024] [Indexed: 06/21/2024] Open
Abstract
In this study, we present an approach to improve the prediction of protein-protein interactions (PPIs) using an ensemble classifier, focusing specifically on distinguishing between native and non-native interactions. Leveraging the strengths of several base models, including random forest, gradient boosting, extreme gradient boosting, and light gradient boosting, our ensemble classifier integrates their predictions using a logistic regression meta-classifier. Our model was evaluated using a comprehensive dataset generated from molecular dynamics simulations. While the gains in AUC and other metrics might seem modest, they contribute to a model that is more robust, consistent, and adaptable. To assess the effectiveness of various approaches, we compared the performance of logistic regression to four baseline models. Our results indicate that logistic regression consistently underperforms across all evaluated metrics, which suggests that it may not be well suited to capturing the complex relationships within this dataset. Tree-based models, on the other hand, appear to be more effective for problems involving molecular dynamics simulations. Extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) are optimized for performance and speed, handle datasets effectively, and incorporate regularization to avoid overfitting. Our findings indicate that the ensemble method enhances the predictive capability for PPIs, offering a promising tool for computational biology and drug discovery by accurately identifying potential interaction sites and facilitating the understanding of complex protein functions within biological systems.
Affiliation(s)
- Nor Kumalasari Caecar Pratiwi
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
  - Department of Electrical Engineering, Telkom University, Bandung 40257, West Java, Indonesia
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
  - Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju 54896, Republic of Korea
2
Chen X, Liu J, Park N, Cheng J. A Survey of Deep Learning Methods for Estimating the Accuracy of Protein Quaternary Structure Models. Biomolecules 2024; 14:574. [PMID: 38785981 PMCID: PMC11117562 DOI: 10.3390/biom14050574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 04/07/2024] [Accepted: 05/09/2024] [Indexed: 05/25/2024] Open
Abstract
The quality prediction of quaternary structure models of a protein complex, in the absence of its true structure, is known as the Estimation of Model Accuracy (EMA). EMA is useful for ranking predicted protein complex structures and using them appropriately in biomedical research, such as protein-protein interaction studies, protein design, and drug discovery. With the advent of more accurate protein complex (multimer) prediction tools, such as AlphaFold2-Multimer and ESMFold, the estimation of the accuracy of protein complex structures has attracted increasing attention. Many deep learning methods have been developed to tackle this problem; however, there is a noticeable absence of a comprehensive overview of these methods to facilitate future development. Addressing this gap, we present a review of deep learning EMA methods for protein complex structures developed in the past several years, analyzing their methodologies, data and feature construction. We also provide a prospective summary of some potential new developments for further improving the accuracy of the EMA methods.
Affiliation(s)
- Xiao Chen
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Jian Liu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
  - NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, USA
- Nolan Park
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
  - NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, USA
3
Cui C, Huo Q, Xiong X, Li K, Fishel ML, Li B, Yokota H. Anticancer Peptides Derived from Aldolase A and Induced Tumor-Suppressing Cells Inhibit Pancreatic Ductal Adenocarcinoma Cells. Pharmaceutics 2023; 15:2447. [PMID: 37896207 PMCID: PMC10610494 DOI: 10.3390/pharmaceutics15102447] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2023] [Revised: 09/29/2023] [Accepted: 10/07/2023] [Indexed: 10/29/2023] Open
Abstract
PDAC (pancreatic ductal adenocarcinoma) is a highly aggressive malignant tumor. We have previously developed induced tumor-suppressing cells (iTSCs) that secrete a group of tumor-suppressing proteins. Here, we examined a unique procedure to identify anticancer peptides (ACPs), using trypsin-digested iTSCs-derived protein fragments. Among the 10 ACP candidates, P04 (IGEHTPSALAIMENANVLAR) presented the most efficient anti-PDAC activities. P04 was derived from aldolase A (ALDOA), a glycolytic enzyme. Extracellular ALDOA, as well as P04, was predicted to interact with epidermal growth factor receptor (EGFR), and P04 downregulated oncoproteins such as Snail and Src. Importantly, P04 has no inhibitory effect on mesenchymal stem cells (MSCs). We also generated iTSCs by overexpressing ALDOA in MSCs and peripheral blood mononuclear cells (PBMCs). iTSC-derived conditioned medium (CM) inhibited the progression of PDAC cells as well as PDAC tissue fragments. The inhibitory effect of P04 was additive to that of CM and chemotherapeutic drugs such as 5-Flu and gemcitabine. Notably, applying mechanical vibration to PBMCs elevated ALDOA and converted PBMCs into iTSCs. Collectively, this study presented a unique procedure for selecting anticancer P04 from ALDOA in an iTSCs-derived proteome for the treatment of PDAC.
Affiliation(s)
- Changpeng Cui
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Qingji Huo
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Xue Xiong
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Kexin Li
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Melissa L. Fishel
  - Department of Pediatrics, Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Department of Pharmacology and Toxicology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Indiana University Simon Comprehensive Cancer Center, Indianapolis, IN 46202, USA
- Baiyan Li
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, China
- Hiroki Yokota
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
  - Indiana University Simon Comprehensive Cancer Center, Indianapolis, IN 46202, USA
  - Department of Pediatrics, Indiana Center for Musculoskeletal Health, Indiana University School of Medicine, Indianapolis, IN 46202, USA
4
Cui CP, Huo QJ, Xiong X, Li KX, Ma P, Qiang GF, Pandya PH, Saadatzadeh MR, Bijangi Vishehsaraei K, Kacena MA, Aryal UK, Pollok KE, Li BY, Yokota H. Anticancer peptides from induced tumor-suppressing cells for inhibiting osteosarcoma cells. Am J Cancer Res 2023; 13:4057-4072. [PMID: 37818062 PMCID: PMC10560922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 08/09/2023] [Indexed: 10/12/2023] Open
Abstract
Osteosarcoma (OS) is the most frequent primary bone cancer and mainly affects children and young adults. While the current surgical treatment combined with chemotherapy is effective for early-stage OS, advanced OS preferentially metastasizes to the lung and is difficult to treat. Here, we examined the efficacy of ten anti-OS peptide candidates from a trypsin-digested conditioned medium that was derived from the secretome of induced tumor-suppressing cells (iTSCs). Using OS cell lines, the antitumor capabilities of the peptide candidates were evaluated by assaying the alterations in metabolic activities, proliferation, motility, and invasion of OS cells. Among the ten candidates, peptide P05 (ADDGRPFPQVIK), a fragment of aldolase A (ALDOA), presented the most potent OS-suppressing capabilities. Its efficacy was additive with standard-of-care chemotherapeutic agents such as cisplatin and doxorubicin, and it downregulated oncoproteins such as epidermal growth factor receptor (EGFR), Snail, and Src in OS cells. Interestingly, P05 did not present inhibitory effects on non-OS skeletal cells such as mesenchymal stem cells and osteoblast cells. Collectively, this study demonstrated that iTSC-derived secretomes may provide a source for identifying anticancer peptides, and P05 may warrant further evaluation for the treatment of OS.
Affiliation(s)
- Chang-Peng Cui
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, Heilongjiang, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Qing-Ji Huo
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, Heilongjiang, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Xue Xiong
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, Heilongjiang, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Ke-Xin Li
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, Heilongjiang, China
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Peng Ma
  - State Key Laboratory of Bioactive Substance and Function for Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College and Beijing Key Laboratory of Drug Target and Screening Research, Beijing 100050, China
- Gui-Fen Qiang
  - State Key Laboratory of Bioactive Substance and Function for Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College and Beijing Key Laboratory of Drug Target and Screening Research, Beijing 100050, China
- Pankita H Pandya
  - Indiana University Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Mohammad R Saadatzadeh
  - Indiana University Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Melissa A Kacena
  - Indiana University Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Department of Orthopaedic Surgery, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Indiana Center for Musculoskeletal Health, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Uma K Aryal
  - Department of Basic Medical Sciences, Interdisciplinary Biomedical Sciences Program, Purdue University, West Lafayette, IN 47907, USA
- Karen E Pollok
  - Indiana University Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Bai-Yan Li
  - Department of Pharmacology, School of Pharmacy, Harbin Medical University, Harbin 150081, Heilongjiang, China
- Hiroki Yokota
  - Department of Biomedical Engineering, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
  - Indiana University Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
  - Indiana Center for Musculoskeletal Health, Indiana University School of Medicine, Indianapolis, IN 46202, USA
5
Schweke H, Xu Q, Tauriello G, Pantolini L, Schwede T, Cazals F, Lhéritier A, Fernandez-Recio J, Rodríguez-Lumbreras LÁ, Schueler-Furman O, Varga JK, Jiménez-García B, Réau MF, Bonvin A, Savojardo C, Martelli PL, Casadio R, Tubiana J, Wolfson H, Oliva R, Barradas-Bautista D, Ricciardelli T, Cavallo L, Venclovas Č, Olechnovič K, Guerois R, Andreani J, Martin J, Wang X, Kihara D, Marchand A, Correia B, Zou X, Dey S, Dunbrack R, Levy E, Wodak S. Discriminating physiological from non-physiological interfaces in structures of protein complexes: A community-wide study. Proteomics 2023; 23:e2200323. [PMID: 37365936 PMCID: PMC10937251 DOI: 10.1002/pmic.202200323] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/04/2023] [Revised: 05/11/2023] [Accepted: 05/11/2023] [Indexed: 06/28/2023]
Abstract
Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.
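The idea of combining several interface scoring functions into a consensus and judging it by the area under the ROC curve can be illustrated with a small sketch. The abstract does not say how the consensus was computed, so the rank-normalize-then-average scheme below is an assumption, and the three scoring functions and their values are simulated, hypothetical data.

```python
import random


def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank so scores on different scales are comparable.

    Ties are broken by input order, which is adequate for continuous scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank / len(scores)
    return ranks


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)

# Balanced hypothetical benchmark: 1 = physiological, 0 = non-physiological.
labels = [1] * 100 + [0] * 100

# Three simulated scoring functions with the same weak signal but different scales.
scales = [1.0, 10.0, 100.0]
score_sets = [
    [scale * (0.8 * lab + rng.gauss(0, 1)) for lab in labels] for scale in scales
]

# Consensus (assumed scheme): average of the rank-normalized scores per complex.
normalized = [rank_normalize(s) for s in score_sets]
consensus = [sum(col) / len(col) for col in zip(*normalized)]

individual_aucs = [roc_auc(labels, s) for s in score_sets]
consensus_auc = roc_auc(labels, consensus)
print("individual AUCs:", [round(a, 3) for a in individual_aucs])
print("consensus AUC:", round(consensus_auc, 3))
```

Rank normalization is one simple way to stop a scoring function with a large numeric range from dominating the average; other schemes, such as z-scores or a trained classifier like the Random Forest mentioned above, are equally plausible.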
Affiliation(s)
- Julia K. Varga
  - Hebrew University of Jerusalem Institute for Medical Research Israel-Canada
- Jérôme Tubiana
  - Tel Aviv University Blavatnik School of Computer Science
- Haim Wolfson
  - Tel Aviv University Blavatnik School of Computer Science
- Xiaoqin Zou
  - Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri
6
Qian W, Zhou J, Shou S. Exploration of m6A methylation regulators as epigenetic targets for immunotherapy in advanced sepsis. BMC Bioinformatics 2023; 24:257. [PMID: 37330481 DOI: 10.1186/s12859-023-05379-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/15/2023] [Accepted: 06/06/2023] [Indexed: 06/19/2023] Open
Abstract
BACKGROUND: This study explores the relationship between m6A methylation modification and peripheral immune cells in patients with advanced sepsis, and seeks potential epigenetic therapeutic targets by analyzing the differential expression patterns of m6A-related genes in healthy subjects and patients with advanced sepsis. METHODS: A single-cell expression dataset of peripheral immune cells, containing blood samples from 4 patients with advanced sepsis and 5 healthy subjects, was obtained from the Gene Expression Omnibus database (GSE175453). Differential expression analysis and cluster analysis were performed on 21 m6A-related genes. The characteristic gene was identified using a random forest algorithm, and the correlation between the characteristic gene METTL16 and 23 immune cell types in patients with advanced sepsis was evaluated using single-sample gene set enrichment analysis. RESULTS: IGFBP1, IGFBP2, IGF2BP1, and WTAP were highly expressed in patients with advanced sepsis and in m6A cluster B. IGFBP1, IGFBP2, and IGF2BP1 were positively correlated with Th17 cells. The characteristic gene METTL16 exhibited a significant positive correlation with the proportion of various immune cells. CONCLUSION: IGFBP1, IGFBP2, IGF2BP1, WTAP, and METTL16 may accelerate the development of advanced sepsis by regulating m6A methylation modification and promoting immune cell infiltration. The discovery of these characteristic genes related to advanced sepsis provides potential therapeutic targets for the diagnosis and treatment of sepsis.
Affiliation(s)
- Weiwei Qian
  - Tianjin Medical University, Tianjin, 300203, China
  - Department of Emergency, Shangjin Nanfu Hospital, West China Hospital, Sichuan University, Chengdu, 610044, Sichuan, China
- Jian Zhou
  - Department of Immunology, International Cancer Center, Shenzhen University Health Science Center, Shenzhen, 518060, China
- Songtao Shou
  - Department of Emergency, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin, 300052, China
7
Barradas-Bautista D, Almajed A, Oliva R, Kalnis P, Cavallo L. Improving classification of correct and incorrect protein-protein docking models by augmenting the training set. Bioinformatics Advances 2023; 3:vbad012. [PMID: 36789292 PMCID: PMC9923443 DOI: 10.1093/bioadv/vbad012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/04/2023]
Abstract
Motivation: Protein-protein interactions drive many relevant biological events, such as infection, replication and recognition. To control or engineer such events, we need access to the molecular details of the interaction provided by experimental 3D structures. However, such experiments take time and are expensive; moreover, the current technology cannot keep up with the high discovery rate of new interactions. Computational modeling, such as protein-protein docking, can help to fill this gap by generating docking poses. Protein-protein docking generally consists of two parts, sampling and scoring. The sampling is an exhaustive search of the three-dimensional space. The caveat of the sampling is that it generates a large number of incorrect poses, producing a highly unbalanced dataset. This limits the utility of the data for training machine learning classifiers. Results: Using weak supervision, we developed a data augmentation method that we named hAIkal. Using hAIkal, we increased the labeled training data available to train several algorithms. We trained and obtained different classifiers; the best classifier has 81% accuracy and a Matthews correlation coefficient of 0.51 on the test set, surpassing the state-of-the-art scoring functions. Availability and implementation: Docking models from Benchmark 5 are available at https://doi.org/10.5281/zenodo.4012018. Processed tabular data are available at https://repository.kaust.edu.sa/handle/10754/666961. A Google Colab notebook is available at https://colab.research.google.com/drive/1vbVrJcQSf6_C3jOAmZzgQbTpuJ5zC1RP?usp=sharing. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
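The abstract reports both accuracy and the Matthews correlation coefficient (MCC), and the distinction matters on a highly unbalanced set of docking poses. The sketch below uses made-up counts (1000 poses, 50 of them correct), not the paper's data, to show why accuracy alone can mislead in that setting.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = correct pose."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical unbalanced set: 50 correct poses among 1000.
y_true = [1] * 50 + [0] * 950

# A trivial classifier that rejects every pose.
always_negative = [0] * 1000

# A classifier that recovers 30 of the 50 correct poses with 40 false positives.
selective = [1] * 30 + [0] * 20 + [1] * 40 + [0] * 910

for name, y_pred in [("always negative", always_negative), ("selective", selective)]:
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f}, MCC={mcc(y_true, y_pred):.3f}")
```

In this toy case the trivial classifier has the higher accuracy yet an MCC of zero, while the classifier that actually finds correct poses scores lower on accuracy and clearly higher on MCC.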
Affiliation(s)
- Ali Almajed
  - Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Romina Oliva
  - Department of Sciences and Technologies, University of Naples “Parthenope”, I-80143 Naples, Italy
- Panos Kalnis
  - Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Luigi Cavallo
  - Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia