1
Zhang Y, Li R, Zou G, Guo Y, Wu R, Zhou Y, Chen H, Zhou R, Lavigne R, Bergen PJ, Li J, Li J. Discovery of Antimicrobial Lysins from the "Dark Matter" of Uncharacterized Phages Using Artificial Intelligence. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2404049. [PMID: 38899839 PMCID: PMC11348152 DOI: 10.1002/advs.202404049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 05/29/2024] [Indexed: 06/21/2024]
Abstract
The rapid rise of antibiotic resistance and slow discovery of new antibiotics have threatened global health. While novel phage lysins have emerged as potential antibacterial agents, experimental screening methods for novel lysins pose significant challenges due to the enormous workload. Here, the first unified software package, namely DeepLysin, is developed to employ artificial intelligence for mining the vast genome reservoirs ("dark matter") for novel antibacterial phage lysins. Putative lysins are computationally screened from uncharacterized Staphylococcus aureus phages and 17 novel lysins are randomly selected for experimental validation. Seven candidates exhibit excellent in vitro antibacterial activity, with LLysSA9 exceeding that of the best-in-class alternative. The efficacy of LLysSA9 is further demonstrated in mouse bloodstream and wound infection models. Therefore, this study demonstrates the potential of integrating computational and experimental approaches to expedite the discovery of new antibacterial proteins for combating increasing antimicrobial resistance.
Affiliation(s)
- Yue Zhang
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Runze Li
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Geng Zou
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yating Guo
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070, China
- Renwei Wu
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070, China
- Yang Zhou
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
- Huanchun Chen
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070, China
- Rui Zhou
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070, China
- Rob Lavigne
  - Department of Biosystems, Laboratory of Gene Technology, KU Leuven, Leuven 3001, Belgium
- Phillip J. Bergen
  - Monash Biomedicine Discovery Institute, Department of Microbiology, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne 3800, Australia
- Jian Li
  - Monash Biomedicine Discovery Institute, Department of Microbiology, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne 3800, Australia
- Jinquan Li
  - National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, Wuhan 430070, China
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518000, China

2
Liao YH, Chen SZ, Bin YN, Zhao JP, Feng XL, Zheng CH. UsIL-6: An unbalanced learning strategy for identifying IL-6 inducing peptides by undersampling technique. Computer Methods and Programs in Biomedicine 2024; 250:108176. [PMID: 38677081 DOI: 10.1016/j.cmpb.2024.108176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 03/26/2024] [Accepted: 04/11/2024] [Indexed: 04/29/2024]
Abstract
BACKGROUND AND OBJECTIVE Interleukin-6 (IL-6) is a critical factor for early warning, monitoring, and prognosis in the inflammatory storm of COVID-19 cases. IL-6 inducing peptides, which can induce production of the cytokine IL-6, are very important for the development of diagnosis and immunotherapy. Although existing methods have had some success in predicting IL-6 inducing peptides, there is still room to improve their performance in practical applications. METHODS In this study, we proposed UsIL-6, a high-performance bioinformatics tool for identifying IL-6 inducing peptides. First, we extracted five groups of physicochemical properties and sequence structural information from IL-6 inducing peptide sequences to obtain a 636-dimensional feature vector, and processed the data with the NearMiss3 undersampling method and the StandardScaler normalization method. Then, a 40-dimensional optimal feature vector was obtained with the Boruta feature selection method. Finally, we combined this feature vector with an extremely randomized trees classifier to build the final model, UsIL-6. RESULTS The AUC value of UsIL-6 on the independent test dataset was 0.87, and the BACC value was 0.808, which indicated that UsIL-6 performed better than the existing methods in IL-6 inducing peptide recognition. CONCLUSIONS The performance comparison on the independent test dataset confirmed that UsIL-6 achieved the highest performance, best robustness, and best generalization ability among the compared methods. We hope that UsIL-6 will become a valuable method to identify, annotate and characterize new IL-6 inducing peptides.
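The abstract reports a balanced accuracy (BACC) alongside AUC, which matters because the data are imbalanced. As a minimal sketch of how BACC is conventionally computed (the mean of sensitivity and specificity), the Python below uses hypothetical toy labels; the function name and data are illustrative assumptions and are not taken from the UsIL-6 code.

```python
def balanced_accuracy(y_true, y_pred):
    """Return the mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy data: 2 positives, 8 negatives.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

plain_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(plain_accuracy)                         # 0.8
print(balanced_accuracy(toy_true, toy_pred))  # 0.6875
```

On this toy set plain accuracy looks good because negatives dominate, while BACC exposes that half of the positives were missed.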
Affiliation(s)
- Yan-Hong Liao
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Shou-Zhi Chen
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Yan-Nan Bin
  - School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
- Jian-Ping Zhao
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Xin-Long Feng
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Chun-Hou Zheng
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
  - School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China

3
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024]
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- Fuhao Zhang
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
  - College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
- Chaojin Wu
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China

4
Ding H, Li X, Han P, Tian X, Jing F, Wang S, Song T, Fu H, Kang N. MEG-PPIS: a fast protein-protein interaction site prediction method based on multi-scale graph information and equivariant graph neural network. Bioinformatics 2024; 40:btae269. [PMID: 38640481 PMCID: PMC11252844 DOI: 10.1093/bioinformatics/btae269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 03/19/2024] [Accepted: 04/17/2024] [Indexed: 04/21/2024]
Abstract
MOTIVATION Protein-protein interaction sites (PPIS) are crucial for deciphering protein action mechanisms and related medical research, which is the key issue in protein action research. Recent studies have shown that graph neural networks have achieved outstanding performance in predicting PPIS. However, these studies often neglect the modeling of information at different scales in the graph and the symmetry of protein molecules within three-dimensional space. RESULTS In response to this gap, this article proposes the MEG-PPIS approach, a PPIS prediction method based on multi-scale graph information and E(n) equivariant graph neural network (EGNN). There are two channels in MEG-PPIS: the original graph and the subgraph obtained by graph pooling. The model can iteratively update the features of the original graph and subgraph through the weight-sharing EGNN. Subsequently, the max-pooling operation aggregates the updated features of the original graph and subgraph. Ultimately, the model feeds node features into the prediction layer to obtain prediction results. Comparative assessments against other methods on benchmark datasets reveal that MEG-PPIS achieves optimal performance across all evaluation metrics and gets the fastest runtime. Furthermore, specific case studies demonstrate that our method can predict more true positive and true negative sites than the current best method, proving that our model achieves better performance in the PPIS prediction task. AVAILABILITY AND IMPLEMENTATION The data and code are available at https://github.com/dhz234/MEG-PPIS.git.
Affiliation(s)
- Hongzhen Ding
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Xue Li
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Peifu Han
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Xu Tian
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Fengrui Jing
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Shuang Wang
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Tao Song
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Hanjiao Fu
  - School of Humanities and Law, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Na Kang
  - The Ninth Department of Health Care Administration, the Second Medical Center, Chinese PLA General Hospital, Beijing, 100853, China

5
Zeng X, Meng FF, Li X, Zhong KY, Jiang B, Li Y. GHGPR-PPIS: A graph convolutional network for identifying protein-protein interaction site using heat kernel with Generalized PageRank techniques and edge self-attention feature processing block. Comput Biol Med 2024; 168:107683. [PMID: 37984202 DOI: 10.1016/j.compbiomed.2023.107683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 10/10/2023] [Accepted: 11/06/2023] [Indexed: 11/22/2023]
Abstract
Accurately pinpointing protein-protein interaction sites (PPIS) at the molecular level is of utmost significance for annotating protein function and comprehending the mechanisms underpinning various diseases. Numerous computational methods for predicting PPIS have emerged, and they have mitigated the labor and time constraints associated with traditional experimental methods. However, the predictive accuracy of these methods has yet to reach the desired threshold. In this context, we proposed a graph-based computational model called GHGPR-PPIS. This model leveraged a graph convolutional network using heat kernel (GraphHeat) in conjunction with Generalized PageRank techniques (GHGPR) to predict PPIS. Additionally, building upon the GHGPR framework, we devised an edge self-attention feature processing block, further augmenting the performance of the model. Experimental findings demonstrated that GHGPR-PPIS surpassed all competing state-of-the-art models when evaluated on the benchmark test set. On two distinct independent test sets and a specific protein chain, GHGPR-PPIS consistently demonstrated superior generalization performance and practical applicability compared to the comparative model, AGAT-PPIS. Lastly, leveraging the t-SNE dimensionality reduction algorithm and a clustering visualization technique, we carried out an interpretability analysis of the effectiveness of GHGPR-PPIS by comparing the outputs from different stages of the model.
Affiliation(s)
- Xin Zeng
  - College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Fan-Fang Meng
  - College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Xin Li
  - College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Kai-Yang Zhong
  - College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Bei Jiang
  - Yunnan Key Laboratory of Screening and Research on Anti-pathogenic Plant Resources from Western Yunnan, Dali University, Dali, 671000, China
- Yi Li
  - College of Mathematics and Computer Science, Dali University, Dali, 671003, China

6
Xu J, Xu J, Tong Z, Yu S, Liu B, Mu X, Du B, Liu Z, Wang J, Liu D. Investigating the impact of attenuated fluorescence spectra on protein discrimination. Optics Express 2023; 31:35507-35518. [PMID: 38017719 DOI: 10.1364/oe.499362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 09/18/2023] [Indexed: 11/30/2023]
Abstract
Optical remote sensing techniques are promising for the real-time detection and identification of different types of hazardous biological materials. However, fluorescence spectra received from a remote distance suffer from atmospheric attenuation, which alters the spectral shape. To investigate the influence of atmospheric attenuation on characterizing and classifying biological agents, laboratory-measured fluorescence data of fourteen proteins were combined with atmospheric transmission factors from the MODTRAN model at different detection ranges. The multivariate analysis techniques of principal component analysis (PCA) and linear discriminant analysis (LDA), and the predictors of Random Forest and XGBoost were employed to assess the separability and distinguishability of the recorded spectra. The results showed that the spectral-shift effect on attenuated spectra varied as a function of the detection range, the atmospheric visibility, and the spectral distribution. According to the PCA and LDA analysis, the distribution of decomposed factors changed in the spectral explanatory power with the increasing attenuation effect, which was consistent with the hierarchical clustering results. Random Forest exhibited higher performance in classifying protein samples than XGBoost, while the two methods performed similarly in identifying harmful and harmless subgroups of proteins. Fewer subgroups decreased the sensitivity of the classification accuracy to the attenuation effect. Our analysis demonstrated that combining atmospheric transport models to build a fluorescence spectral database is essential for fast identification between spectra, and reduced classification criteria could facilitate the compatibility of spectral database and classification algorithms.

7
Zhang T, Jia J, Chen C, Zhang Y, Yu B. BiGRUD-SA: Protein S-sulfenylation sites prediction based on BiGRU and self-attention. Comput Biol Med 2023; 163:107145. [PMID: 37336062 DOI: 10.1016/j.compbiomed.2023.107145] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Revised: 05/18/2023] [Accepted: 06/06/2023] [Indexed: 06/21/2023]
Abstract
S-sulfenylation is a vital post-translational modification (PTM) of proteins, which is an intermediate in other redox reactions and has implications for signal transduction and protein function regulation. However, there are many restrictions on the experimental identification of S-sulfenylation sites. Therefore, predicting S-sulfenylation sites by computational methods is fundamental to studying protein function and related biological mechanisms. In this paper, we propose a method named BiGRUD-SA based on a bi-directional gated recurrent unit (BiGRU) and a self-attention mechanism to predict protein S-sulfenylation sites. We first use AAC, BLOSUM62, AAindex, EAAC and GAAC to extract features, and fuse them to obtain the original feature space. Next, we use the SMOTE-Tomek method to handle data imbalance. Then, we input the processed data to the BiGRU and use the self-attention mechanism for further feature extraction. Finally, we input the resulting data to a deep neural network (DNN) to identify S-sulfenylation sites. The accuracies on the training set and the independent test set are 96.66% and 95.91% respectively, which indicates that our method is conducive to identifying S-sulfenylation sites. Furthermore, we use a data set of S-sulfenylation sites in Arabidopsis thaliana to verify the generalization ability of the BiGRUD-SA method, and obtain better prediction results.
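Of the encodings listed in the abstract, amino acid composition (AAC) is the simplest: the fraction of each of the 20 standard residues in a peptide. The Python below is a minimal sketch of that standard definition, not the authors' implementation; the residue ordering, the choice to skip non-standard characters, and the example peptide are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 standard residues


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies for a peptide."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:  # assumption: non-standard symbols are ignored
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical peptide window centred on a cysteine.
features = amino_acid_composition("AKLCGTA")
print(len(features))                        # 20
print(features[AMINO_ACIDS.index("A")])     # 2 of 7 residues are alanine
```

Feature fusion as described in the abstract would then amount to concatenating vectors like this one with the other encodings.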
Affiliation(s)
- Tingting Zhang
  - College of Computer Science and Technology, Shandong University, Qingdao, 266237, China
  - College of Information Science and Technology, School of Data Science, Qingdao University of Science and Technology, Qingdao, 266061, China
- Jihua Jia
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China
- Cheng Chen
  - College of Computer Science and Technology, Shandong University, Qingdao, 266237, China
- Yaqun Zhang
  - College of Mathematics and Big Data, Dezhou University, Dezhou, 253023, China
- Bin Yu
  - College of Information Science and Technology, School of Data Science, Qingdao University of Science and Technology, Qingdao, 266061, China
  - School of Data Science, University of Science and Technology of China, Hefei, 230027, China

8
Chen S, Liao Y, Zhao J, Bin Y, Zheng C. PACVP: Prediction of Anti-Coronavirus Peptides Using a Stacking Learning Strategy With Effective Feature Representation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3106-3116. [PMID: 37022025 DOI: 10.1109/tcbb.2023.3238370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Due to the global outbreak of COVID-19 and its variants, antiviral peptides with anti-coronavirus activity (ACVPs) represent a promising new drug candidate for the treatment of coronavirus infection. At present, several computational tools have been developed to identify ACVPs, but the overall prediction performance is still not enough to meet the actual therapeutic application. In this study, we constructed an efficient and reliable prediction model PACVP (Prediction of Anti-CoronaVirus Peptides) for identifying ACVPs based on effective feature representation and a two-layer stacking learning framework. In the first layer, we use nine feature encoding methods with different feature representation angles to characterize the rich sequence information and fuse them into a feature matrix. Secondly, data normalization and unbalanced data processing are carried out. Next, 12 baseline models are constructed by combining three feature selection methods and four machine learning classification algorithms. In the second layer, we input the optimal probability features into the logistic regression algorithm (LR) to train the final model PACVP. The experiments show that PACVP achieves favorable prediction performance on independent test dataset, with ACC of 0.9208 and AUC of 0.9465. We hope that PACVP will become a useful method for identifying, annotating and characterizing novel ACVPs.

9
Roche R, Moussad B, Shuvo MH, Bhattacharya D. E(3) equivariant graph neural networks for robust and accurate protein-protein interaction site prediction. PLoS Comput Biol 2023; 19:e1011435. [PMID: 37651442 PMCID: PMC10499216 DOI: 10.1371/journal.pcbi.1011435] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/10/2023] [Revised: 09/13/2023] [Accepted: 08/15/2023] [Indexed: 09/02/2023]
Abstract
Artificial intelligence-powered protein structure prediction methods have led to a paradigm-shift in computational structural biology, yet contemporary approaches for predicting the interfacial residues (i.e., sites) of protein-protein interaction (PPI) still rely on experimental structures. Recent studies have demonstrated benefits of employing graph convolution for PPI site prediction, but ignore symmetries naturally occurring in 3-dimensional space and act only on experimental coordinates. Here we present EquiPPIS, an E(3) equivariant graph neural network approach for PPI site prediction. EquiPPIS employs symmetry-aware graph convolutions that transform equivariantly with translation, rotation, and reflection in 3D space, providing richer representations for molecular data compared to invariant convolutions. EquiPPIS substantially outperforms state-of-the-art approaches based on the same experimental input, and exhibits remarkable robustness by attaining better accuracy with predicted structural models from AlphaFold2 than what existing methods can achieve even with experimental structures. Freely available at https://github.com/Bhattacharya-Lab/EquiPPIS, EquiPPIS enables accurate PPI site prediction at scale.
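To make the equivariance property concrete, the sketch below applies one EGNN-style coordinate update to a toy point set and checks that it commutes with a rotation, a reflection, and a translation. This is an illustration of the general idea only, not the EquiPPIS architecture: the fixed weight function stands in for a learned network, and the fully connected toy graph, function names, and coordinates are all assumptions.

```python
def egnn_coord_update(coords):
    """One EGNN-style coordinate update on a fully connected toy graph.

    x_i' = x_i + (1 / (n - 1)) * sum_{j != i} (x_i - x_j) * w(||x_i - x_j||^2)
    """
    n = len(coords)
    updated = []
    for i, xi in enumerate(coords):
        shift = [0.0, 0.0, 0.0]
        for j, xj in enumerate(coords):
            if i == j:
                continue
            diff = [a - b for a, b in zip(xi, xj)]
            dist_sq = sum(c * c for c in diff)
            # Assumed stand-in for a learned edge weight; it sees only the
            # squared distance, which is invariant under E(3) transformations.
            weight = 1.0 / (1.0 + dist_sq)
            shift = [s + weight * c for s, c in zip(shift, diff)]
        updated.append([a + s / (n - 1) for a, s in zip(xi, shift)])
    return updated


def rotate_z90(p):
    """Rotate a point by 90 degrees about the z-axis."""
    return [-p[1], p[0], p[2]]


def reflect_x(p):
    """Mirror a point through the yz-plane."""
    return [-p[0], p[1], p[2]]


def translate(p, offset=(1.0, -2.0, 0.5)):
    """Shift a point by a fixed offset."""
    return [a + b for a, b in zip(p, offset)]


def is_equivariant(coords, transform, tol=1e-9):
    """Check update(transform(x)) == transform(update(x)) up to rounding."""
    lhs = egnn_coord_update([transform(p) for p in coords])
    rhs = [transform(p) for p in egnn_coord_update(coords)]
    return all(abs(a - b) <= tol for u, v in zip(lhs, rhs) for a, b in zip(u, v))


toy_coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 1.0]]
for name, transform in [("rotation", rotate_z90), ("reflection", reflect_x), ("translation", translate)]:
    print(name, is_equivariant(toy_coords, transform))
```

The update is equivariant because positions enter only through relative vectors, which transform with the input, scaled by a function of distances, which do not change.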
Affiliation(s)
- Rahmatullah Roche
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Bernard Moussad
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Md Hossain Shuvo
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Debswapna Bhattacharya
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America

10
Qin L, Qi Q, Aikeliyaer A, Hou WQ, Zuo CX, Ma X. Machine learning algorithm can provide assistance for the diagnosis of non-ST-segment elevation myocardial infarction. Postgrad Med J 2023; 99:442-454. [PMID: 37294714 DOI: 10.1136/postgradmedj-2021-141329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 01/28/2022] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Our aim was to use the constructed machine learning (ML) models as auxiliary diagnostic tools to improve the diagnostic accuracy of non-ST-elevation myocardial infarction (NSTEMI). MATERIALS AND METHODS A total of 2878 patients were included in this retrospective study, including 1409 patients with NSTEMI and 1469 patients with unstable angina pectoris. The clinical and biochemical characteristics of the patients were used to construct the initial attribute set. The SelectKBest algorithm was used to determine the most important features. A feature engineering method was applied to create new, strongly correlated features with which to train the ML models and obtain promising results. Based on the experimental dataset, the ML models of extreme gradient boosting, support vector machine, random forest, naïve Bayesian, gradient boosting machines and logistic regression were constructed. Each model was verified on test set data, and the diagnostic performance of each model was comprehensively evaluated. RESULTS The six ML models based on the training set all play an auxiliary role in the diagnosis of NSTEMI. Although the models compared performed differently, the extreme gradient boosting ML model performed the best in terms of accuracy rate (0.95±0.014), precision rate (0.94±0.011), recall rate (0.98±0.003) and F-1 score (0.96±0.007) for NSTEMI. CONCLUSIONS The ML model constructed based on clinical data can be used as an auxiliary tool to improve the accuracy of NSTEMI diagnosis. According to our comprehensive evaluation, the performance of the extreme gradient boosting model was the best.
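The F-1 score is the harmonic mean of precision and recall, so the reported means can be checked for internal consistency. The short Python sketch below does this arithmetic; it treats the reported mean precision and recall as if they came from a single run, which is an assumption, since an average of per-run F-1 scores need not equal the F-1 of averaged precision and recall exactly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Mean precision and recall reported for the extreme gradient boosting model.
f1 = f1_score(0.94, 0.98)
print(round(f1, 2))  # 0.96, in line with the reported F-1 score
```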
Affiliation(s)
- Lian Qin
  - Department of Cardiology, Xinjiang Medical University Affiliated First Hospital, Urumqi, Xinjiang, China
- Quan Qi
  - College of Information Science and Technology, Shihezi University, Shihezi, Xinjiang, China
- Ainiwaer Aikeliyaer
  - Department of Cardiology, Xinjiang Medical University Affiliated First Hospital, Urumqi, Xinjiang, China
- Wen Qing Hou
  - College of Information Science and Technology, Shihezi University, Shihezi, Xinjiang, China
- Chang Xin Zuo
  - College of Information Science and Technology, Shihezi University, Shihezi, Xinjiang, China
- Xiang Ma
  - Department of Cardiology, Xinjiang Medical University Affiliated First Hospital, Urumqi, Xinjiang, China

11
Zhu Q, Luo R. Recent Advances in Biomolecular Recognition. Int J Mol Sci 2023; 24:ijms24098310. [PMID: 37176015 PMCID: PMC10179535 DOI: 10.3390/ijms24098310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Accepted: 04/19/2023] [Indexed: 05/15/2023]
Abstract
Living cells are extremely complicated systems and composed of hundreds of thousands of diverse biomolecules, such as proteins, nucleic acids, and carbohydrates [...].
Affiliation(s)
- Qiang Zhu
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, University of California, Irvine, CA 92697, USA
- Ray Luo
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, University of California, Irvine, CA 92697, USA

12
Aybey E, Gümüş Ö. SENSDeep: An Ensemble Deep Learning Method for Protein-Protein Interaction Sites Prediction. Interdiscip Sci 2023; 15:55-87. [PMID: 36346583 DOI: 10.1007/s12539-022-00543-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Revised: 10/15/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE The determination of which amino acid in a protein interacts with other proteins is important in understanding the functional mechanism of that protein. Although there are experimental methods to detect protein-protein interaction sites (PPISs), these are costly, time-consuming, and require expertise. Therefore, many computational methods have been proposed to accelerate this type of research, but they are generally insufficient to predict PPISs accurately. There is a need for development in this field. METHODS In this study, we introduce a new PPISs prediction method. This method is a sequence-based Stacking ENSemble Deep (SENSDeep) learning method that has an ensemble learning model including the models of RNN, CNN, GRU sequence to sequence (GRUs2s), GRU sequence to sequence with an attention layer (GRUs2satt) and a multilayer perceptron. Two embedded features, secondary structure, and protein sequence information are added to the training data set in addition to twelve existing features to improve the prediction performance of the method. RESULTS SENSDeep trained on the training data set without two extra features obtains a better performance on some of the independent testing data sets than that of the other methods in the literature, especially on scoring metrics of sensitivity, F1, MCC, and AUPRC, having increments up to 63.5%, 19.3%, 18.5%, 11.4%, respectively. It is shown that the added extra features improve the performance of the method by having almost the same performance with less data as the method trained on the data set without these added features. On the other hand, different sizes of the sliding window are tried on the data sets and an optimal sliding window size for SENSDeep is found. Moreover, SENSDeep has also been compared to structure-based methods. Some of these methods have been found to perform better. Using SENSDeep obtained by training with both training data sets, PPISs prediction examples of various proteins that are not in these training data sets are also presented. Furthermore, execution times for SENSDeep and its submodels are shown. AVAILABILITY AND IMPLEMENTATION https://github.com/enginaybey/SENSDeep.
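The sliding window mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the SENSDeep repository: the function name `sliding_windows`, the zero-padding at the sequence ends, and the toy feature values are all assumptions made for illustration.

```python
from typing import List, Sequence


def sliding_windows(features: Sequence[Sequence[float]], window: int) -> List[List[float]]:
    """Return one flattened window of per-residue features for each residue.

    `window` must be odd so that the target residue sits in the centre.
    Positions that fall off either end of the sequence are zero-padded
    (an assumption; other padding schemes are possible).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if not features:
        return []
    dim = len(features[0])
    half = window // 2
    pad = [0.0] * dim
    out = []
    for i in range(len(features)):
        row: List[float] = []
        for j in range(i - half, i + half + 1):
            # Use the neighbour's features when it exists, padding otherwise.
            row.extend(features[j] if 0 <= j < len(features) else pad)
        out.append(row)
    return out


# Toy example: 4 residues with 2 hypothetical features each, window size 3.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
windows = sliding_windows(toy, 3)
```

Each residue is then described by `window * dim` values, so a larger window adds sequence context at the cost of a wider input vector.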
Affiliation(s)
- Engin Aybey
- Department of Health Bioinformatics, Ege University, 35100, Bornova, Izmir, Turkey.
- Rectorate, Marmara University, 34722, Kadıköy, Istanbul, Turkey.
| | - Özgür Gümüş
- Department of Computer Engineering, Ege University, 35100, Bornova, Izmir, Turkey
| |
|
13
|
Wang S, Chen W, Han P, Li X, Song T. RGN: Residue-Based Graph Attention and Convolutional Network for Protein-Protein Interaction Site Prediction. J Chem Inf Model 2022; 62:5961-5974. [PMID: 36398714 DOI: 10.1021/acs.jcim.2c01092] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/19/2022]
Abstract
The prediction of protein-protein interaction sites (PPI sites) is important for understanding biochemical processes, and many computational methods have been proposed. However, most of these methods are time consuming and lack accuracy. Hence, an effective computational method is needed. In this article, we present a novel computational model called RGN (residue-based graph attention and convolutional network) to predict PPI sites. In our paper, the protein is treated as a graph, with each amino acid as a node. The position-specific scoring matrix, hidden Markov model, hydrogen bond estimation algorithm, and ProtBert are applied as node features. The edges are decided by the spatial distance between the amino acids. Then, we utilize a residue-based graph convolutional network and graph attention network to extract deeper features. Finally, the processed node features are fed into the prediction layer. We show the superiority of our model by comparing it with four other protein structure-based methods and five protein sequence-based methods. Our model obtains the best performance on all the evaluation metrics (accuracy, precision, recall, F1 score, Matthews correlation coefficient, area under the receiver operating characteristic curve, and area under the precision recall curve). We also conduct a case study to demonstrate that extracting protein information from the structural perspective is effective and to point out the difficult aspects of PPI site prediction.
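As a rough illustration of edges "decided by the spatial distance between the amino acids", the sketch below builds an adjacency matrix from residue coordinates. The abstract does not give the distance threshold or the atom used to represent each residue, so the `cutoff` value, the function name `distance_adjacency`, and the toy coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]


def distance_adjacency(coords: List[Coord], cutoff: float) -> List[List[int]]:
    """Connect two residues when their Euclidean distance is within `cutoff`.

    `coords` holds one representative 3D position per residue (which atom to
    use is an assumption left to the caller). Self-loops are not added.
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1  # undirected edge
    return adj


# Toy example: three residues close together and one far away (angstroms).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
adjacency = distance_adjacency(toy_coords, cutoff=8.0)
```

The resulting matrix is symmetric, and the distant residue ends up isolated; a graph convolution or attention layer would then aggregate node features only along these edges.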
Affiliation(s)
- Shuang Wang
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Wenqi Chen
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Peifu Han
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Xue Li
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
| | - Tao Song
- College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China; Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Madrid 28031, Spain
| |
|
14
|
Li M, Wu Z, Wang W, Lu K, Zhang J, Zhou Y, Chen Z, Li D, Zheng S, Chen P, Wang B. Protein-Protein Interaction Sites Prediction Based on an Under-Sampling Strategy and Random Forest Algorithm. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3646-3654. [PMID: 34705656 DOI: 10.1109/tcbb.2021.3123269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Computational methods for predicting protein-protein interaction sites can avoid the high cost and long duration of traditional experimental approaches. However, the serious class imbalance between interface and non-interface residues on protein sequences limits the prediction performance of these methods. This work therefore proposes a new strategy, NearMiss-based under-sampling for imbalanced datasets combined with Random Forest classification (NM-RF), to predict protein interaction sites. Herein, the residues on protein sequences are represented by PSSM-derived features, the hydropathy index (HI) and relative solvent accessibility (RSA). To resolve the class imbalance problem, an under-sampling method based on the NearMiss algorithm is adopted to remove some non-interface residues, and the random forest algorithm is then used to perform binary classification on the balanced feature datasets. Experiments show that the accuracy of the NM-RF model reaches 87.6% and 84.3% on Dtestset72 and PDBtestset164 respectively, which demonstrates the effectiveness of the proposed NM-RF method in differentiating interface from non-interface residues.
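The under-sampling step can be sketched as follows. The abstract does not say which NearMiss variant is used, so this sketch assumes NearMiss-1 (keep the majority samples with the smallest mean distance to their k nearest minority samples). The function name `nearmiss1`, the parameter values, and the toy feature vectors are hypothetical, and the random forest step is omitted.

```python
import math
from typing import List, Sequence


def nearmiss1(majority: Sequence[Sequence[float]],
              minority: Sequence[Sequence[float]],
              n_keep: int,
              k: int = 3) -> List[Sequence[float]]:
    """Under-sample the majority class with a NearMiss-1 style rule.

    Each majority sample is scored by its mean distance to its k nearest
    minority samples; the n_keep samples with the smallest score are kept.
    """
    if not minority:
        raise ValueError("minority class must not be empty")
    k = min(k, len(minority))
    scored = []
    for idx, x in enumerate(majority):
        dists = sorted(math.dist(x, m) for m in minority)
        scored.append((sum(dists[:k]) / k, idx))
    scored.sort()  # smallest mean distance first; ties broken by index
    return [majority[idx] for _, idx in scored[:n_keep]]


# Toy example: five non-interface residues (majority), two interface residues.
non_interface = [[0.0, 0.0], [0.9, 1.0], [5.0, 5.0], [1.2, 0.8], [9.0, 9.0]]
interface = [[1.0, 1.0], [1.1, 0.9]]
kept = nearmiss1(non_interface, interface, n_keep=2, k=2)
```

Setting `n_keep` to the size of the minority class yields a balanced set, which could then be passed to any binary classifier.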
|
15
|
PITHIA: Protein Interaction Site Prediction Using Multiple Sequence Alignments and Attention. Int J Mol Sci 2022; 23:ijms232112814. [PMID: 36361606 PMCID: PMC9657891 DOI: 10.3390/ijms232112814] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/22/2022] [Revised: 10/06/2022] [Accepted: 10/07/2022] [Indexed: 11/22/2022] Open
Abstract
Cellular functions are governed by proteins, and, while some proteins work independently, most work by interacting with other proteins. As a result it is crucially important to know the interaction sites that facilitate the interactions between the proteins. Since the experimental methods are costly and time consuming, it is essential to develop effective computational methods. We present PITHIA, a sequence-based deep learning model for protein interaction site prediction that exploits the combination of multiple sequence alignments and learning attention. We demonstrate that our new model clearly outperforms the state-of-the-art models on a wide range of metrics. In order to provide meaningful comparison, we update existing test datasets with new information regarding interaction site, as well as introduce an additional new testing dataset which resolves the shortcomings of the existing ones.
|
16
|
Wang Y, Tang H, Gao C, Ge M, Li Z, Dong Z, Zhao L. Flexibility-aware graph model for accurate epitope identification. Comput Biol Med 2022; 149:106064. [DOI: 10.1016/j.compbiomed.2022.106064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 08/05/2022] [Accepted: 08/27/2022] [Indexed: 11/25/2022]
|
17
|
Pozzati G, Kundrotas P, Elofsson A. Scoring of protein–protein docking models utilizing predicted interface residues. Proteins 2022; 90:1493-1505. [PMID: 35246997 PMCID: PMC9314140 DOI: 10.1002/prot.26330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Revised: 02/23/2022] [Accepted: 02/28/2022] [Indexed: 11/08/2022]
Abstract
Scoring docking solutions is a difficult task, and many methods have been developed for this purpose. In docking, only a handful of the hundreds of thousands of models generated by docking algorithms are acceptable, causing difficulties when developing scoring functions. Today's best scoring functions can significantly increase the number of top‐ranked models but still fail for most targets. Here, we examine the possibility of utilizing predicted interface residues to score docking models generated during the scan stage of a docking algorithm. Many methods have been developed to infer the regions of a protein surface that interact with another protein, but most have not been benchmarked using docking algorithms. This study systematically tests different interface prediction methods for scoring >300.000 low‐resolution rigid‐body template free docking decoys. Overall we find that contact‐based interface prediction by BIPSPI is the best method to score docking solutions, with >12% of first ranked docking models being acceptable. Additional experiments indicated precision as a high‐importance metric when estimating interface prediction quality, focusing on docking constraints production. Finally, we discussed several limitations for adopting interface predictions as constraints in a docking protocol.
Affiliation(s)
- Gabriele Pozzati
- Department of Biochemistry and Biophysics and Science for Life Laboratory Stockholm University Solna Sweden
| | - Petras Kundrotas
- Department of Biochemistry and Biophysics and Science for Life Laboratory Stockholm University Solna Sweden
- Center for Bioinformatics and Department of Molecular Biosciences University of Kansas Lawrence Kansas USA
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory Stockholm University Solna Sweden
| |
|
18
|
Ellethy, ME H, Chandra SS, Nasrallah FA. Deep Neural Networks Predict the Need for CT in Pediatric Mild Traumatic Brain Injury: A Corroboration of the PECARN Rule. J Am Coll Radiol 2022; 19:769-778. [DOI: 10.1016/j.jacr.2022.02.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Revised: 02/24/2022] [Accepted: 02/25/2022] [Indexed: 11/28/2022]
|
19
|
Yuan Q, Chen J, Zhao H, Zhou Y, Yang Y. Structure-aware protein-protein interaction site prediction using deep graph convolutional network. Bioinformatics 2021; 38:125-132. [PMID: 34498061 DOI: 10.1093/bioinformatics/btab643] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 03/09/2021] [Revised: 08/03/2021] [Accepted: 09/03/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Protein-protein interactions (PPI) play crucial roles in many biological processes, and identifying PPI sites is an important step for mechanistic understanding of diseases and design of novel drugs. Since experimental approaches for PPI site identification are expensive and time-consuming, many computational methods have been developed as screening tools. However, these methods are mostly based on neighbored features in sequence, and thus limited to capture spatial information. RESULTS We propose a deep graph-based framework deep Graph convolutional network for Protein-Protein-Interacting Site prediction (GraphPPIS) for PPI site prediction, where the PPI site prediction problem was converted into a graph node classification task and solved by deep learning using the initial residual and identity mapping techniques. We showed that a deeper architecture (up to eight layers) allows significant performance improvement over other sequence-based and structure-based methods by more than 12.5% and 10.5% on AUPRC and MCC, respectively. Further analyses indicated that the predicted interacting sites by GraphPPIS are more spatially clustered and closer to the native ones even when false-positive predictions are made. The results highlight the importance of capturing spatially neighboring residues for interacting site prediction. AVAILABILITY AND IMPLEMENTATION The datasets, the pre-computed features, and the source codes along with the pre-trained models of GraphPPIS are available at https://github.com/biomed-AI/GraphPPIS. The GraphPPIS web server is freely available at https://biomed.nscc-gz.cn/apps/GraphPPIS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qianmu Yuan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Jianwen Chen
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Huiying Zhao
- Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510000, China
| | - Yaoqi Zhou
- Peking University Shenzhen Graduate School, Shenzhen 518055, China; Shenzhen Bay Laboratory, Shenzhen 518055, China; Institute for Glycomics, Griffith University, Parklands Drive, Southport, QLD 4215, Australia
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China; Key Laboratory of Machine Intelligence and Advanced Computing of MOE, Sun Yat-sen University, Guangzhou 510000, China
| |
|
20
|
Wang P, Zhang G, Yu ZG, Huang G. A Deep Learning and XGBoost-Based Method for Predicting Protein-Protein Interaction Sites. Front Genet 2021; 12:752732. [PMID: 34764983 PMCID: PMC8576272 DOI: 10.3389/fgene.2021.752732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2021] [Accepted: 09/20/2021] [Indexed: 11/29/2022] Open
Abstract
Knowledge about protein-protein interactions is beneficial in understanding cellular mechanisms. Protein-protein interactions are usually determined according to their protein-protein interaction sites. Due to the limitations of current techniques, it is still a challenging task to detect protein-protein interaction sites. In this article, we presented a method based on deep learning and XGBoost (called DeepPPISP-XGB) for predicting protein-protein interaction sites. The deep learning model served as a feature extractor to remove redundant information from protein sequences. The Extreme Gradient Boosting algorithm was used to construct a classifier for predicting protein-protein interaction sites. The DeepPPISP-XGB achieved the following results: area under the receiver operating characteristic curve of 0.681, a recall of 0.624, and area under the precision-recall curve of 0.339, being competitive with the state-of-the-art methods. We also validated the positive role of global features in predicting protein-protein interaction sites.
Affiliation(s)
- Pan Wang
- School of Electrical Engineering, Shaoyang University, Shaoyang, China
| | - Guiyang Zhang
- School of Electrical Engineering, Shaoyang University, Shaoyang, China
| | - Zu-Guo Yu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, China
| | - Guohua Huang
- School of Electrical Engineering, Shaoyang University, Shaoyang, China
| |
|
21
|
Li M, Wang Y, Li F, Zhao Y, Liu M, Zhang S, Bin Y, Smith AI, Webb GI, Li J, Song J, Xia J. A Deep Learning-Based Method for Identification of Bacteriophage-Host Interaction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1801-1810. [PMID: 32813660 PMCID: PMC8703204 DOI: 10.1109/tcbb.2020.3017386] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 06/11/2023]
Abstract
Multi-drug resistance (MDR) has become one of the greatest threats to human health worldwide, and novel treatment methods of infections caused by MDR bacteria are urgently needed. Phage therapy is a promising alternative to solve this problem, to which the key is correctly matching target pathogenic bacteria with the corresponding therapeutic phage. Deep learning is powerful for mining complex patterns to generate accurate predictions. In this study, we develop PredPHI (Predicting Phage-Host Interactions), a deep learning-based tool capable of predicting the host of phages from sequence data. We collect >3000 phage-host pairs along with their protein sequences from PhagesDB and GenBank databases and extract a set of features. Then we select high-quality negative samples based on the K-Means clustering method and construct a balanced training set. Finally, we employ a deep convolutional neural network to build the predictive model. The results indicate that PredPHI can achieve a predictive performance of 81 percent in terms of the area under the receiver operating characteristic curve on the test set, and the clustering-based method is significantly more robust than that based on randomly selecting negative samples. These results highlight that PredPHI is a useful and accurate tool for identifying phage-host interactions from sequence data.
|
22
|
Hong Z, Liu J, Chen Y. An interpretable machine learning method for homo-trimeric protein interface residue-residue interaction prediction. Biophys Chem 2021; 278:106666. [PMID: 34418678 DOI: 10.1016/j.bpc.2021.106666] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/25/2021] [Revised: 08/09/2021] [Accepted: 08/09/2021] [Indexed: 12/29/2022]
Abstract
Protein-protein interaction plays an important role in life activities. A more fine-grained analysis, such as at the residue and atom level, helps us better understand the mechanism of inter-protein interaction and supports drug design. Developing efficient computational methods to reduce trial and error, and to assist experimental researchers in determining complex structures, is an ongoing effort in the field. Trimeric protein interfaces, especially those of homotrimers, have rarely been studied. In this paper, we propose an interpretable machine learning method for homo-trimeric protein interface residue pair prediction. Structure, sequence, and physicochemical information are integrated as the feature input for model training. A graph model is used to represent intra-protein spatial information. Matrix factorization captures the interactions between different features. A kernel function is designed to automatically acquire the adjacent information of the target residue pairs. The accuracy reaches 54.5% on an independent test set. Sequence and structure alignment exhibit the ability of model self-study. Our model indicates the biological significance between sequence and structure, and could help reduce trial and error in fields such as protein complex determination and protein-protein docking. SIGNIFICANCE: Protein complex structures are significant for understanding protein function and for functional protein design. As data have accumulated, some computational tools have been developed for protein complex residue contact prediction, which is one of the most significant steps in complex structure prediction. For homo-trimeric proteins, however, sequence-based deep learning predictors are infeasible for homologous sequences, and the algorithmic black box prevents us from understanding each step of the operation. We therefore propose an interpretable machine learning method for homo-trimeric protein interface residue-residue interaction prediction, and the predictor shows good performance. Our work provides a computational aid for determining homo-trimeric protein interface residue pairs, which will be further verified by wet experiments, and supports downstream work such as protein-protein docking, protein complex structure prediction and drug design.
Affiliation(s)
- Zhonghua Hong
- Jiaxing Hospital of Traditional Chinese Medicine, Jiaxing University, Jiaxing 314001, PR China.
| | - Jiale Liu
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, PR China
| | - Yinggao Chen
- Shantou Central Hospital, Shantou 515041, PR China.
| |
|
23
|
Jiang M, Zhao B, Luo S, Wang Q, Chu Y, Chen T, Mao X, Liu Y, Wang Y, Jiang X, Wei DQ, Xiong Y. NeuroPpred-Fuse: an interpretable stacking model for prediction of neuropeptides by fusing sequence information and feature selection methods. Brief Bioinform 2021; 22:6350884. [PMID: 34396388 DOI: 10.1093/bib/bbab310] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/16/2021] [Revised: 07/01/2021] [Accepted: 07/18/2021] [Indexed: 12/13/2022] Open
Abstract
Neuropeptides, which act as signaling molecules in the nervous system of various animals, play crucial roles in a wide range of physiological functions and hormone regulation behaviors. Neuropeptides offer many opportunities for the discovery of new drugs and targets for the treatment of neurological diseases. In recent years, several data-driven computational predictors have been developed for various types of bioactive peptides, but little work has so far addressed neuropeptides. In this work, we developed an interpretable stacking model, named NeuroPpred-Fuse, for the prediction of neuropeptides through fusing a variety of sequence-derived features and feature selection methods. Specifically, we used six types of sequence-derived features to encode the peptide sequences and then combined them. In the first layer, we ensembled three base classifiers and four feature selection algorithms, which select non-redundant important features complementarily. In the second layer, the output of the first layer was merged and fed into a logistic regression (LR) classifier to train the model. Moreover, we analyzed the selected features and explained their feasibility. Experimental results show that our model achieved 90.6% accuracy and 95.8% AUC on the independent test set, outperforming the state-of-the-art models. In addition, we showed the distribution of the features selected by these tree models and compared the results on the training set with those on the test set. These results showed that our model has a certain generalization ability. Therefore, we expect that our model would provide important advances in the discovery of neuropeptides as new drugs for the treatment of neurological diseases.
Affiliation(s)
- Mingming Jiang
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Bowen Zhao
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Shenggan Luo
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Qiankun Wang
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanyi Chu
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Tianhang Chen
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xueying Mao
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yatong Liu
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanjing Wang
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xue Jiang
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| |
|
24
|
Zhang S, Wang L, Zhao L, Li M, Liu M, Li K, Bin Y, Xia J. An improved DNA-binding hot spot residues prediction method by exploring interfacial neighbor properties. BMC Bioinformatics 2021; 22:253. [PMID: 34000983 PMCID: PMC8130120 DOI: 10.1186/s12859-020-03871-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/29/2020] [Accepted: 11/09/2020] [Indexed: 11/29/2022] Open
Abstract
Background DNA-binding hot spots are dominant and fundamental residues that contribute most of the binding free energy yet account for only a small portion of protein–DNA interfaces. As experimental methods for identifying hot spots are time-consuming and costly, high-efficiency computational approaches are emerging as alternatives to experimental methods. Results Herein, we present a new computational method, termed inpPDH, for hot spot prediction. To improve the prediction performance, we extract hybrid features which incorporate traditional features and new interfacial neighbor properties. To remove redundant and irrelevant features, a two-step feature selection strategy is employed. Finally, a subset of 7 optimal features is chosen to construct the predictor using a support vector machine. The results on the benchmark dataset show that the proposed method yields significantly better prediction accuracy than previously published methods in the literature. Moreover, a user-friendly web server for inpPDH has been established and is freely available at http://bioinfo.ahu.edu.cn/inpPDH. Conclusions We have developed an accurate improved prediction model, inpPDH, for hot spot residues in protein–DNA binding interfaces, given the structure of a protein–DNA complex. Moreover, we identify a comprehensive and useful feature subset, including the proposed interfacial neighbor features, that is well suited to identifying hot spot residues. Our results indicate that these features are more effective than the conventional features considered previously, and that the combination of interfacial neighbor features and traditional features may support the creation of a discriminative feature set for efficient prediction of hot spot residues in protein–DNA complexes. Supplementary information Supplementary information accompanies this paper at 10.1186/s12859-020-03871-1.
Affiliation(s)
- Sijia Zhang
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Lihua Wang
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
| | - Le Zhao
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
| | - Menglu Li
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
| | - Mengya Liu
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
| | - Ke Li
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
| | - Yannan Bin
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Junfeng Xia
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| |
|
25
|
Zhu Q, Gu Y, Hu L, Gaudin T, Fan M, Ma J. Shear viscosity prediction of alcohols, hydrocarbons, halogenated, carbonyl, nitrogen-containing, and sulfur compounds using the variable force fields. J Chem Phys 2021; 154:074502. [PMID: 33607909 DOI: 10.1063/5.0038267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022] Open
Abstract
Viscosity of organic liquids is an important physical property in applications of printing, pharmaceuticals, oil extraction, engineering, and chemical processes. Experimental measurement is a direct but time-consuming process. Accurately predicting the viscosity across a broad range of chemical diversity is still a great challenge. In this work, a protocol named Variable Force Field (VaFF) was implemented to efficiently vary the force field parameters, especially λvdW, for the van der Waals term for the shear viscosity prediction of 75 organic liquid molecules with viscosity ranging from -9 to 0 in their natural logarithm and containing diverse chemical functional groups, such as alcoholic hydroxyl, carbonyl, and halogenated groups. Feature learning was applied for the viscosity prediction, and the selected features indicated that the hydrogen bonding interactions and the number of atoms and rings play important roles in the property of viscosity. The shear viscosity prediction of alcohols is very difficult owing to the existence of relatively strong intermolecular hydrogen bonding interactions, as reflected by density functional theory binding energies. From radial and spatial distribution functions of methanol, we found that the van der Waals related parameters λvdW are more crucial to the viscosity prediction than the rotation related parameters, λtor. With the variable λvdW-based all-atom optimized potentials for liquid simulations force field, a great improvement was observed in the viscosity prediction for alcohols. The simplicity and uniformity of VaFF make it an efficient tool for the prediction of viscosity and other related properties in the rational design of materials with specific properties.
Affiliation(s)
- Qiang Zhu
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education Institute of Theoretical and Computational Chemistry School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People's Republic of China
| | - Yuming Gu
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education Institute of Theoretical and Computational Chemistry School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People's Republic of China
| | - Limu Hu
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education Institute of Theoretical and Computational Chemistry School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People's Republic of China
| | - Théophile Gaudin
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education Institute of Theoretical and Computational Chemistry School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People's Republic of China
| | - Mengting Fan
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education Institute of Theoretical and Computational Chemistry School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People's Republic of China
| | - Jing Ma
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education Institute of Theoretical and Computational Chemistry School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People's Republic of China
| |
Collapse
|
26
|
LncLocation: Efficient Subcellular Location Prediction of Long Non-Coding RNA-Based Multi-Source Heterogeneous Feature Fusion. Int J Mol Sci 2020; 21:ijms21197271. [PMID: 33019721 PMCID: PMC7582431 DOI: 10.3390/ijms21197271] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/23/2020] [Revised: 09/27/2020] [Accepted: 09/28/2020] [Indexed: 12/13/2022] Open
Abstract
Recent studies show that the subcellular location of long non-coding RNAs (lncRNAs) can provide significant information on their function. Because experimental data are scarce, the number of lncRNAs with experimentally verified subcellular localization is very limited, and the numbers of lncRNAs located in different organelles are wildly imbalanced. The prediction of the subcellular location of lncRNAs is therefore a multi-class, small-sample, imbalanced classification problem. The data imbalance causes machine learning models to recognize small data subsets poorly, which is a challenging problem in existing research. In this study, we integrate multi-source features to construct a sequence-based computational tool, lncLocation, to predict the subcellular location of lncRNAs. An autoencoder is used to enhance part of the features, and a binomial distribution-based filtering method and recursive feature elimination (RFE) are used to filter some of the features. This improves the representation ability of the data and mitigates the problem of imbalanced multi-class data. Through comprehensive experiments on different feature combinations and machine learning models, we select the optimal feature and classifier scheme to construct the subcellular location prediction tool. LncLocation achieves 87.78% accuracy using 5-fold cross-validation on the benchmark data, which is higher than state-of-the-art tools, and the classification performance, especially for small classes, is improved significantly.
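The abstract reports 5-fold cross-validation on imbalanced multi-class data. The sketch below shows one common way to build such folds so that every fold keeps roughly the same class proportions (stratification). It is an illustration only, not the lncLocation implementation: the function name `stratified_k_fold`, the class labels, and the class sizes are all hypothetical, and the abstract does not say whether the authors stratified their folds.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Return k lists of sample indices with near-equal class proportions.

    Hypothetical helper for illustration; not taken from lncLocation.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin so each fold receives
        # a near-equal share of this class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Assumed toy labels, deliberately imbalanced (not the benchmark data).
labels = ["cytoplasm"] * 50 + ["nucleus"] * 20 + ["ribosome"] * 5
folds = stratified_k_fold(labels, k=5)

for fold_id, test_idx in enumerate(folds):
    counts = defaultdict(int)
    for idx in test_idx:
        counts[labels[idx]] += 1
    print(fold_id, dict(counts))
```

Each fold serves once as the test set while the remaining four are used for training; with plain random splitting, a rare class could be missing from some folds entirely, which is why stratification is often preferred for imbalanced data.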
|
27
|
Bin Y, Zhang W, Tang W, Dai R, Li M, Zhu Q, Xia J. Prediction of Neuropeptides from Sequence Information Using Ensemble Classifier and Hybrid Features. J Proteome Res 2020; 19:3732-3740. [DOI: 10.1021/acs.jproteome.0c00276] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/11/2022]
Affiliation(s)
- Yannan Bin
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
- Wei Zhang
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- Wending Tang
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- Ruyu Dai
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- Menglu Li
- School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
- Qizhi Zhu
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- Junfeng Xia
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
|
28
|
Liang Y, Wang H, Yang J, Li X, Dai C, Shao P, Tian G, Wang B, Wang Y. A Deep Learning Framework to Predict Tumor Tissue-of-Origin Based on Copy Number Alteration. Front Bioeng Biotechnol 2020; 8:701. [PMID: 32850687 PMCID: PMC7419421 DOI: 10.3389/fbioe.2020.00701] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 03/29/2020] [Accepted: 06/04/2020] [Indexed: 12/18/2022] Open
Abstract
Cancer of unknown primary site (CUPS) is a type of metastatic tumor for which the site of tumor origin cannot be determined. Precise diagnosis of the tissue of origin for metastatic CUPS is crucial for developing treatment schemes to improve patient prognosis. Recently, many studies have used various cancer biomarkers to predict the tissue-of-origin (TOO) of CUPS. However, very few of them use copy number alteration (CNA) to trace TOO. In this paper, a two-step computational framework called CNA_origin is introduced to predict the tissue-of-origin of a tumor from its gene CNA levels. CNA_origin sets up an intelligent deep-learning network mainly composed of an autoencoder and a convolutional neural network (CNN). Based on real datasets released from a public database, CNA_origin had an overall accuracy of 83.81% on 10-fold cross-validation and 79% on independent datasets for predicting tumor origin, which improved the accuracy by 7.75% and 9.72%, respectively, compared with the method published in a previous paper. Our results suggest that the autoencoder model can extract key characteristics of CNA and that the CNN classifier model developed in this study can predict the origin of tumors robustly and effectively. CNA_origin was written in Python and can be downloaded from https://github.com/YingLianghnu/CNA_origin.
Affiliation(s)
- Ying Liang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
- Haifeng Wang
- Department of Urology, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, China
- Xiong Li
- School of Software, East China Jiaotong University, Nanchang, China
- Chan Dai
- Geneis (Beijing) Co. Ltd., Beijing, China
- Peng Shao
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
- Geng Tian
- Geneis (Beijing) Co. Ltd., Beijing, China
- Bo Wang
- Geneis (Beijing) Co. Ltd., Beijing, China
- Yinglong Wang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
|
29
|
Wang Y, Wang M, Yu P, Zuo L, Zhou Q, Zhou X, Zhu H. MicroRNA-126 Modulates Palmitate-Induced Migration in HUVECs by Downregulating Myosin Light Chain Kinase via the ERK/MAPK Pathway. Front Bioeng Biotechnol 2020; 8:913. [PMID: 32850751 PMCID: PMC7411007 DOI: 10.3389/fbioe.2020.00913] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/29/2020] [Accepted: 07/15/2020] [Indexed: 12/23/2022] Open
Abstract
MicroRNA-126 (miR-126) is an endothelial-specific microRNA that has shown beneficial effects on endothelial dysfunction. However, the underlying molecular mechanism is unclear. The present study evaluated the effects of miR-126 on cell migration and the underlying mechanism in HUVECs treated with palmitate. Overexpression of miR-126 decreased cell migration in palmitate-treated HUVECs, with decreased MLCK expression and a subsequent decrease in the phosphorylated MLC level. miR-126 also decreased the phosphorylation of MYPT1 in palmitate-treated HUVECs. In addition, miR-126 decreased expression of the NADPH oxidase subunits p67 and Rac family small GTPase 1, with a subsequent decrease in cell apoptosis. Moreover, the phosphorylation of ERK was reduced by miR-126 in palmitate-induced HUVECs. Taken together, the present study showed that the effect of miR-126 on cell migration and cell apoptosis is mediated through downregulation of MLCK via the ERK/MAPK pathway.
Affiliation(s)
- Yi Wang
- Department of Biological Engineering, School of Life Sciences, Anhui Medical University, Hefei, China
- Laboratory of Molecular Biology and Department of Biochemistry, Anhui Medical University, Hefei, China
- Mei Wang
- General Department of Hyperbaric Oxygen, Hefei Hospital Affiliated to Anhui Medical University, Hefei, China
- Pei Yu
- Laboratory of Molecular Biology and Department of Biochemistry, Anhui Medical University, Hefei, China
- Li Zuo
- Laboratory of Molecular Biology and Department of Biochemistry, Anhui Medical University, Hefei, China
- Qing Zhou
- Laboratory of Molecular Biology and Department of Biochemistry, Anhui Medical University, Hefei, China
- Xiaomei Zhou
- General Department of Hyperbaric Oxygen, Hefei Hospital Affiliated to Anhui Medical University, Hefei, China
- Huaqing Zhu
- Laboratory of Molecular Biology and Department of Biochemistry, Anhui Medical University, Hefei, China
|
30
|
Zhou M, Bian K, Hu F, Lai W. A New Method Based on CEEMD Combined With Iterative Feature Reduction for Aided Diagnosis of Epileptic EEG. Front Bioeng Biotechnol 2020; 8:669. [PMID: 32695761 PMCID: PMC7338793 DOI: 10.3389/fbioe.2020.00669] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/17/2020] [Accepted: 05/28/2020] [Indexed: 11/26/2022] Open
Abstract
In the clinical diagnosis of epileptic diseases, the intelligent diagnosis of epileptic electroencephalogram (EEG) signals has become a research focus in the field of brain diseases. To address the problem that diagnosis is time-consuming and easily influenced by subjective human factors, artificial intelligence pattern recognition algorithms have been applied to EEG signal recognition. However, the commonly used empirical mode decomposition (EMD) signal decomposition algorithm does not consider the problem of mode aliasing, and the EEG features obtained by feature extraction may be mixed with unimportant features that affect the classification accuracy. In this paper, we propose a new method based on complementary ensemble empirical mode decomposition (CEEMD) combined with iterative feature reduction for aided diagnosis of epileptic EEG. First, the evaluation indexes for decomposing and reconstructing signals by several methods were compared, and CEEMD was selected as the signal decomposition method. Then, support vector machine recursive feature elimination (SVM-RFE) was used to reduce the 9 features extracted from the EEG data. A support vector classification model optimized by the gray wolf optimizer (GWO-SVC) was established for different feature subsets. By comparing the training-set and test-set classification accuracy of the different feature subsets, and considering the model complexity reflected by the number of features selected by SVM-RFE, the analysis showed that the 6-feature subset, with fewer features and higher classification accuracy, could reflect the key information of epileptic EEG. The classification accuracy was 99.38% on the training set and as high as 100% on the test set. The recognition time was only 1.6551 s.
Finally, to verify the reliability of the proposed algorithm, it was compared with the classification model established from the raw EEG signals and with optimization models established by other intelligent optimization algorithms. The proposed algorithm has higher classification accuracy and a shorter recognition time than the other processing methods. The experimental results show that CEEMD combined with SVM-RFE is feasible for rapid and accurate recognition of EEG signals, which provides a theoretical basis for the aided diagnosis of epilepsy.
Affiliation(s)
- Mengran Zhou
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, China
- State Key Laboratory of Mining Response and Disaster Prevention and Control in Deep Coal Mines, Anhui University of Science and Technology, Huainan, China
- Kai Bian
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, China
- Feng Hu
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, China
- Wenhao Lai
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, China
|
31
|
Zhu C, Zhang X, Kourkoumelis N, Shen Y, Huang W. Integrated Analysis of DEAD-Box Helicase 56: A Potential Oncogene in Osteosarcoma. Front Bioeng Biotechnol 2020; 8:588. [PMID: 32671031 PMCID: PMC7332757 DOI: 10.3389/fbioe.2020.00588] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/01/2020] [Accepted: 05/14/2020] [Indexed: 01/04/2023] Open
Abstract
Background: Osteosarcoma is a solid tumor common in the musculoskeletal system. The DEAD-box helicase (DDX) families play an important role in tumorigenesis and proliferation. Objective: To screen potential molecular targets in osteosarcoma and elucidate their relationship with DDX56. Methods: We employed the Gene Expression Omnibus and The Cancer Genome Atlas datasets for preliminary screening. DDX56 expression was measured by RT-qPCR in three osteosarcoma cell lines. Biological roles of DDX56 were explored by Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, and Ingenuity Pathway Analysis. Cell proliferation, cycle, and apoptosis assays were performed using the Lentivirus™ knockdown technique. Results: DDX56 expression was regularly upregulated in osteosarcoma tissue and cell lines, while DDX56 knockdown inhibited cell proliferation and promoted cell apoptosis. Conclusions: The findings suggest DDX56 as a potential therapeutic target for the treatment of osteosarcoma.
Affiliation(s)
- Chen Zhu
- Division of Life Sciences and Medicine, Department of Orthopedics, The First Affiliated Hospital of USTC, University of Science and Technology of China, Hefei, China
- Xianzuo Zhang
- Division of Life Sciences and Medicine, Department of Orthopedics, The First Affiliated Hospital of USTC, University of Science and Technology of China, Hefei, China
- Nikolaos Kourkoumelis
- Department of Medical Physics, School of Health Sciences, University of Ioannina, Ioannina, Greece
- Yong Shen
- Institute on Aging and Brain Disorders, The First Affiliated Hospital of University of Science and Technology of China, Hefei, China
- Division of Life Sciences and Medicine, Neurodegenerative Disorder Research Center, University of Science and Technology of China, Hefei, China
- Wei Huang
- Division of Life Sciences and Medicine, Department of Orthopedics, The First Affiliated Hospital of USTC, University of Science and Technology of China, Hefei, China
|
32
|
Li XX, Lin TT, Liu B, Wei W. Diagnosis of Cervical Cancer With Parametrial Invasion on Whole-Tumor Dynamic Contrast-Enhanced Magnetic Resonance Imaging Combined With Whole-Lesion Texture Analysis Based on T2-Weighted Images. Front Bioeng Biotechnol 2020; 8:590. [PMID: 32596230 PMCID: PMC7300256 DOI: 10.3389/fbioe.2020.00590] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/26/2020] [Accepted: 05/14/2020] [Indexed: 12/17/2022] Open
Abstract
Purpose: To evaluate the diagnostic value of the combination of whole-tumor dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) and whole-lesion texture features based on T2-weighted images for cervical cancer with parametrial invasion. Materials and Methods: Sixty-two patients with cervical cancer (27 with parametrial invasion and 35 without invasion) preoperatively underwent routine MRI and DCE-MRI examinations. DCE-MRI parameters (Ktrans, Kep, and Ve) and texture features (mean, skewness, kurtosis, uniformity, energy, and entropy) based on T2-weighted images were acquired by two observers. All parameters of the invasion and non-invasion groups were analyzed by one-way analysis of variance. The diagnostic efficiency of significant variables was assessed using receiver operating characteristic analysis. Results: The invasion group demonstrated significantly higher Ktrans (0.335 ± 0.050 vs. 0.269 ± 0.079; p < 0.001), lower energy values (0.503 ± 0.093 vs. 0.602 ± 0.087; p < 0.001), and higher entropy values (1.391 ± 0.193 vs. 1.24 ± 0.129; p < 0.001) than the non-invasion group. Optimal diagnostic performance (area under the curve [AUC], 0.925; sensitivity, 0.935; specificity, 0.829) was obtained by the combination of Ktrans, energy, and entropy values. The AUC values of Ktrans (0.788), energy (0.761), entropy (0.749), the combination of Ktrans and energy (0.814), the combination of Ktrans and entropy (0.727), and the combination of energy and entropy (0.619) were lower than that of the combination of Ktrans, energy, and entropy values. Conclusion: The combination of DCE-MRI and texture analysis is a promising method for diagnosing cervical cancer with parametrial infiltration. Moreover, the combination of Ktrans, energy, and entropy is more valuable than any one alone, especially in improving diagnostic sensitivity.
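The abstract summarizes diagnostic performance with AUC, sensitivity, and specificity. As a reminder of how these quantities are defined, here is a minimal sketch on made-up data. The labels, scores, threshold of 0.5, and function names are assumptions for illustration; they are not the study's data or analysis pipeline.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = invasion, 0 = no invasion.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed cutoff of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)  # 0.75 0.75 0.8125
```

Sensitivity and specificity depend on the chosen cutoff, whereas AUC summarizes ranking quality over all cutoffs, which is why the abstract can compare marker combinations by AUC alone.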
Affiliation(s)
- Xin-Xiang Li
- Jiangsu Key Laboratory of Molecular and Functional Imaging, Department of Radiology, Zhongda Hospital, Medical School, Southeast University, Nanjing, China
- Ting-Ting Lin
- Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Bin Liu
- Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Wei Wei
- Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
|