Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Cited by Other Article(s)

Zhu Y, He J, Wei R, Liu J. Construction and experimental validation of a novel ferroptosis-related gene signature for myelodysplastic syndromes. Immun Inflamm Dis 2024;12:e1221. [PMID: 38578040 PMCID: PMC10996383 DOI: 10.1002/iid3.1221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/26/2024] [Accepted: 03/03/2024] [Indexed: 04/06/2024] Open

Abstract

BACKGROUND

Myelodysplastic syndromes (MDS) are clonal hematopoietic disorders characterized by morphological abnormalities and peripheral blood cytopenias, carrying a risk of progression to acute myeloid leukemia. Although ferroptosis is a promising target for MDS treatment, the specific roles of ferroptosis-related genes (FRGs) in MDS diagnosis have not been elucidated.

METHODS

MDS-related microarray data were obtained from the Gene Expression Omnibus database. A comprehensive analysis of FRG expression levels in patients with MDS and controls was conducted, followed by the use of multiple machine learning methods to establish prediction models. The predictive ability of the optimal model was evaluated using nomogram analysis and an external data set. Functional analysis was applied to explore the underlying mechanisms. The mRNA levels of the model genes were verified in MDS clinical samples by quantitative real-time polymerase chain reaction (qRT-PCR).

RESULTS

The extreme gradient boosting model demonstrated the best performance, leading to the identification of a panel of six signature genes: SREBF1, PTPN6, PARP9, MAP3K11, MDM4, and EZH2. Receiver operating characteristic curves indicated that the model exhibited high accuracy in predicting MDS diagnosis, with area under the curve values of 0.989 and 0.962 for the training and validation cohorts, respectively. Functional analysis revealed significant associations between these genes and the infiltrating immune cells. The expression levels of these genes were successfully verified in MDS clinical samples.

CONCLUSION

Our study is the first to identify a novel model using FRGs to predict the risk of developing MDS. FRGs may be implicated in MDS pathogenesis through immune-related pathways. These findings highlight the intricate correlation between ferroptosis and MDS, offering insights that may aid in identifying potential therapeutic targets for this debilitating disorder.
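The area under the curve values quoted in the Results can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes that quantity directly from labels and scores. The function name `roc_auc` and the 0/1 label convention are assumptions made for illustration; this is not the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = case, 0 = control).
    scores: iterable of model scores, higher meaning "more likely a case".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correctly ranked pair
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the sample size, which is acceptable for small cohorts; a rank-based formulation gives the same value more efficiently.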

Zhao R, Xie R, Ren N, Li Z, Zhang S, Liu Y, Dong Y, Yin AA, Zhao Y, Bai S. Correlation between intraosseous thermal change and drilling impulse data during osteotomy within autonomous dental implant robotic system: An in vitro study. Clin Oral Implants Res 2024;35:258-267. [PMID: 38031528 DOI: 10.1111/clr.14222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2022] [Revised: 09/05/2023] [Accepted: 11/16/2023] [Indexed: 12/01/2023]

Abstract

OBJECTIVES

This study aims at examining the correlation of intraosseous temperature change with drilling impulse data during osteotomy and establishing real-time temperature prediction models.

MATERIALS AND METHODS

An in vitro bovine rib model was combined with an Autonomous Dental Implant Robotic System (ADIR), in which intraosseous temperature and drilling impulse data were measured using an infrared camera and a six-axis force/torque sensor, respectively. A total of 800 drills with different parameters (e.g., drill diameter, drill wear, drilling speed, and thickness of cortical bone) were performed, along with an independent test set of 200 drills. Pearson correlation analysis was used to assess linear relationships. Four machine learning (ML) algorithms (support vector regression [SVR], ridge regression [RR], extreme gradient boosting [XGBoost], and artificial neural network [ANN]) were used to build prediction models.

RESULTS

By incorporating different parameters, it was found that lower drilling speed, smaller drill diameter, more severe wear, and thicker cortical bone were associated with higher intraosseous temperature changes and longer exposure times, and were accompanied by alterations in drilling impulse data. Pearson correlation analysis further identified a highly linear correlation between drilling impulse data and thermal changes. Finally, four ML prediction models were established, among which the XGBoost model showed the best performance, with the lowest error measurements on the test set.

CONCLUSION

This proof-of-concept study highlighted a close correlation between drilling impulse data and intraosseous temperature change during osteotomy. The ML prediction models may inspire future improvements in the prevention of thermal bone injury and in the intelligent design of robot-assisted implant surgery.
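The Pearson correlation analysis mentioned in the Methods measures how close two paired series are to a straight-line relationship. A minimal sketch follows. The function name `pearson_r` and the idea of pairing one impulse value with one temperature change per drill are illustrative assumptions, not details taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)
```

Values near +1 or -1 indicate a strong linear relationship; the coefficient says nothing about non-linear dependence.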

Affiliation(s)

Ruifeng Zhao Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China Department of Stomatology, 960 Hospital of the Chinese People's Liberation Army, Jinan, Shandong, China
Rui Xie Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China
Nan Ren Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China
Zhiwen Li Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China
Shengrui Zhang Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China
Yuchen Liu Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China
Yu Dong Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China Department of Stomatology, Xi'an No.3 Hospital, the Affiliated Hospital of Northwest University, Xi'an, Shaanxi, China
An-An Yin Department of Plastic and Reconstructive Surgery, Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi, China
Yimin Zhao Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China
Shizhu Bai Digital Center, School of Stomatology, The Fourth Military Medical University, State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration & National Clinical Research Center for Oral Diseases & Shaanxi Key Laboratory of Stomatology, Xi'an, Shaanxi, China

Wang J, Zhou H, Wang Y, Xu M, Yu Y, Wang J, Liu Y. Prediction of submitochondrial proteins localization based on Gene Ontology. Comput Biol Med 2023;167:107589. [PMID: 37883850 DOI: 10.1016/j.compbiomed.2023.107589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 09/28/2023] [Accepted: 10/17/2023] [Indexed: 10/28/2023]

Kusuma WA, Fadli A, Fatriani R, Sofyantoro F, Yudha DS, Lischer K, Nuringtyas TR, Putri WA, Purwestri YA, Swasono RT. Prediction of the interaction between Calloselasma rhodostoma venom-derived peptides and cancer-associated hub proteins: A computational study. Heliyon 2023;9:e21149. [PMID: 37954374 PMCID: PMC10637925 DOI: 10.1016/j.heliyon.2023.e21149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 09/04/2023] [Accepted: 10/17/2023] [Indexed: 11/14/2023] Open

Zhu Y, Kong L, Han T, Yan Q, Liu J. Machine learning identification and immune infiltration of disulfidptosis-related Alzheimer's disease molecular subtypes. Immun Inflamm Dis 2023;11:e1037. [PMID: 37904698 PMCID: PMC10566450 DOI: 10.1002/iid3.1037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 09/08/2023] [Accepted: 09/09/2023] [Indexed: 11/01/2023] Open

Abstract

BACKGROUND

Alzheimer's disease (AD) is a common neurodegenerative disorder. Disulfidptosis is a newly discovered form of programmed cell death that holds promise as a therapeutic strategy for various disorders. However, the functional roles of disulfidptosis-related genes (DRGs) in AD remain unknown.

METHODS

Microarray data and clinical information from patients with AD and healthy controls were downloaded from the Gene Expression Omnibus database. A thorough examination of DRG expression and immune characteristics in both groups was performed. Based on the identified DRGs, we performed an unsupervised clustering analysis to categorize the AD samples into various disulfidptosis-related molecular clusters. Weighted gene co-expression network analysis was performed to select hub genes specific to disulfidptosis-related AD clusters. The performances of various machine learning models were compared to determine the optimal predictive model. The predictive ability of the optimal model was assessed using nomogram analysis and five external datasets.

RESULTS

Eight DRGs showed differential expression between the AD and control samples. Two different molecular clusters were identified. The immune cell infiltration analysis revealed distinct differences in the immune microenvironment of the two clusters. The support vector machine model showed the highest performance, and a panel of five signature genes was identified, which showed excellent performance on the external validation datasets. The nomogram analysis also showed high accuracy in predicting AD.

CONCLUSION

We identified disulfidptosis-related molecular clusters in AD and established a novel risk model to assess the likelihood of developing AD. These findings revealed a complex association between disulfidptosis and AD, which may aid in identifying potential therapeutic targets for this debilitating disorder.

Sui J, Chen J, Chen Y, Iwamori N, Sun J. Identification of plant vacuole proteins by using graph neural network and contact maps. BMC Bioinformatics 2023;24:357. [PMID: 37740195 PMCID: PMC10517492 DOI: 10.1186/s12859-023-05475-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 09/12/2023] [Indexed: 09/24/2023] Open

Zhang T, Jia J, Chen C, Zhang Y, Yu B. BiGRUD-SA: Protein S-sulfenylation sites prediction based on BiGRU and self-attention. Comput Biol Med 2023;163:107145. [PMID: 37336062 DOI: 10.1016/j.compbiomed.2023.107145] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Revised: 05/18/2023] [Accepted: 06/06/2023] [Indexed: 06/21/2023]

Yi F, Yang H, Chen D, Qin Y, Han H, Cui J, Bai W, Ma Y, Zhang R, Yu H. XGBoost-SHAP-based interpretable diagnostic framework for Alzheimer's disease. BMC Med Inform Decis Mak 2023;23:137. [PMID: 37491248 PMCID: PMC10369804 DOI: 10.1186/s12911-023-02238-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/09/2022] [Accepted: 07/13/2023] [Indexed: 07/27/2023] Open

Abstract

BACKGROUND

Due to the class imbalance issue faced when Alzheimer's disease (AD) develops from normal cognition (NC) to mild cognitive impairment (MCI), present clinical practice is met with challenges regarding the auxiliary diagnosis of AD using machine learning (ML). This leads to low diagnosis performance. We aimed to construct an interpretable framework, extreme gradient boosting-Shapley additive explanations (XGBoost-SHAP), to handle the imbalance among different AD progression statuses at the algorithmic level. We also sought to achieve multiclassification of NC, MCI, and AD.

METHODS

We obtained patient data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, including clinical information, neuropsychological test results, neuroimaging-derived biomarkers, and APOE-ε4 gene statuses. First, three feature selection algorithms were applied, and they were then included in the XGBoost algorithm. Due to the imbalance among the three classes, we changed the sample weight distribution to achieve multiclassification of NC, MCI, and AD. Then, the SHAP method was linked to XGBoost to form an interpretable framework. This framework utilized attribution ideas that quantified the impacts of model predictions into numerical values and analysed them based on their directions and sizes. Subsequently, the top 10 features (optimal subset) were used to simplify the clinical decision-making process, and their performance was compared with that of a random forest (RF), Bagging, AdaBoost, and a naive Bayes (NB) classifier. Finally, the National Alzheimer's Coordinating Center (NACC) dataset was employed to assess the impact path consistency of the features within the optimal subset.

RESULTS

Compared to the RF, Bagging, AdaBoost, NB and XGBoost (unweighted), the interpretable framework had higher classification performance with accuracy improvements of 0.74%, 0.74%, 1.46%, 13.18%, and 0.83%, respectively. The framework achieved high sensitivity (81.21%/74.85%), specificity (92.18%/89.86%), accuracy (87.57%/80.52%), area under the receiver operating characteristic curve (AUC) (0.91/0.88), positive clinical utility index (0.71/0.56), and negative clinical utility index (0.75/0.68) on the ADNI and NACC datasets, respectively. In the ADNI dataset, the top 10 features were found to have varying associations with the risk of AD onset based on their SHAP values. Specifically, the higher SHAP values of CDRSB, ADAS13, ADAS11, ventricle volume, ADASQ4, and FAQ were associated with higher risks of AD onset. Conversely, the higher SHAP values of LDELTOTAL, mPACCdigit, RAVLT_immediate, and MMSE were associated with lower risks of AD onset. Similar results were found for the NACC dataset.

CONCLUSIONS

The proposed interpretable framework contributes to achieving excellent performance in imbalanced AD multiclassification tasks and provides scientific guidance (optimal subset) for clinical decision-making, thereby facilitating disease management and offering new research ideas for optimizing AD prevention and treatment programs.
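The Methods describe changing the sample weight distribution to counter class imbalance but do not state the weighting rule. One common heuristic, shown here purely as an assumption, gives every sample the weight n_samples / (n_classes * class_count), so that each class contributes the same total weight. The function name `balanced_sample_weights` is hypothetical.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Per-sample weights inversely proportional to class frequency.

    Assumed rule (not confirmed by the abstract):
    weight = n_samples / (n_classes * count_of_that_class).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


# Made-up label distribution for illustration only.
labels = ["NC"] * 6 + ["MCI"] * 3 + ["AD"] * 1
weights = balanced_sample_weights(labels)
```

Weights of this kind are typically passed to a gradient boosting trainer as per-sample weights, so that errors on the rare class cost more during fitting.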

Zhou T, Ren Z, Ma Y, He L, Liu J, Tang J, Zhang H. Early identification of bloodstream infection in hemodialysis patients by machine learning. Heliyon 2023;9:e18263. [PMID: 37519767 PMCID: PMC10375788 DOI: 10.1016/j.heliyon.2023.e18263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 07/08/2023] [Accepted: 07/12/2023] [Indexed: 08/01/2023] Open

Chen ZH, Zhao BW, Li JQ, Guo ZH, You ZH. GraphCPIs: A novel graph-based computational model for potential compound-protein interactions. Mol Ther Nucleic Acids 2023;32:721-728. [PMID: 37251691 PMCID: PMC10209012 DOI: 10.1016/j.omtn.2023.04.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 04/28/2023] [Indexed: 05/31/2023]

Zhang M, Gao H, Liao X, Ning B, Gu H, Yu B. DBGRU-SE: predicting drug-drug interactions based on double BiGRU and squeeze-and-excitation attention mechanism. Brief Bioinform 2023:7176312. [PMID: 37225428 DOI: 10.1093/bib/bbad184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Revised: 04/03/2023] [Accepted: 04/23/2023] [Indexed: 05/26/2023] Open

Wang M, Yan L, Jia J, Lai J, Zhou H, Yu B. DE-MHAIPs: Identification of SARS-CoV-2 phosphorylation sites based on differential evolution multi-feature learning and multi-head attention mechanism. Comput Biol Med 2023;160:106935. [PMID: 37120990 PMCID: PMC10140648 DOI: 10.1016/j.compbiomed.2023.106935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 03/12/2023] [Accepted: 04/13/2023] [Indexed: 05/02/2023]

Yu Y, Ding P, Gao H, Liu G, Zhang F, Yu B. Cooperation of local features and global representations by a dual-branch network for transcription factor binding sites prediction. Brief Bioinform 2023;24:7030619. [PMID: 36748992 DOI: 10.1093/bib/bbad036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 01/03/2023] [Accepted: 01/18/2023] [Indexed: 02/08/2023] Open

Ullah M, Hadi F, Song J, Yu DJ. PScL-2LSAESM: bioimage-based prediction of protein subcellular localization by integrating heterogeneous features with the two-level SAE-SM and mean ensemble method. Bioinformatics 2023;39:6839969. [PMID: 36413068 PMCID: PMC9947927 DOI: 10.1093/bioinformatics/btac727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/02/2022] [Accepted: 11/21/2022] [Indexed: 11/23/2022] Open

Abstract

MOTIVATION

Over the past decades, a variety of in silico methods have been developed to predict protein subcellular localization within cells. However, a common and major challenge in the design and development of such methods is how to effectively utilize the heterogeneous feature sets extracted from bioimages. In this regard, limited efforts have been undertaken.

RESULTS

We propose a new two-level stacked autoencoder network (termed 2L-SAE-SM) to improve its performance by integrating the heterogeneous feature sets. In particular, in the first level of 2L-SAE-SM, each optimal heterogeneous feature set is fed to train our designed stacked autoencoder network (SAE-SM). All the trained SAE-SMs in the first level can output the decision sets based on their respective optimal heterogeneous feature sets, known as 'intermediate decision' sets. Such intermediate decision sets are then ensembled using the mean ensemble method to generate the 'intermediate feature' set for the second-level SAE-SM. Using the proposed framework, we further develop a novel predictor, referred to as PScL-2LSAESM, to characterize image-based protein subcellular localization. Extensive benchmarking experiments on the latest benchmark training and independent test datasets collected from the human protein atlas databank demonstrate the effectiveness of the proposed 2L-SAE-SM framework for the integration of heterogeneous feature sets. Moreover, performance comparison of the proposed PScL-2LSAESM with current state-of-the-art methods further illustrates that PScL-2LSAESM clearly outperforms the existing state-of-the-art methods for the task of protein subcellular localization.

AVAILABILITY AND IMPLEMENTATION

https://github.com/csbio-njust-edu/PScL-2LSAESM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
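The mean ensemble step described in the Results, in which the intermediate decision sets of the first-level networks are averaged into one intermediate feature set, can be sketched as an element-wise mean. This is a minimal illustration under the assumption that each decision set is a list of per-class scores of equal length; the function name `mean_ensemble` is hypothetical and the authors' repository should be consulted for the actual implementation.

```python
def mean_ensemble(decision_sets):
    """Element-wise mean of several per-class score vectors.

    decision_sets: list of lists, one score vector per first-level model,
    all with the same number of classes.
    """
    if not decision_sets:
        raise ValueError("at least one decision set is required")
    n_classes = len(decision_sets[0])
    if any(len(d) != n_classes for d in decision_sets):
        raise ValueError("all decision sets must have the same length")
    n_models = len(decision_sets)
    return [
        sum(d[c] for d in decision_sets) / n_models
        for c in range(n_classes)
    ]
```

The averaged vector would then serve as the input to the second-level network.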

Alabed SJ, Zihlif M, Taha M. Discovery of new potent lysine specific histone demythelase-1 inhibitors (LSD-1) using structure based and ligand based molecular modelling and machine learning. RSC Adv 2022;12:35873-35895. [PMID: 36545090 PMCID: PMC9751883 DOI: 10.1039/d2ra05102h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Accepted: 12/05/2022] [Indexed: 12/23/2022] Open

Accurate Prediction of Anti-hypertensive Peptides Based on Convolutional Neural Network and Gated Recurrent Unit. Interdiscip Sci 2022;14:879-894. [PMID: 35474167 DOI: 10.1007/s12539-022-00521-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/05/2021] [Revised: 03/30/2022] [Accepted: 04/06/2022] [Indexed: 12/30/2022]

Abstract

Hypertension (HT) is a common disease and one of the major causes of cardiovascular disease. High blood pressure can lead to other conditions, including impaired heart and kidney function, cerebral hemorrhage, and myocardial infarction. Because of the limitations of laboratory methods, identifying bioactive peptides for the treatment of HT takes a long time, so the identification of anti-hypertensive peptides (AHTPs) is of great immediate significance. With the prevalence of machine learning, it has been suggested as a supplementary method for AHTP classification. We therefore develop a new model to identify AHTPs based on multiple features and deep learning. The deep model combines a convolutional neural network (CNN) and a gated recurrent unit (GRU). The convolution structure is used to reduce the feature dimension and running time. The data processed by the CNN are fed into the recurrent GRU structure, and important information is filtered through the reset gate and update gate. The output layer uses a sigmoid activation function. For feature extraction, we use Kmer, the deviation between the dipeptide frequency and the expected mean (DDE), encoding based on grouped weight (EBGW), enhanced grouped amino acid composition (EGAAC), and dipeptide binary profile and frequency (DBPF). Kmer, DDE, EBGW, and EGAAC are widely used in protein research. DBPF is a new feature representation method that we designed: it maps dipeptides to binary numbers and yields a binary coding file and a frequency file. These features are then concatenated and fed into the proposed model for prediction and analysis. Under tenfold cross-validation, the model is competitive with previous methods, with accuracies of 96.23% and 99.10%, respectively, a substantial improvement over previous methods. This shows that combining convolution with a recurrent structure has a positive impact on AHTP classification. The results show that this method is a feasible, efficient, and competitive sequence analysis tool for AHTPs. We also provide a user-friendly online prediction tool, freely accessible at http://ahtps.zhanglab.site/.
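Of the feature encodings listed in the abstract, Kmer is the simplest to illustrate: it counts every length-k substring of a peptide and normalises the counts to frequencies. The sketch below assumes the 20 standard amino acids and k = 2; the function name `kmer_frequencies` and the choice to skip windows containing non-standard residues are illustrative assumptions, not the authors' implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_frequencies(sequence, k=2):
    """Return the frequency of every possible k-mer, in lexicographic order.

    The vector has 20**k entries; windows containing a residue outside
    AMINO_ACIDS are skipped (an assumption made for this sketch).
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Vectors produced this way have a fixed length regardless of peptide length, which is what allows them to be concatenated with other encodings before being fed to a network.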

Predicting suitable habitats of Melia azedarach L. in China using data mining. Sci Rep 2022;12:12617. [PMID: 35871227 PMCID: PMC9308798 DOI: 10.1038/s41598-022-16571-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 07/12/2022] [Indexed: 11/08/2022] Open

Wang H, Li H, Gao W, Xie J. PrUb-EL: A hybrid framework based on deep learning for identifying ubiquitination sites in Arabidopsis thaliana using ensemble learning strategy. Anal Biochem 2022;658:114935. [PMID: 36206844 DOI: 10.1016/j.ab.2022.114935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 09/25/2022] [Accepted: 09/26/2022] [Indexed: 12/30/2022]

Abstract

Identification of ubiquitination sites is central to many biological experiments. Ubiquitination is a kind of post-translational protein modification (PTM). It is a key mechanism for increasing protein diversity and plays a vital role in regulating cell function. In recent years, many models have been developed to predict ubiquitination sites in humans, mice and yeast. However, few studies have predicted ubiquitination sites in Arabidopsis thaliana. In view of this, a deep network model named PrUb-EL is proposed to predict ubiquitination sites in Arabidopsis thaliana. Firstly, six features based on the protein sequence are extracted with amino acid index database (AAindex), dipeptide deviates from the expected mean (DDE), dipeptide composition (DPC), blocks substitution matrix (BLOSUM62), enhanced amino acid composition (EAAC) and binary encoding. Secondly, the synthetic minority over-sampling technique (SMOTE) is utilized to process the imbalanced data set. Then a new classifier named DG is presented, which includes Dense block, Residual block and Gated recurrent unit (GRU) block. Finally, each of six feature extraction methods is integrated into the DG model, and the ensemble learning strategy is used to gain the final prediction result. Experimental results show that PrUb-EL has good predictive ability with the accuracy (ACC) and area under the ROC curve (auROC) values of 91.00% and 97.70% using 5-fold cross-validation, respectively. Note that the values of ACC and auROC are 88.58% and 96.09% in the independent test, respectively. Compared with previous studies, our model has significantly improved performance thus it is an excellent method for identifying ubiquitination sites in Arabidopsis thaliana. The datasets and code used for the article are available at https://github.com/Tom-Wangy/PreUb-EL.git.
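The synthetic minority over-sampling technique (SMOTE) named in the abstract creates new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below illustrates that core idea only. The function name `smote_like_oversample`, the default k, and the fixed seed are assumptions for illustration, and the code is not a substitute for a tested library implementation.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by neighbour interpolation.

    minority: list of feature vectors (lists of floats) from the rare class.
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

Each synthetic point lies on the line segment between two real minority samples, so the new data stay inside the region the minority class already occupies.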

Gao H, Chen C, Li S, Wang C, Zhou W, Yu B. Prediction of protein-protein interactions based on ensemble residual conventional neural network. Comput Biol Med 2022. [DOI: 10.1016/j.compbiomed.2022.106471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]

Wei Q, Zhang Q, Gao H, Song T, Salhi A, Yu B. DEEPStack-RBP: Accurate identification of RNA-binding proteins based on autoencoder feature selection and deep stacking ensemble classifier. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]

Lee J, Wanyan T, Chen Q, Keenan TDL, Glicksberg BS, Chew EY, Lu Z, Wang F, Peng Y. Predicting Age-related Macular Degeneration Progression with Longitudinal Fundus Images Using Deep Learning. Machine Learning in Medical Imaging. MLMI (Workshop) 2022;13583:11-20. [PMID: 36656604 PMCID: PMC9842432 DOI: 10.1007/978-3-031-21014-3_2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/23/2022]

Zhang Y, Zhang X, Razbek J, Li D, Xia W, Bao L, Mao H, Daken M, Cao M. Opening the black box: interpretable machine learning for predictor finding of metabolic syndrome. BMC Endocr Disord 2022;22:214. [PMID: 36028865 PMCID: PMC9419421 DOI: 10.1186/s12902-022-01121-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/22/2022] [Accepted: 07/31/2022] [Indexed: 11/10/2022] Open

Abstract

OBJECTIVE

The internal workings of machine learning algorithms are complex, and the algorithms are regarded as low-interpretability "black box" models, making it difficult for domain experts to understand and trust them. The study uses metabolic syndrome (MetS) as the entry point to analyze and evaluate the application value of model interpretability methods in dealing with predictive models that are difficult to interpret.

METHODS

The study collects data from a chain of health examination institutions in Urumqi from 2017 to 2019; 39,134 records remain after preprocessing such as deletion and filling. RFE is used for feature selection to reduce redundancy; MetS risk prediction models (logistic, random forest, XGBoost) are built on the selected feature subset, and accuracy, sensitivity, specificity, Youden index, and AUROC value are used to evaluate classification performance; post-hoc model-agnostic interpretation methods (variable importance, LIME) are used to interpret the results of the predictive model.

RESULTS

Eighteen physical examination indicators are screened out by RFE, which can effectively solve the problem of physical examination data redundancy. Random forest and XGBoost models have higher accuracy, sensitivity, specificity, Youden index, and AUROC values compared with logistic regression. XGBoost models have higher sensitivity, Youden index, and AUROC values compared with random forest. The study uses variable importance, LIME and PDP for global and local interpretation of the optimal MetS risk prediction model (XGBoost), and different interpretation methods have different insights into the interpretation of model results, which are more flexible in model selection and can visualize the process and reasons for the model to make decisions. The interpretable risk prediction model in this study can help to identify risk factors associated with MetS, and the results showed that in addition to the traditional risk factors such as overweight and obesity, hyperglycemia, hypertension, and dyslipidemia, MetS was also associated with other factors, including age, creatinine, uric acid, and alkaline phosphatase.

CONCLUSION

The model interpretability methods are applied to the black box model, which can not only realize the flexibility of model application, but also make up for the uninterpretable defects of the model. Model interpretability methods can be used as a novel means of identifying variables that are more likely to be good predictors.
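The evaluation metrics named in the Methods all derive from the four cells of a binary confusion matrix; the Youden index in particular is sensitivity plus specificity minus one. A minimal sketch follows, with a hypothetical function name `classification_metrics` and caller-supplied counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Youden index from raw counts.

    tp/fn: true positives and false negatives among actual positives.
    tn/fp: true negatives and false positives among actual negatives.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one negative")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        # Youden index J = sensitivity + specificity - 1, ranging from -1 to 1.
        "youden_index": sensitivity + specificity - 1.0,
    }
```

A Youden index of 0 corresponds to a classifier no better than chance, and 1 to perfect separation, which makes it a convenient single number when sensitivity and specificity trade off against each other.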

Ullah M, Hadi F, Song J, Yu DJ. PScL-DDCFPred: an ensemble deep learning-based approach for characterizing multiclass subcellular localization of human proteins from bioimage data. Bioinformatics 2022;38:4019-4026. [PMID: 35771606 PMCID: PMC9890309 DOI: 10.1093/bioinformatics/btac432] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/21/2022] [Revised: 06/03/2022] [Accepted: 06/28/2022] [Indexed: 02/04/2023] Open

Shi H, Zhang S, Li X. R5hmCFDV: computational identification of RNA 5-hydroxymethylcytosine based on deep feature fusion and deep voting. Brief Bioinform 2022;23:6658858. [PMID: 35945157 DOI: 10.1093/bib/bbac341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2022] [Revised: 07/17/2022] [Accepted: 07/25/2022] [Indexed: 11/13/2022] Open

Abstract

RNA 5-hydroxymethylcytosine (5hmC) is an RNA modification involved in the life activities of many organisms. Studying its distribution is important for revealing its biological function. High-throughput sequencing has previously been used to identify 5hmC, but it is expensive and inefficient, so machine learning is used to identify 5hmC sites instead. Here, we design a model called R5hmCFDV, which is divided into feature representation, feature fusion and classification. (i) Pseudo dinucleotide composition, dinucleotide binary profile and frequency, natural vector and physicochemical property are used to extract features from four aspects: nucleotide composition, coding, natural language and physical and chemical properties. (ii) To strengthen the relevance of features, we construct a novel feature fusion method. First, the attention mechanism is employed to process the four single features, which are then concatenated and fed to the convolution layer. The output is then processed by BiGRU and BiLSTM, respectively. Finally, the features of these two parts are fused by the multiply function. (iii) We design the deep voting algorithm for classification by imitating the soft voting mechanism in the Python package. The base classifiers are a deep neural network (DNN), a convolutional neural network (CNN) and an improved gated recurrent unit (GRU). Following the principle of soft voting, weights are assigned to the predicted probabilities of the three classifiers; the predicted probabilities are multiplied by the corresponding weights and then summed to obtain the final prediction. We use 10-fold cross-validation to evaluate the model, and the evaluation indicators are significantly improved. The prediction accuracy on the two datasets reaches 95.41% and 93.50%, respectively, demonstrating the competitiveness and generalization performance of our model. In addition, all datasets and source code can be found at https://github.com/HongyanShi026/R5hmCFDV.
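The weighted soft-voting step described in (iii) can be illustrated as follows. This is a sketch under stated assumptions, not the R5hmCFDV implementation: the function name `soft_vote`, the example probabilities, the weights, the normalization by the weight sum, and the 0.5 decision threshold are all hypothetical.

```python
def soft_vote(prob_lists, weights):
    """Fuse per-classifier positive-class probabilities by a weighted sum.

    prob_lists[k][i] is the probability that sample i is positive according
    to classifier k. Dividing by the weight sum is an assumption made here so
    that the result stays in [0, 1] even if the weights do not sum to 1.
    """
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    # Hypothetical outputs of three base classifiers for three samples.
    dnn = [0.9, 0.2, 0.6]
    cnn = [0.8, 0.4, 0.3]
    gru = [0.7, 0.1, 0.5]
    fused = soft_vote([dnn, cnn, gru], weights=[0.5, 0.3, 0.2])
    labels = [1 if p >= 0.5 else 0 for p in fused]  # assumed threshold
    print(fused, labels)
```

Compared with hard (majority) voting, soft voting lets a confident classifier outweigh two uncertain ones, which is the property the weighting scheme exploits.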

FRTpred: A novel approach for accurate prediction of protein folding rate and type. Comput Biol Med 2022;149:105911. [DOI: 10.1016/j.compbiomed.2022.105911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 07/08/2022] [Accepted: 07/23/2022] [Indexed: 11/20/2022]

Wanyan T, Lin M, Klang E, Menon KM, Gulamali FF, Azad A, Zhang Y, Ding Y, Wang Z, Wang F, Glicksberg B, Peng Y. Supervised Pretraining through Contrastive Categorical Positive Samplings to Improve COVID-19 Mortality Prediction. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2022;2022:9. [PMID: 35960866 PMCID: PMC9365529 DOI: 10.1145/3535508.3545541] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]

Ramón A, Torres AM, Milara J, Cascón J, Blasco P, Mateo J. eXtreme Gradient Boosting-based method to classify patients with COVID-19. J Investig Med 2022;70:jim-2021-002278. [PMID: 35850970 DOI: 10.1136/jim-2021-002278] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Accepted: 06/15/2022] [Indexed: 01/08/2023]

Pandi A, Diehl C, Yazdizadeh Kharrazi A, Scholz SA, Bobkova E, Faure L, Nattermann M, Adam D, Chapin N, Foroughijabbari Y, Moritz C, Paczia N, Cortina NS, Faulon JL, Erb TJ. A versatile active learning workflow for optimization of genetic and metabolic networks. Nat Commun 2022;13:3876. [PMID: 35790733 PMCID: PMC9256728 DOI: 10.1038/s41467-022-31245-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/12/2022] [Accepted: 06/10/2022] [Indexed: 11/13/2022] Open

Affiliation(s)

Amir Pandi Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany.
Christoph Diehl Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Ali Yazdizadeh Kharrazi DataChef, Amsterdam, The Netherlands
Scott A Scholz Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Elizaveta Bobkova Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Léon Faure Micalis Institute, INRAE, AgroParisTech, University of Paris-Saclay, Jouy-en-Josas, France
Maren Nattermann Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
David Adam Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Nils Chapin Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Yeganeh Foroughijabbari Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Charles Moritz Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Nicole Paczia Core Facility for Metabolomics and Small Molecule Mass Spectrometry, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
Niña Socorro Cortina Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany.,LiVeritas Biosciences, Inc., 432N Canal St.; Ste. 20, South San Francisco, CA, 94080, USA
Jean-Loup Faulon Micalis Institute, INRAE, AgroParisTech, University of Paris-Saclay, Jouy-en-Josas, France.,Genomique Metabolique, Genoscope, Institut Francois Jacob, CEA, CNRS, Univ Evry, University of Paris-Saclay, Evry, France.,Manchester Institute of Biotechnology, SYNBIOCHEM center, School of Chemistry, The University of Manchester, Manchester, UK
Tobias J Erb Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany.,SYNMIKRO Center of Synthetic Microbiology, Marburg, Germany.

Feng C, Wu J, Wei H, Xu L, Zou Q. CRCF: A Method of Identifying Secretory Proteins of Malaria Parasites. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:2149-2157. [PMID: 34061749 DOI: 10.1109/tcbb.2021.3085589] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]

Xia Y, Jiang M, Luo Y, Feng G, Jia G, Zhang H, Wang P, Ge R. SuccSPred2.0: A Two-Step Model to Predict Succinylation Sites Based on Multifeature Fusion and Selection Algorithm. J Comput Biol 2022;29:1085-1094. [PMID: 35714347 DOI: 10.1089/cmb.2022.0109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open

Yang Y, Shao A, Vihinen M. PON-All: Amino Acid Substitution Tolerance Predictor for All Organisms. Front Mol Biosci 2022;9:867572. [PMID: 35782867 PMCID: PMC9245922 DOI: 10.3389/fmolb.2022.867572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/01/2022] [Accepted: 05/02/2022] [Indexed: 01/08/2023] Open

Zhang L, Zhang J, Nie Q. DIRECT-NET: An efficient method to discover cis-regulatory elements and construct regulatory networks from single-cell multiomics data. SCIENCE ADVANCES 2022;8:eabl7393. [PMID: 35648859 PMCID: PMC9159696 DOI: 10.1126/sciadv.abl7393] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 05/13/2023]

Construction of Prediction Model of Renal Damage in Children with Henoch-Schönlein Purpura Based on Machine Learning. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022;2022:6991218. [PMID: 35651924 PMCID: PMC9150995 DOI: 10.1155/2022/6991218] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/15/2022] [Revised: 05/08/2022] [Accepted: 05/10/2022] [Indexed: 12/22/2022]

Nakai K, Wei L. Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics. FRONTIERS IN BIOINFORMATICS 2022;2:910531. [PMID: 36304291 PMCID: PMC9580943 DOI: 10.3389/fbinf.2022.910531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open

Feng X, Chen L. SCSilicon: a tool for synthetic single-cell DNA sequencing data generation. BMC Genomics 2022;23:359. [PMID: 35546390 PMCID: PMC9092674 DOI: 10.1186/s12864-022-08566-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Accepted: 04/19/2022] [Indexed: 11/25/2022] Open

Zhang C, Mou M, Zhou Y, Zhang W, Lian X, Shi S, Lu M, Sun H, Li F, Wang Y, Zeng Z, Li Z, Zhang B, Qiu Y, Zhu F, Gao J. Biological activities of drug inactive ingredients. Brief Bioinform 2022;23:6582006. [PMID: 35524477 DOI: 10.1093/bib/bbac160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/28/2022] [Revised: 04/01/2022] [Accepted: 04/09/2022] [Indexed: 02/06/2023] Open

Affiliation(s)

Chenyang Zhang College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Minjie Mou College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Ying Zhou College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China.,State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310000, China
Wei Zhang College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Xichen Lian College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Shuiyang Shi College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Mingkun Lu College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Huaicheng Sun College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Fengcheng Li College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Yunxia Wang College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
Zhenyu Zeng Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
Zhaorong Li Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
Bing Zhang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
Yunqing Qiu State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310000, China
Feng Zhu College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China.,Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
Jianqing Gao College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China.,Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China

Yu B, Zhang Y, Wang X, Gao H, Sun J, Gao X. Identification of DNA modification sites based on elastic net and bidirectional gated recurrent unit with convolutional neural network. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]

Sikander R, Ghulam A, Ali F. XGB-DrugPred: computational prediction of druggable proteins using eXtreme gradient boosting and optimized features set. Sci Rep 2022;12:5505. [PMID: 35365726 PMCID: PMC8976041 DOI: 10.1038/s41598-022-09484-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 10/17/2021] [Accepted: 03/07/2022] [Indexed: 11/19/2022] Open

Amilpur S, Bhukya R. A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction. J Bioinform Comput Biol 2022;20:2250005. [PMID: 35264081 DOI: 10.1142/s0219720022500056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]

Wang M, Song L, Zhang Y, Gao H, Yan L, Yu B. Malsite-Deep: Prediction of protein malonylation sites through deep learning and multi-information fusion based on NearMiss-2 strategy. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108191] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/04/2023]

Chen J, Guo C, Lu M, Ding S. Unifying Diagnosis Identification and Prediction Method Embedding the Disease Ontology Structure From Electronic Medical Records. Front Public Health 2022;9:793801. [PMID: 35127624 PMCID: PMC8811031 DOI: 10.3389/fpubh.2021.793801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open

Abstract

OBJECTIVE

The reasonable classification of a large number of distinct diagnosis codes can clarify patient diagnostic information and help clinicians to improve their ability to assign and target treatment for primary diseases. Our objective is to identify and predict a unifying diagnosis (UD) from electronic medical records (EMRs).

METHODS

We screened 4,418 sepsis patients from the public MIMIC-III database. Their diagnostic information was extracted for UD identification, and their demographic information, laboratory examination information, chief complaint, and history of present illness were extracted for UD prediction. We proposed a data-driven UD identification and prediction method (UDIPM) embedding the disease ontology structure. First, we designed a set similarity measure embedding the disease ontology structure to generate a patient similarity matrix. Second, we applied affinity propagation clustering to divide patients into clusters and extracted a typical diagnosis code co-occurrence pattern from each cluster. Third, we identified a UD by fusing visual analysis and a conditional co-occurrence matrix. Finally, we trained five classifiers in combination with feature fusion and feature selection methods for UD prediction.

RESULTS

The experimental results on a public electronic medical record dataset showed that the UDIPM could effectively extract a typical diagnosis code co-occurrence pattern, identify and predict a UD based on patients' diagnostic and admission information, and outperform other fusion methods overall.

CONCLUSIONS

The accurate identification and prediction of the UD from a large number of distinct diagnosis codes and multi-source heterogeneous patient admission information in EMRs can provide a data-driven approach to support better integration of diagnosis coding.

Nasiri H, Alavi SA. A Novel Framework Based on Deep Learning and ANOVA Feature Selection Method for Diagnosis of COVID-19 Cases from Chest X-Ray Images. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022;2022:4694567. [PMID: 35013680 PMCID: PMC8742147 DOI: 10.1155/2022/4694567] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 10/12/2021] [Accepted: 12/20/2021] [Indexed: 12/12/2022]

Abstract

Background and Objective. The new coronavirus disease (known as COVID-19) was first identified in Wuhan and quickly spread worldwide, wreaking havoc on the economy and people's everyday lives. As the number of COVID-19 cases is rapidly increasing, a reliable detection technique is needed to identify affected individuals, care for them in the early stages of COVID-19, and reduce the virus's transmission. The most accessible method for COVID-19 identification is Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR); however, it is time-consuming and can produce false-negative results. These limitations encouraged us to propose a novel framework based on deep learning that can aid radiologists in diagnosing COVID-19 cases from chest X-ray images. Methods. In this paper, a pretrained network, DenseNet169, was employed to extract features from X-ray images. Features were then chosen by a feature selection method, analysis of variance (ANOVA), to reduce computation and time complexity while overcoming the curse of dimensionality to improve accuracy. Finally, the selected features were classified by eXtreme Gradient Boosting (XGBoost). The ChestX-ray8 dataset was employed to train and evaluate the proposed method. Results and Conclusion. The proposed method reached 98.72% accuracy for two-class classification (COVID-19, No-findings) and 92% accuracy for multiclass classification (COVID-19, No-findings, and Pneumonia). The proposed method's precision, recall, and specificity on two-class classification were 99.21%, 93.33%, and 100%, respectively. The proposed method also achieved 94.07% precision, 88.46% recall, and 100% specificity for multiclass classification. The experimental results show that the proposed framework outperforms other methods and can be helpful for radiologists in the diagnosis of COVID-19 cases.
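The ANOVA feature selection step can be sketched as ranking features by their one-way ANOVA F-statistic across classes. This is an illustration only: it covers the selection step, not DenseNet169 feature extraction or XGBoost classification, and the names `anova_f` and `select_top_k` as well as the toy data are assumptions rather than the authors' code.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    # Variation of samples around their own class mean.
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf")  # classes are perfectly separated on this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(columns, labels, k):
    """Indices of the k feature columns with the largest F-statistic."""
    scores = [anova_f(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)[:k]


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    informative = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]   # class means differ clearly
    noisy = [1.0, 5.0, 9.0, 2.0, 6.0, 8.0]         # class means nearly equal
    print(select_top_k([informative, noisy], labels, k=1))
```

A feature with a large F-statistic has class means that differ strongly relative to its within-class spread, so keeping only the top-scoring features reduces dimensionality before classification.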

Wei C, Cao L, Zhou Y, Zhang W, Zhang P, Wang M, Xiong M, Deng C, Xiong Q, Liu W, He Q, Guo Y, Shao Z, Chen X, Chen Z. Multiple statistical models reveal specific volatile organic compounds affect sex hormones in American adult male: NHANES 2013-2016. Front Endocrinol (Lausanne) 2022;13:1076664. [PMID: 36714567 PMCID: PMC9877519 DOI: 10.3389/fendo.2022.1076664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 12/13/2022] [Indexed: 01/13/2023] Open

Affiliation(s)

Chengcheng Wei Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Li Cao Department of Orthopaedic, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Yuancheng Zhou Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Wenting Zhang Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Pu Zhang Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Miao Wang Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Ming Xiong Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Changqi Deng Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Qi Xiong Chongqing Medical University, Chongqing, China
Weihui Liu Department of Urology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
Qingliu He Department of Urology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
Yihong Guo Department of Urology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
Zengwu Shao Department of Orthopaedic, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
Xiaogang Chen Department of Urology, Huangshi Central Hospital, The Affiliated Hospital of Hubei Polytechnic University, Huangshi, China
Zhaohui Chen Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
*Correspondence: Zhaohui Chen; Xiaogang Chen; Zengwu Shao; Yihong Guo; Qingliu He

Herrera-Bravo J, Farías JG, Contreras FP, Herrera-Belén L, Norambuena JA, Beltrán JF. VirVACPRED: A Web Server for Prediction of Protective Viral Antigens. Int J Pept Res Ther 2021;28:35. [PMID: 34934411 PMCID: PMC8679566 DOI: 10.1007/s10989-021-10345-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 12/07/2021] [Indexed: 11/25/2022]

Guo Y, Wu C, Yuan Z, Wang Y, Liang Z, Wang Y, Zhang Y, Xu L. Gene-Based Testing of Interactions Using XGBoost in Genome-Wide Association Studies. Front Cell Dev Biol 2021;9:801113. [PMID: 34977040 PMCID: PMC8716787 DOI: 10.3389/fcell.2021.801113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 11/23/2021] [Indexed: 11/30/2022] Open

Liu Y, Jin S, Gao H, Wang X, Wang C, Zhou W, Yu B. Predicting the multi-label protein subcellular localization through multi-information fusion and MLSI dimensionality reduction based on MLFE classifier. Bioinformatics 2021;38:1223-1230. [PMID: 34864897 PMCID: PMC8690230 DOI: 10.1093/bioinformatics/btab811] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/19/2021] [Revised: 11/17/2021] [Accepted: 11/30/2021] [Indexed: 01/05/2023] Open

Abstract

MOTIVATION

Multi-label (ML) protein subcellular localization (SCL) is an indispensable way to study protein function. It can locate a certain protein (such as the human transmembrane protein that promotes the invasion of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)) or expression product at a specific location in a cell, which can provide a reference for clinical treatment of diseases such as coronavirus disease 2019 (COVID-19).

RESULTS

The article proposes a novel method named ML-locMLFE. First, six feature extraction methods are adopted to extract effective protein information: pseudo amino acid composition, encoding based on grouped weight, gene ontology, multi-scale continuous and discontinuous, residue probing transformation and evolutionary distance transformation. Next, we utilize the ML information latent semantic index method to avoid the interference of redundant information. Finally, ML learning with feature-induced labeling information enrichment is adopted to predict the ML protein SCL. The Gram-positive bacteria dataset is chosen as the training set, while the Gram-negative bacteria dataset, virus dataset, newPlant dataset and SARS-CoV-2 dataset serve as the test sets. The overall actual accuracies on the first four datasets are 99.23%, 93.82%, 93.24% and 96.72% by leave-one-out cross validation. The overall actual accuracy of our predictor on the SARS-CoV-2 dataset is 72.73%. The results indicate that the ML-locMLFE method has clear advantages in predicting the SCL of ML proteins and provides new ideas for further research on the SCL of ML proteins.

AVAILABILITY AND IMPLEMENTATION

The source codes and datasets are publicly available at https://github.com/QUST-AIBBDRC/ML-locMLFE/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Lv H, Zhang Y, Wang JS, Yuan SS, Sun ZJ, Dao FY, Guan ZX, Lin H, Deng KJ. iRice-MS: An integrated XGBoost model for detecting multitype post-translational modification sites in rice. Brief Bioinform 2021;23:6447435. [PMID: 34864888 DOI: 10.1093/bib/bbab486] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/12/2021] [Revised: 10/05/2021] [Accepted: 10/23/2021] [Indexed: 12/13/2022] Open

Feng X, Chen L, Qing Y, Li R, Li C, Li SC. SCYN: single cell CNV profiling method using dynamic programming. BMC Genomics 2021;22:651. [PMID: 34789142 PMCID: PMC8596905 DOI: 10.1186/s12864-021-07941-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 08/20/2021] [Indexed: 11/11/2022] Open

Zhang Y, Jiang Z, Chen C, Wei Q, Gu H, Yu B. DeepStack-DTIs: Predicting Drug-Target Interactions Using LightGBM Feature Selection and Deep-Stacked Ensemble Classifier. Interdiscip Sci 2021;14:311-330. [PMID: 34731411 DOI: 10.1007/s12539-021-00488-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/26/2021] [Revised: 10/19/2021] [Accepted: 10/21/2021] [Indexed: 12/12/2022]

Abstract

Accurate prediction of drug-target interactions (DTIs), which is often used in drug discovery and drug repositioning, is regarded as a key challenge in the study of drug science. In this paper, a new method called DeepStack-DTIs is proposed to predict DTIs. First, for the target protein, pseudo-position specific score matrix, pseudo amino acid composition and SPIDER3 are used to extract different feature information. Meanwhile, the path-based fingerprint features of each drug are extracted. Then, the synthetic minority oversampling technique (SMOTE) and light gradient boosting machine (LightGBM) are used for data balancing and feature selection, respectively. Finally, the processed features are input to the deep-stacked ensemble classifier composed of gated recurrent unit (GRU), deep neural network (DNN), support vector machine (SVM), eXtreme gradient boosting (XGBoost) and logistic regression (LR) to predict DTIs. Under five-fold cross-validation, the proposed method achieves higher prediction accuracy than existing methods on the gold standard dataset. To evaluate the predictive power of DeepStack-DTIs, we validate the method on another dataset and predict the drug-target interaction network. The results indicate that DeepStack-DTIs has better predictive ability than the other methods and provides novel insights for the prediction of DTIs. A novel method, DeepStack-DTIs, is proposed for drug-target interaction prediction. PsePSSM, PseAAC, SPIDER3 and FP2 are fused to convert protein sequence and drug molecule information into digital information, respectively. The SMOTE algorithm is used to balance the dataset, and the LightGBM feature selection algorithm is employed to remove redundant and irrelevant features and select the optimal feature subset. This optimal feature subset is input into the deep-stacked ensemble classifier to predict drug-target interactions. The experimental results show that the DeepStack-DTIs method can significantly improve the prediction accuracy of drug-target interactions.
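The data-balancing step can be illustrated with a bare-bones version of the SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This sketch is not the DeepStack-DTIs code; the function name `smote_sketch`, its parameters, and the toy points are hypothetical, and production work would normally rely on an established implementation.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation.

    minority: list of feature vectors (lists of floats), at least k + 1 of them.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by squared Euclidean distance.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to the neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical minority-class points in a two-dimensional feature space.
    minority = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.3], [1.1, 1.1]]
    for point in smote_sketch(minority, n_new=3):
        print(point)
```

Because every synthetic point lies between two real minority samples, oversampling this way fills in the minority region rather than duplicating existing rows.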

Jiang Y, Wang D, Wang W, Xu D. Computational methods for protein localization prediction. Comput Struct Biotechnol J 2021;19:5834-5844. [PMID: 34765098 PMCID: PMC8564054 DOI: 10.1016/j.csbj.2021.10.023] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 10/12/2021] [Accepted: 10/13/2021] [Indexed: 12/16/2022] Open