1
Singh S, Kaur N, Gehlot A. Application of artificial intelligence in drug design: A review. Comput Biol Med 2024; 179:108810. [PMID: 38991316 DOI: 10.1016/j.compbiomed.2024.108810] [Received: 03/18/2024] [Revised: 05/31/2024] [Accepted: 06/24/2024] [Indexed: 07/13/2024]
Abstract
Artificial intelligence (AI) is a field of computer science that involves acquiring information, developing rule bases, and mimicking human behaviour. The fundamental concept behind AI is to create intelligent computer systems that can operate with minimal or no human intervention. These rule-based systems are developed using various machine learning and deep learning models, enabling them to solve complex problems. AI is integrated with these models to learn, understand, and analyse provided data. The rapid advancement of AI is reshaping numerous industries, with the pharmaceutical sector experiencing a notable transformation. AI is increasingly being employed to automate, optimize, and personalize various facets of the pharmaceutical industry, particularly in pharmacological research. Traditional drug development methods are known for being time-consuming, expensive, and less efficient, often taking around a decade and costing billions of dollars. The integration of AI techniques addresses these challenges by enabling the examination of compounds with desired properties from a vast pool of input drugs. Furthermore, it plays a crucial role in drug screening by predicting toxicity, bioactivity, ADME properties (absorption, distribution, metabolism, and excretion), physicochemical properties, and more. AI enhances the drug design process by improving the efficiency and accuracy of predicting drug behaviour, interactions, and properties. These approaches further improve the precision of drug discovery processes and decrease clinical trial costs, leading to the development of more effective drugs.
Affiliation(s)
- Simrandeep Singh
- Department of Electronics & Communication Engineering, UCRD, Chandigarh University, Gharuan, Punjab, India.
- Navjot Kaur
- Department of Pharmacognosy, Amar Shaheed Baba Ajit Singh Jujhar Singh Memorial College of Pharmacy, Bela, Ropar, India
- Anita Gehlot
- Uttaranchal Institute of technology, Uttaranchal University, Dehradun, India
2
Zhao C, Su KJ, Wu C, Cao X, Sha Q, Li W, Luo Z, Qing T, Qiu C, Zhao LJ, Liu A, Jiang L, Zhang X, Shen H, Zhou W, Deng HW. Multi-scale variational autoencoder for imputation of missing values in untargeted metabolomics using whole-genome sequencing data. Comput Biol Med 2024; 179:108813. [PMID: 38955127 PMCID: PMC11324385 DOI: 10.1016/j.compbiomed.2024.108813] [Received: 03/13/2024] [Revised: 06/18/2024] [Accepted: 06/24/2024] [Indexed: 07/04/2024]
Abstract
BACKGROUND Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. METHOD In this study, we propose a novel method that leverages the information from WGS data and reference metabolites to impute unknown metabolites. Our approach utilizes a multi-scale variational autoencoder to jointly model the burden score, polygenic risk score (PGS), and linkage disequilibrium (LD) pruned single nucleotide polymorphisms (SNPs) for feature extraction and missing metabolomics data imputation. By learning the latent representations of both omics data, our method can effectively impute missing metabolomics values based on genomic information. RESULTS We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority compared to conventional imputation techniques. Using burden scores, PGS, and LD-pruned SNPs derived from 35 template metabolites, the proposed method achieved R² scores > 0.01 for 71.55% of metabolites. CONCLUSION The integration of WGS data in metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer valuable insights into the potential benefits of utilizing WGS data for metabolomics data imputation and underscore the importance of leveraging multi-modal data integration in precision medicine research.
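The reported metric (the share of metabolites whose R² exceeds a threshold) can be sketched as follows. This is only an illustration under assumptions: it takes R² to be the usual coefficient of determination, which the abstract does not confirm, and the function names and toy values are hypothetical rather than taken from the paper.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def fraction_above(observed, predicted, threshold=0.01):
    """Fraction of metabolites whose R^2 exceeds `threshold`.

    `observed` and `predicted` map a metabolite name to a list of values
    (held-out measurements and their imputations, respectively).
    """
    scores = {name: r2_score(observed[name], predicted[name]) for name in observed}
    passed = sum(1 for score in scores.values() if score > threshold)
    return passed / len(scores), scores


# Hypothetical toy data: one well-imputed metabolite, one imputed by its mean.
observed = {"met_a": [1.0, 2.0, 3.0], "met_b": [1.0, 2.0, 3.0]}
predicted = {"met_a": [1.0, 2.0, 3.0], "met_b": [2.0, 2.0, 2.0]}
share, per_metabolite = fraction_above(observed, predicted)
```

A low threshold such as 0.01 only asks that the imputation explain slightly more variance than predicting the mean, so the share should be read alongside the distribution of per-metabolite scores.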
Affiliation(s)
- Chen Zhao
- Department of Computer Science, Kennesaw State University, 680 Arntson Dr, Marietta, GA, 30060, USA
- Kuan-Jui Su
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Chong Wu
- Department of Biostatistics, University of Texas MD Anderson, Pickens Academic Tower, 1400 Pressler St., Houston, TX, 77030, USA
- Xuewei Cao
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI, 49931, USA
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Dr, Houghton, MI, 49931, USA
- Wu Li
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Zhe Luo
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Tian Qing
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Chuan Qiu
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Lan Juan Zhao
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Anqi Liu
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Lindong Jiang
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Xiao Zhang
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Hui Shen
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Weihua Zhou
- Department of Applied Computing, Michigan Technological University, 1400 Townsend Dr, Houghton, MI, 49931, USA; Center for Biocomputing and Digital Health, Institute of Computing and Cybersystems, and Health Research Institute, Michigan Technological University, Houghton, MI, 49931, USA.
- Hong-Wen Deng
- Division of Biomedical Informatics and Genomics, Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA, 70112, USA
3
Raheel A. Emotion analysis and recognition in 3D space using classifier-dependent feature selection in response to tactile enhanced audio-visual content using EEG. Comput Biol Med 2024; 179:108807. [PMID: 38970831 DOI: 10.1016/j.compbiomed.2024.108807] [Received: 04/01/2024] [Revised: 06/13/2024] [Accepted: 06/24/2024] [Indexed: 07/08/2024]
Abstract
Traditional media such as text, images, audio, and video primarily target specific senses like vision and hearing. In contrast, multiple sensorial media aims to create immersive experiences by integrating additional sensory modalities such as touch, smell, and taste where applicable. Tactile enhanced audio-visual content leverages the sense of touch in addition to visual and auditory stimuli, aiming to create a more immersive and engaging interaction for users. Previously, tactile enhanced content has been explored in 2D emotional space (valence and arousal). In this paper, EEG data recorded in response to tactile enhanced audio-visual content is labeled based on a self-assessment manikin scale in three dimensions, i.e., valence, arousal, and dominance. Statistical significance (with a 95% confidence interval) is also established based on gathered scores, highlighting a significant difference in the arousal and dominance dimensions between traditional media and tactile enhanced media. A new methodology is proposed using a classifier-dependent feature selection approach to classify valence, arousal, and dominance states using three different classifiers. The highest accuracies of 75%, 73.8%, and 75% are achieved for classifying valence, arousal, and dominance states, respectively. The proposed scheme outperforms previous emotion recognition studies in response to enhanced multimedia content in terms of accuracy, F-score, and other error parameters.
Affiliation(s)
- Aasim Raheel
- Department of Computer Engineering, University of Engineering and Technology Taxila, Pakistan.
4
Ma S, Li R, Li G, Wei M, Li B, Li Y, Ha C. Identification of a G-protein coupled receptor-related gene signature through bioinformatics analysis to construct a risk model for ovarian cancer prognosis. Comput Biol Med 2024; 178:108747. [PMID: 38897150 DOI: 10.1016/j.compbiomed.2024.108747] [Received: 12/19/2023] [Revised: 05/31/2024] [Accepted: 06/08/2024] [Indexed: 06/21/2024]
Abstract
BACKGROUND Ovarian cancer (OV) is a common malignant tumor of the female reproductive system with a 5-year survival rate of ∼30%. Inefficient early diagnosis and prognosis lead to poor survival in most patients. G protein-coupled receptors (GPCRs, the largest family of human cell surface receptors) are associated with OV. We aimed to identify GPCR-related gene (GPCRRG) signatures and develop a novel model to predict OV prognosis. METHOD We downloaded data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Prognostic GPCRRGs were screened using least absolute shrinkage and selection operator (LASSO) Cox regression analysis, and a prognostic model was constructed. The predictive ability of the model was evaluated by Kaplan-Meier (K-M) survival analysis. The levels of GPCRRGs were examined in normal and OV cell lines using quantitative reverse-transcription polymerase chain reaction. The immunological characteristics of the high- and low-risk groups were analyzed using single-sample gene set enrichment analysis (ssGSEA) and CIBERSORT. RESULTS Based on the risk scores, 17 GPCRRGs were associated with OV prognosis. CXCR4, GPR34, LGR6, LPAR3, and RGS2 were significantly expressed in three OV datasets and enabled accurate OV diagnosis. K-M analysis of the prognostic model showed that it could differentiate high- and low-risk patients, which correspond to poorer and better prognoses, respectively. GPCRRG expression was correlated with immune infiltration rates. CONCLUSIONS Our prognostic model elaborates on the roles of GPCRRGs in OV and provides a new tool for prognosis and immune response prediction in patients with OV.
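The Kaplan-Meier evaluation named in the abstract can be illustrated with a minimal product-limit estimator. This is a sketch under assumptions, not the authors' pipeline: the function name `kaplan_meier` and the toy follow-up data are hypothetical, and in practice the curve would be computed separately for the high- and low-risk groups and compared with a log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is why the estimator is preferred over a naive proportion of survivors.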
Affiliation(s)
- Shaohan Ma
- Clinical Medical College, Ningxia Medical University, Yinchuan, Ningxia, China
- Ruyue Li
- Gynecology Department, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China
- Guangqi Li
- Medical Laboratory Center, General Hospital of Ningxia Medical University, Yinchuan, China
- Meng Wei
- Gynecology Department, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China
- Bowei Li
- Clinical Medical College, Ningxia Medical University, Yinchuan, Ningxia, China
- Yongmei Li
- Gynecology Department, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China
- Chunfang Ha
- Gynecology Department, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China; Key Laboratory of Fertility Preservation & Maintenance of Ministry of Education, Ningxia Medical University, Yinchuan, Ningxia, 750000, China.
5
Biswas B, Kumar N, Sugimoto M, Hoque MA. scHD4E: Novel ensemble learning-based differential expression analysis method for single-cell RNA-sequencing data. Comput Biol Med 2024; 178:108769. [PMID: 38897145 DOI: 10.1016/j.compbiomed.2024.108769] [Received: 04/01/2024] [Revised: 05/14/2024] [Accepted: 06/15/2024] [Indexed: 06/21/2024]
Abstract
Differential expression (DE) analysis between cell types for scRNA-seq data, capturing its complicated features, is crucial. Recently, different methods have been developed for scRNA-seq data analysis based on different modeling frameworks, assumptions, strategies, and test statistics, each considering various data features. scDEA is a recently developed ensemble learning-based DE analysis method that uses Lancaster's combination to combine the p-values generated by 12 individual DE analysis methods, producing more accurate and stable results than the individual methods. The objective of our study is to propose a new ensemble learning-based DE analysis method, scHD4E, using only the 4 top-performing individual methods. These 4 methods were selected through an evaluation process using six real scRNA-seq data sets. We conducted comprehensive experiments on five experimental data sets to evaluate our proposed method based on sample size effects, batch effects, type I error control, gene ontology enrichment analysis, runtime, identified matched DE genes, and semantic similarity measurement between methods. We also perform similar analyses (except the last 3 terms) and compute performance measures such as accuracy, F1 score, and Matthews correlation coefficient for a simulated data set. The results show that scHD4E performs better than all the individual methods and scDEA in all the above perspectives. We expect that scHD4E will serve modern data scientists in detecting DEGs in scRNA-seq data analysis. To implement our proposed method, a GitHub R package, scHD4E, and its Shiny application have been developed and are available at the following links: https://github.com/bbiswas1989/scHD4E and https://github.com/bbiswas1989/scHD4E-Shiny.
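Lancaster's combination, which the abstract attributes to scDEA, generalizes Fisher's method by giving each p-value its own chi-square degrees of freedom as a weight. The sketch below is an illustration only: the weights are hypothetical, the helper names are not from either package, and the restriction to even integer weights is a simplification that keeps the chi-square tail in closed form so that no statistics library is needed. How scHD4E itself combines its four methods is not stated in the abstract.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with even df (closed form)."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer in this sketch")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_isf_even(p, df):
    """Inverse of chi2_sf_even, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lancaster_combine(p_values, weights):
    """Combine p-values; weight w_i is the chi-square df assigned to p_i."""
    statistic = sum(chi2_isf_even(p, w) for p, w in zip(p_values, weights))
    return chi2_sf_even(statistic, sum(weights))


# Hypothetical p-values for one gene from three DE methods, with assumed weights.
combined = lancaster_combine([0.01, 0.2, 0.5], [4, 2, 2])
```

With every weight set to 2 the statistic reduces to Fisher's -2 Σ ln p, which gives a convenient sanity check for any implementation.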
Affiliation(s)
- Biplab Biswas
- Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science & Technology University, Gopalganj, 8100, Bangladesh; Department of Statistics, Faculty of Science, University of Rajshahi, Rajshahi, 6205, Bangladesh.
- Nishith Kumar
- Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science & Technology University, Gopalganj, 8100, Bangladesh.
- Masahiro Sugimoto
- Institute for Advanced Biosciences, Keio University 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata, 997-0052, Japan.
- Md Aminul Hoque
- Department of Statistics, Faculty of Science, University of Rajshahi, Rajshahi, 6205, Bangladesh.
6
Islam MA, Majumder MZH, Miah MS, Jannaty S. Precision healthcare: A deep dive into machine learning algorithms and feature selection strategies for accurate heart disease prediction. Comput Biol Med 2024; 176:108432. [PMID: 38744014 DOI: 10.1016/j.compbiomed.2024.108432] [Received: 02/08/2024] [Revised: 04/06/2024] [Accepted: 04/07/2024] [Indexed: 05/16/2024]
Abstract
This paper presents a comprehensive exploration of machine learning algorithms (MLAs) and feature selection techniques for accurate heart disease prediction (HDP) in modern healthcare. By focusing on diverse datasets encompassing various challenges, the research sheds light on optimal strategies for early detection. MLAs such as Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), Gaussian Naive Bayes (NB), and others were studied, with precision and recall metrics emphasized for robust predictions. Our study addresses challenges in real-world data through data cleaning and one-hot encoding, enhancing the integrity of our predictive models. Feature extraction techniques, namely Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and univariate feature selection, play a crucial role in identifying relevant features and reducing data dimensionality. Our findings showcase the impact of these techniques on improving prediction accuracy. Optimized models for each dataset have been achieved through grid search hyperparameter tuning, with configurations meticulously outlined. Notably, a remarkable 99.12% accuracy was achieved on the first Kaggle dataset, showcasing the potential for accurate HDP. Model robustness across diverse datasets was highlighted, with caution against overfitting. The study emphasizes the need for validation on unseen data and encourages ongoing research for generalizability. Serving as a practical guide, this research aids researchers and practitioners in HDP model development, influencing clinical decisions and healthcare resource allocation. By providing insights into effective algorithms and techniques, the paper contributes to reducing heart disease-related morbidity and mortality, supporting the healthcare community's ongoing efforts.
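The emphasis on precision and recall alongside accuracy can be illustrated with a short sketch. The labels below are hypothetical and the helper names are not from the paper; the point is only that a high accuracy figure can coexist with noticeably lower precision or recall when the classes are imbalanced.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class labelled `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical held-out labels: 1 = heart disease, 0 = no heart disease.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred)
```

Computing these metrics on data that played no part in feature selection or hyperparameter tuning is what the abstract's caution about overfitting amounts to in practice.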
Affiliation(s)
- Md Ariful Islam
- Department of Robotics and Mechatronics Engineering, University of Dhaka, Dhaka, 1000, Bangladesh.
- Md Sohel Miah
- Department of Computer Science and Technology, Moulvibazar Polytechnic Institute, Bangladesh
- Sumaia Jannaty
- Gonoshasthaya Samaj Vittik Medical College, Savar, Dhaka, Bangladesh
7
Ameli A, Peña-Castillo L, Usefi H. Assessing the reproducibility of machine-learning-based biomarker discovery in Parkinson's disease. Comput Biol Med 2024; 174:108407. [PMID: 38603902 DOI: 10.1016/j.compbiomed.2024.108407] [Received: 09/21/2023] [Revised: 03/21/2024] [Accepted: 04/01/2024] [Indexed: 04/13/2024]
Abstract
Feature selection and machine learning algorithms can be used to analyze single nucleotide polymorphism (SNP) data and identify potential disease biomarkers. Reproducibility of identified biomarkers is critical for them to be useful for clinical research; however, genotyping platforms and selection criteria for individuals to be genotyped affect the reproducibility of identified biomarkers. To assess biomarker reproducibility, we collected five SNP datasets from the database of Genotypes and Phenotypes (dbGaP) and explored several data integration strategies. While combining datasets can lead to a reduction in classification accuracy, it has the potential to improve the reproducibility of potential biomarkers. We evaluated the agreement among different strategies in terms of the SNPs that were identified as potential Parkinson's disease (PD) biomarkers. Our findings indicate that, on average, 93% of the SNPs identified in a single dataset fail to be identified in other datasets. However, through dataset integration, this lack of replication is reduced to 62%. We discovered fifty SNPs that were identified at least twice, which could potentially serve as novel PD biomarkers. These SNPs are indirectly linked to PD in the literature but have not been directly associated with PD before. These findings open up new potential avenues of investigation.
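One way to read the replication figures is as set overlaps between the SNP lists selected from each dataset. The sketch below is an interpretation, not the authors' exact procedure: it assumes the non-replication rate is the average, over datasets, of the fraction of a dataset's selected SNPs that appear in no other dataset. The dataset names and SNP identifiers are hypothetical.

```python
from collections import Counter


def non_replication_rate(selected):
    """Average fraction of each dataset's SNPs that no other dataset selected.

    `selected` maps a dataset name to the set of SNP ids chosen from it.
    """
    rates = []
    for name, snps in selected.items():
        if not snps:
            continue
        others = set().union(*(s for other, s in selected.items() if other != name))
        rates.append(len(snps - others) / len(snps))
    if not rates:
        raise ValueError("no non-empty SNP sets to compare")
    return sum(rates) / len(rates)


def replicated_snps(selected, min_count=2):
    """SNPs selected in at least `min_count` datasets."""
    counts = Counter(snp for snps in selected.values() for snp in snps)
    return {snp for snp, n in counts.items() if n >= min_count}


# Hypothetical selections from three datasets.
selected = {
    "cohort_a": {"rs1", "rs2", "rs3", "rs4"},
    "cohort_b": {"rs4", "rs5"},
    "cohort_c": {"rs6"},
}
rate = non_replication_rate(selected)
shared = replicated_snps(selected)
```

The second helper corresponds to the abstract's "identified at least twice" criterion for candidate biomarkers.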
Affiliation(s)
- Ali Ameli
- Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada
- Lourdes Peña-Castillo
- Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada; Department of Biology, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada.
- Hamid Usefi
- Department of Computer Science, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada; Department of Mathematics and Statistics, Memorial University of Newfoundland, 230 Elizabeth Ave, St. John's, A1C5S7, NL, Canada.
8
Fu Y, Wang C, Wu Z, Zhang X, Liu Y, Wang X, Liu F, Chen Y, Zhang Y, Zhao H, Wang Q. Discovery of the potential biomarkers for early diagnosis of endometrial cancer via integrating metabolomics and transcriptomics. Comput Biol Med 2024; 173:108327. [PMID: 38552279 DOI: 10.1016/j.compbiomed.2024.108327] [Received: 02/06/2024] [Revised: 03/07/2024] [Accepted: 03/17/2024] [Indexed: 04/17/2024]
Abstract
Endometrial cancer (EC) is one of the most common malignant tumors in women, and its increasing incidence and mortality pose a serious threat to public health. Early diagnosis of EC could prolong the survival period and optimize survivorship, greatly alleviating patients' suffering and social medical pressure. In this study, we collected urine and serum samples from the recruited patients, analyzed the samples using an LC-MS approach, and identified the differential metabolites through metabolomic analysis. Then, the differentially expressed genes were identified through systematic transcriptomic analysis of an EC-related dataset from Gene Expression Omnibus (GEO), followed by network profiling of metabolic-reaction-enzyme-gene. In this experiment, a total of 83 differential metabolites and 19 hub genes were discovered, of which 10 differential metabolites and 3 hub genes were further evaluated as more promising potential biomarkers based on network analysis. According to the KEGG enrichment analysis, the potential biomarkers and gene-encoded proteins were found to be involved in arginine and proline metabolism, histidine metabolism, and pyrimidine metabolism, which is of significance for the early diagnosis of EC. In particular, the combination of metabolites (histamine, 1-methylhistamine, and methylimidazole acetaldehyde) as well as the combination of RRM2, TYMS and TK1 discriminated more accurately between EC and healthy groups, providing more criteria for the early diagnosis of EC.
Affiliation(s)
- Yan Fu
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China; Core Facilities and Centers, Hebei Medical University, Shijiazhuang, 050017, China
- Chengzhao Wang
- College of Basic Medicine, Hebei Medical University, Shijiazhuang, 050017, China
- Zhimin Wu
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China
- Xiaoguang Zhang
- Core Facilities and Centers, Hebei Medical University, Shijiazhuang, 050017, China; College of Basic Medicine, Hebei Medical University, Shijiazhuang, 050017, China
- Yan Liu
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China
- Xu Wang
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China
- Fangfang Liu
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China
- Yujuan Chen
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China
- Yang Zhang
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China.
- Huanhuan Zhao
- Department of Obstetrics and Gynecology, The Fourth Hospital of Hebei Medical University, 050011, China.
- Qiao Wang
- School of Pharmacy, Hebei Medical University, Shijiazhuang, 050017, China.
9
Chen J, Wen B. Bi-level gene selection of cancer by combining clustering and sparse learning. Comput Biol Med 2024; 172:108236. [PMID: 38471351 DOI: 10.1016/j.compbiomed.2024.108236] [Received: 08/03/2023] [Revised: 02/07/2024] [Accepted: 02/25/2024] [Indexed: 03/14/2024]
Abstract
The diagnosis of cancer based on gene expression profile data has attracted extensive attention in the field of biomedical science. This type of data is usually high-dimensional and noisy. In this paper, a hybrid gene selection method based on clustering and sparse learning is proposed to choose the key genes with high precision. We first propose a filter method, which combines the k-means clustering algorithm and the signal-to-noise ratio ranking method; a weighted gene co-expression network is then applied to the reduced data set to identify modules corresponding to biological pathways. Moreover, we choose the key genes by using group bridge and sparse group lasso as wrapper methods. Finally, we conduct numerical experiments on six cancer datasets. The numerical results show that our proposed method achieves good performance in gene selection and cancer classification.
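The signal-to-noise ratio ranking used in the filter step can be sketched as below. This assumes the commonly used definition |mean_1 - mean_0| / (sd_1 + sd_0) for a two-class problem, which the abstract does not spell out; the k-means step that the paper combines with it is omitted, and the gene names and expression values are hypothetical.

```python
import statistics


def snr_scores(expression, labels):
    """Signal-to-noise ratio per gene for a two-class problem.

    expression : dict mapping gene name -> list of expression values (one per sample)
    labels     : list of 0/1 class labels, aligned with the samples
    Each class needs at least two samples so a standard deviation exists.
    """
    scores = {}
    for gene, values in expression.items():
        class0 = [v for v, lab in zip(values, labels) if lab == 0]
        class1 = [v for v, lab in zip(values, labels) if lab == 1]
        spread = statistics.stdev(class0) + statistics.stdev(class1)
        gap = abs(statistics.mean(class1) - statistics.mean(class0))
        scores[gene] = gap / spread if spread else 0.0
    return scores


def top_genes(expression, labels, k):
    """Names of the k genes with the highest SNR."""
    scores = snr_scores(expression, labels)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical expression values for two genes across six samples.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    "gene_b": [1.0, 5.0, 9.0, 2.0, 5.0, 8.0],
}
labels = [0, 0, 0, 1, 1, 1]
scores = snr_scores(expression, labels)
best = top_genes(expression, labels, 1)
```

A filter of this kind is cheap to compute per gene, which is why it is typically applied before the more expensive wrapper methods.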
Affiliation(s)
- Junnan Chen
- School of Science, Hebei University of Technology, Tianjin, PR China.
- Bo Wen
- Institute of Mathematics, Hebei University of Technology, Tianjin, PR China.
10
Zhou Y, Chen Z, Yang M, Chen F, Yin J, Zhang Y, Zhou X, Sun X, Ni Z, Chen L, Lv Q, Zhu F, Liu S. FERREG: ferroptosis-based regulation of disease occurrence, progression and therapeutic response. Brief Bioinform 2024; 25:bbae223. [PMID: 38742521 PMCID: PMC11091744 DOI: 10.1093/bib/bbae223] [Received: 10/17/2023] [Revised: 03/25/2024] [Accepted: 04/21/2024] [Indexed: 05/16/2024] Open
Abstract
Ferroptosis is a non-apoptotic, iron-dependent regulatory form of cell death characterized by the accumulation of intracellular reactive oxygen species. In recent years, a large and growing body of literature has investigated ferroptosis. Since ferroptosis is associated with various physiological activities and regulated by a variety of cellular metabolism and mitochondrial activity, it has been closely related to the occurrence and development of many diseases, including cancer, aging, neurodegenerative diseases, ischemia-reperfusion injury and other pathological cell death. The regulation of ferroptosis mainly focuses on three pathways: the system Xc-/GPX4 axis, lipid peroxidation and iron metabolism. The genes involved in these processes are divided into drivers, suppressors and markers. Importantly, small molecules or drugs that mediate the expression of these genes are often good treatments in the clinic. Herein, a newly developed database, named 'FERREG', is documented to (i) provide the data of ferroptosis-related regulation of disease occurrence, progression and drug response; (ii) explicitly describe the molecular mechanisms underlying each regulation; and (iii) fully reference the collected data by cross-linking them to available databases. Collectively, FERREG contains 51 targets, 718 regulators, 445 ferroptosis-related drugs and 158 ferroptosis-related disease responses. FERREG can be accessed at https://idrblab.org/ferreg/.
Collapse
Affiliation(s)
- Yuan Zhou
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, and Department of Respiratory Medicine of Affiliated Hospital, Hangzhou Normal University, Hangzhou, 311121, China
| | - Zhen Chen
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
| | - Mengjie Yang
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, and Department of Respiratory Medicine of Affiliated Hospital, Hangzhou Normal University, Hangzhou, 311121, China
| | - Fengyun Chen
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, and Department of Respiratory Medicine of Affiliated Hospital, Hangzhou Normal University, Hangzhou, 311121, China
| | - Jiayi Yin
- Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine
| | - Yintao Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
| | - Xuheng Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
| | - Xiuna Sun
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
| | - Ziheng Ni
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, and Department of Respiratory Medicine of Affiliated Hospital, Hangzhou Normal University, Hangzhou, 311121, China
| | - Lu Chen
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, and Department of Respiratory Medicine of Affiliated Hospital, Hangzhou Normal University, Hangzhou, 311121, China
| | - Qun Lv
- Department of Respiratory, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, 311121, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
| | - Shuiping Liu
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, and Department of Respiratory Medicine of Affiliated Hospital, Hangzhou Normal University, Hangzhou, 311121, China
|
11
|
Chen Y, Lu P, Wu S, Yang J, Liu W, Zhang Z, Xu Q. CD163-Mediated Small-Vessel Injury in Alzheimer's Disease: An Exploration from Neuroimaging to Transcriptomics. Int J Mol Sci 2024; 25:2293. [PMID: 38396970 PMCID: PMC10888773 DOI: 10.3390/ijms25042293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 02/10/2024] [Accepted: 02/11/2024] [Indexed: 02/25/2024] Open
Abstract
Patients with Alzheimer's disease (AD) often present with imaging features indicative of small-vessel injury, among which white-matter hyperintensities (WMHs) are the most prevalent. However, the mechanism underlying the association between AD and small-vessel injury is still obscure. The aim of this study is to investigate the mechanism of small-vessel injury in AD. Differential gene expression analyses were conducted to identify the genes related to WMHs separately in mild cognitive impairment (MCI) and cognitively normal (CN) subjects from the ADNI database. The WMH-related genes identified in patients with MCI were considered to be associated with small-vessel injury in early AD. Functional enrichment analyses were performed and a protein-protein interaction (PPI) network was constructed to explore the pathways and hub genes related to the mechanism of small-vessel injury in MCI. Subsequently, the Boruta algorithm and the support vector machine recursive feature elimination (SVM-RFE) algorithm were applied to identify feature-selection genes. Finally, the mechanism of small-vessel injury in MCI was analyzed from an immunological perspective; the relationships of the feature-selection genes with various immune cells and neuroimaging indices were also explored. Furthermore, 5×FAD mice were used to verify the genes related to small-vessel injury. The results of the logistic regression analyses suggested that WMHs significantly contributed to MCI, the early stage of AD. A total of 276 genes were determined as WMH-related genes in patients with MCI, while 203 WMH-related genes were obtained in CN subjects. Among them, only 15 genes overlapped and were thus identified as the crosstalk genes. By employing the Boruta and SVM-RFE algorithms, CD163, ALDH3B1, MIR22HG, DTX2, FOLR2, ALDH2, and ZNF23 were recognized as the feature-selection genes linked to small-vessel injury in MCI. After considering the results from the PPI network, CD163 was finally determined as the critical WMH-related gene in MCI. The expression of CD163 was correlated with fractional anisotropy (FA) values in regions that are vulnerable to small-vessel injury in AD. The immunostaining and RT-qPCR results from the verification experiments demonstrated that indicators of small-vessel injury were present in the cortical tissue of 5×FAD mice and were related to the upregulation of CD163 expression. CD163 may be the most pivotal candidate related to small-vessel injury in early AD.
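The SVM-RFE step named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn is available, uses synthetic data in place of the ADNI expression matrix, and the helper name `select_features_svm_rfe` is hypothetical. The choice of seven retained features only mirrors the number of genes listed in the abstract, and the Boruta step is omitted because it requires a separate third-party package.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC


def select_features_svm_rfe(X, y, n_keep=7):
    """Return the column indices kept by SVM-RFE (hypothetical helper)."""
    # A linear kernel is needed so that RFE can rank features by |coef_|.
    estimator = SVC(kernel="linear", C=1.0)
    # step=1 drops the single lowest-ranked feature at every iteration.
    selector = RFE(estimator, n_features_to_select=n_keep, step=1)
    selector.fit(X, y)
    return np.flatnonzero(selector.support_)


# Synthetic stand-in for a samples-by-genes expression matrix (assumption).
X, y = make_classification(
    n_samples=120,
    n_features=40,
    n_informative=7,
    n_redundant=0,
    random_state=0,
)
selected = select_features_svm_rfe(X, y, n_keep=7)
print("Selected feature indices:", selected.tolist())
```

In practice the selected indices would be mapped back to gene identifiers, and the result would typically be intersected with the output of a second selector, as the abstract describes for Boruta and SVM-RFE.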
Affiliation(s)
- Yuewei Chen
- Health Management Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Department of Neurology, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Renji-UNSW CHeBA (Centre for Healthy Brain Ageing of University of New South Wales) Neurocognitive Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
| | - Peiwen Lu
- Health Management Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Renji-UNSW CHeBA (Centre for Healthy Brain Ageing of University of New South Wales) Neurocognitive Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
| | - Shengju Wu
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
| | - Jie Yang
- Health Management Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Department of Neurology, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Renji-UNSW CHeBA (Centre for Healthy Brain Ageing of University of New South Wales) Neurocognitive Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
| | - Wanwan Liu
- Health Management Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
| | - Zhijun Zhang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
| | - Qun Xu
- Health Management Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Department of Neurology, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
- Renji-UNSW CHeBA (Centre for Healthy Brain Ageing of University of New South Wales) Neurocognitive Center, Renji Hospital of Medical School, Shanghai Jiao Tong University, Shanghai 200127, China
|
12
|
Gao Y, Zhang G, Jiang S, Liu Y. Wekemo Bioincloud: A user-friendly platform for meta-omics data analyses. IMETA 2024; 3:e175. [PMID: 38868508 PMCID: PMC10989175 DOI: 10.1002/imt2.175] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/11/2023] [Revised: 01/23/2024] [Accepted: 01/23/2024] [Indexed: 06/14/2024]
Abstract
The increasing application of meta-omics approaches to investigate the structure, function, and intercellular interactions of microbial communities has led to a surge in available data. However, this abundance of human and environmental microbiome data has exposed new scalability challenges for existing bioinformatics tools. In response, we introduce Wekemo Bioincloud, a specialized platform for -omics studies. This platform offers a comprehensive analysis solution, specifically designed to ease tool selection for users facing expanding data sets. To date, Wekemo Bioincloud has been equipped with 22 workflows and 65 visualization tools, establishing itself as a user-friendly and widely adopted platform for studying diverse data sets. Additionally, the platform enables the online modification of vector outputs, and the registration-independent personalized dashboard system ensures privacy and traceability. Wekemo Bioincloud is freely available at https://www.bioincloud.tech/.
Affiliation(s)
- Yunyun Gao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Guoxing Zhang
- Shenzhen Wekemo Technology Group Co., Ltd., Shenzhen, China
| | - Shunyao Jiang
- Shenzhen Wekemo Technology Group Co., Ltd., Shenzhen, China
| | - Yong‐Xin Liu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
|
13
|
Yang Q, Chen S, Jiang W, Mi L, Liu J, Hu Y, Ji X, Wang J, Zhu F. MultiClassMetabo: A Superior Classification Model Constructed Using Metabolic Markers in Multiclass Metabolomics. Anal Chem 2024; 96:1410-1418. [PMID: 38221713 DOI: 10.1021/acs.analchem.3c03212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Multiclass metabolomics has become a popular technique for revealing the mechanisms underlying certain physiological processes, different tumor types, or different therapeutic responses. In multiclass metabolomics, it is highly important to uncover the underlying biological information of biosamples by identifying the metabolic markers with the strongest associations and classifying the different sample classes. The multiclass classification problem in metabolomics is more difficult than the binary one. To date, various methods exist for constructing classification models and identifying metabolic markers, including both well-established techniques and newly emerging machine learning algorithms. However, how to construct a superior classification model using these methods remains unclear for a given multiclass metabolomic data set. Herein, MultiClassMetabo has been developed for constructing a superior classification model using metabolic markers identified in multiclass metabolomics. MultiClassMetabo provides online services, including (a) identifying metabolic markers by marker identification methods, (b) constructing classification models by classification methods, and (c) performing a comprehensive assessment from multiple perspectives to construct a superior classification model for multiclass metabolomics. In summary, MultiClassMetabo is distinguished by its capability to construct a superior classification model using the most appropriate method through a comprehensive assessment, which makes it an important complement to other available tools in multiclass metabolomics. MultiClassMetabo can be accessed at http://idrblab.cn/multiclassmetabo/.
Affiliation(s)
- Qingxia Yang
- Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Shuman Chen
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Wenyu Jiang
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Lan Mi
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Jiarui Liu
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Yu Hu
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Xinglai Ji
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Jun Wang
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
14
|
Yin J, Chen Z, You N, Li F, Zhang H, Xue J, Ma H, Zhao Q, Yu L, Zeng S, Zhu F. VARIDT 3.0: the phenotypic and regulatory variability of drug transporter. Nucleic Acids Res 2024; 52:D1490-D1502. [PMID: 37819041 PMCID: PMC10767864 DOI: 10.1093/nar/gkad818] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/25/2023] [Revised: 09/01/2023] [Accepted: 09/27/2023] [Indexed: 10/13/2023] Open
Abstract
The phenotypic and regulatory variabilities of drug transporters (DTs) are vital for the understanding of drug responses, drug-drug interactions, multidrug resistance, and so on. The ADME property of a drug is collectively determined by multiple types of variability, such as microbiota influence (MBI), transcriptional regulation (TSR), epigenetic regulation (EGR), exogenous modulation (EGM) and post-translational modification (PTM). However, no database has yet been available to comprehensively describe these valuable variabilities of DTs. In this study, a major update of VARIDT was therefore conducted, which gave 2072 MBIs, 10 610 TSRs, 46 748 EGRs, 12 209 EGMs and 10 255 PTMs. These variability data were closely related to the transportation of 585 approved and 301 clinical trial drugs for treating 572 diseases. Moreover, the majority of the DTs in this database were found to have multiple variabilities, which allows a collective consideration in determining the ADME properties of a drug. All in all, VARIDT 3.0 is expected to be a popular data repository that could become an essential complement to existing pharmaceutical databases, and is freely accessible without any login requirement at: https://idrblab.org/varidt/.
Affiliation(s)
- Jiayi Yin
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Zhen Chen
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Nanxin You
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Fengcheng Li
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- The Children's Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310052, China
| | - Hanyu Zhang
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Jia Xue
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Hui Ma
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Qingwei Zhao
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Lushan Yu
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Su Zeng
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Department of Clinical Pharmacy, The First Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
15
|
Zhou Y, Zhang Y, Zhao D, Yu X, Shen X, Zhou Y, Wang S, Qiu Y, Chen Y, Zhu F. TTD: Therapeutic Target Database describing target druggability information. Nucleic Acids Res 2024; 52:D1465-D1477. [PMID: 37713619 PMCID: PMC10767903 DOI: 10.1093/nar/gkad751] [Citation(s) in RCA: 56] [Impact Index Per Article: 56.0] [Received: 07/05/2023] [Revised: 07/31/2023] [Accepted: 09/05/2023] [Indexed: 09/17/2023] Open
Abstract
Target discovery is one of the essential steps in modern drug development, and the identification of promising targets is fundamental for developing first-in-class drugs. A variety of methods have emerged for target assessment based on druggability analysis, which refers to the likelihood of a target being effectively modulated by drug-like agents. In the therapeutic target database (TTD), nine categories of established druggability characteristics were thus collected for 426 successful, 1014 clinical trial, 212 preclinical/patented, and 1479 literature-reported targets via systematic review. These characteristic categories were classified into three distinct perspectives: molecular interaction/regulation, human system profile and cell-based expression variation. With the rapid progression of technology and concerted effort in drug discovery, TTD and other databases are expected to facilitate the exploration of druggability characteristics for the discovery and validation of innovative drug targets. TTD is now freely accessible at: https://idrblab.org/ttd/.
Affiliation(s)
- Ying Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
- National Key Laboratory of Diagnosis and Treatment of Severe Infectious Disease, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang 310000, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Yintao Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Donghai Zhao
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Xinyuan Yu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Xinyi Shen
- Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven 06510, USA
| | - Yuan Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Shanshan Wang
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
| | - Yunqing Qiu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
- National Key Laboratory of Diagnosis and Treatment of Severe Infectious Disease, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang 310000, China
| | - Yuzong Chen
- State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, The Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China
- Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen 518000, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
16
|
Zhang Y, Zhou Y, Zhou Y, Yu X, Shen X, Hong Y, Zhang Y, Wang S, Mou M, Zhang J, Tao L, Gao J, Qiu Y, Chen Y, Zhu F. TheMarker: a comprehensive database of therapeutic biomarkers. Nucleic Acids Res 2024; 52:D1450-D1464. [PMID: 37850638 PMCID: PMC10767989 DOI: 10.1093/nar/gkad862] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/19/2023] [Revised: 09/21/2023] [Accepted: 09/29/2023] [Indexed: 10/19/2023] Open
Abstract
Distinct from the traditional diagnostic/prognostic biomarker (adopted as the indicator of disease state/process), the therapeutic biomarker (ThMAR) has emerged as crucial in the clinical development and clinical practice of all therapies. Five types of ThMAR have been found to play indispensable roles in various stages of drug discovery: Pharmacodynamic Biomarker, essential for guaranteeing the pharmacological effects of a therapy; Safety Biomarker, critical for assessing the extent or likelihood of therapy-induced toxicity; Monitoring Biomarker, indispensable for guiding clinical management by serially measuring patients' status; Predictive Biomarker, crucial for maximizing the clinical outcome of a therapy for specific individuals; and Surrogate Endpoint, fundamental for accelerating the approval of a therapy. However, ThMAR data have not been comprehensively described by any of the existing databases. Herein, a database named 'TheMarker' was therefore constructed to (a) systematically offer all five types of ThMAR used at different stages of drug development, (b) comprehensively describe ThMAR information for the largest number of drugs among available databases, and (c) extensively cover the widest range of disease classes by not just focusing on anticancer therapies. The data in TheMarker are expected to have great implications for and a significant impact on drug discovery and clinical practice, and the database is freely accessible without any login requirement at: https://idrblab.org/themarker.
Affiliation(s)
- Yintao Zhang
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Ying Zhou
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- National Key Laboratory of Diagnosis and Treatment of Severe Infectious Disease, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310000, China
| | - Yuan Zhou
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Xinyuan Yu
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Xinyi Shen
- Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven 06510, USA
| | - Yanfeng Hong
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Yuxin Zhang
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Shanshan Wang
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
| | - Minjie Mou
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Jinsong Zhang
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Lin Tao
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
| | - Jianqing Gao
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Yunqing Qiu
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- National Key Laboratory of Diagnosis and Treatment of Severe Infectious Disease, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310000, China
| | - Yuzong Chen
- State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, The Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China
- Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen 518000, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The First Affiliated Hospital, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
17
|
Shen L, Sun X, Chen Z, Guo Y, Shen Z, Song Y, Xin W, Ding H, Ma X, Xu W, Zhou W, Che J, Tan L, Chen L, Chen S, Dong X, Fang L, Zhu F. ADCdb: the database of antibody-drug conjugates. Nucleic Acids Res 2024; 52:D1097-D1109. [PMID: 37831118 PMCID: PMC10768060 DOI: 10.1093/nar/gkad831] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/14/2023] [Revised: 09/07/2023] [Accepted: 09/28/2023] [Indexed: 10/14/2023] Open
Abstract
Antibody-drug conjugates (ADCs) are a class of innovative biopharmaceutical drugs, which, via their antibody (mAb) component, deliver and release their potent warhead (a.k.a. payload) at the disease site, thereby simultaneously improving the efficacy of the delivered therapy and reducing its off-target toxicity. To design ADCs of promising efficacy, it is crucial to have the critical pharma-information and biological activity data for each ADC. However, no such database has been constructed yet. In this study, a database named ADCdb, focusing on providing ADC information (especially pharma-information and biological activities) from multiple perspectives, was thus developed. In particular, a total of 6572 ADCs (359 approved by FDA or in clinical trial pipeline, 501 in preclinical test, 819 with in-vivo testing data, 1868 with cell line/target testing data, 3025 without in-vivo/cell line/target testing data) together with their explicit pharma-information were collected and provided. Moreover, a total of 9171 literature-reported activities were discovered, which were identified from diverse clinical trial pipelines, model organisms, patient/cell-derived xenograft models, etc. Due to the significance of ADCs and their relevant data, this new database is expected to attract broad interest from diverse research fields of current biopharmaceutical drug discovery. The ADCdb is now publicly accessible at: https://idrblab.org/adcdb/.
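As a quick consistency check on the figures quoted in this abstract, the five ADC categories can be tallied against the stated total. The sketch below is illustrative only: the category counts and the total of 6572 come from the abstract, while the dictionary layout and variable names are assumptions made for this example.

```python
# Category counts as quoted in the ADCdb abstract.
adc_counts = {
    "approved by FDA or in clinical trial pipeline": 359,
    "in preclinical test": 501,
    "with in-vivo testing data": 819,
    "with cell line/target testing data": 1868,
    "without in-vivo/cell line/target testing data": 3025,
}

reported_total = 6572  # total number of ADCs stated in the abstract
computed_total = sum(adc_counts.values())

# The categories partition the collection only if the sums agree.
totals_match = computed_total == reported_total
print(f"computed={computed_total}, reported={reported_total}, match={totals_match}")

# Share of each category, as a percentage of the reported total.
for category, count in adc_counts.items():
    print(f"{category}: {100 * count / reported_total:.1f}%")
```

The five categories sum exactly to the reported total, which suggests they are mutually exclusive tiers rather than overlapping annotations.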
Affiliation(s)
- Liteng Shen
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou 310022, China
- College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou 310014, China
| | - Xiuna Sun
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Zhen Chen
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Yu Guo
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Zheyuan Shen
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Yi Song
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Wenxiu Xin
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
| | - Haiying Ding
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
| | - Xinyue Ma
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou 310022, China
| | - Weiben Xu
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou 310014, China
| | - Wanying Zhou
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou 310022, China
| | - Jinxin Che
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Lili Tan
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou 310022, China
| | - Liangsheng Chen
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou 310022, China
| | - Siqi Chen
- School of Pharmaceutical Science, Zhejiang Chinese Medical University, Hangzhou 310053, China
| | - Xiaowu Dong
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou 310014, China
| | - Luo Fang
- Department of Pharmacy, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310005, China
- Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou 310022, China
- College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou 310014, China
- School of Pharmaceutical Science, Zhejiang Chinese Medical University, Hangzhou 310053, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
18
|
Wang J, Zhang L, Sun J, Yang X, Wu W, Chen W, Zhao Q. Predicting drug-induced liver injury using graph attention mechanism and molecular fingerprints. Methods 2024; 221:18-26. [PMID: 38040204 DOI: 10.1016/j.ymeth.2023.11.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 11/14/2023] [Accepted: 11/25/2023] [Indexed: 12/03/2023] Open
Abstract
Drug-induced liver injury (DILI) is a significant issue in drug development and clinical treatment due to its potential to cause liver dysfunction or damage, which, in severe cases, can lead to liver failure or even fatality. DILI has numerous pathogenic factors, many of which remain incompletely understood. Consequently, it is imperative to devise methodologies and tools for anticipatory assessment of DILI risk in the initial phases of drug development. In this study, we present DMFPGA, a novel deep learning predictive model designed to predict DILI. To provide a comprehensive description of molecular properties, we employ a multi-head graph attention mechanism to extract features from the molecular graphs, representing characteristics at the level of compound nodes. Additionally, we combine multiple fingerprints of molecules to capture features at the molecular level of compounds. The fusion of molecular fingerprints and graph features can more fully express the properties of compounds. Subsequently, we employ a fully connected neural network to classify compounds as either DILI-positive or DILI-negative. To rigorously evaluate DMFPGA's performance, we conduct a 5-fold cross-validation experiment. The obtained results demonstrate the superiority of our method over four existing state-of-the-art computational approaches, exhibiting an average AUC of 0.935 and an average ACC of 0.934. We believe that DMFPGA is helpful for early-stage DILI prediction and assessment in drug development.
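The evaluation protocol described above (5-fold cross-validation scored by AUC) can be sketched in plain Python. This is a minimal illustration, not the DMFPGA code: the helper names `k_fold_indices` and `roc_auc` are hypothetical, the folds are contiguous rather than shuffled or stratified, and the AUC is computed with the pairwise (Mann-Whitney) definition.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = DILI-positive) and model scores for one test fold.
fold_labels = [1, 0, 1, 1, 0]
fold_scores = [0.9, 0.2, 0.4, 0.8, 0.5]
print(round(roc_auc(fold_labels, fold_scores), 3))
```

In a full run, a model would be trained on each `train_idx` split, scored on the matching `test_idx` split, and the five AUC values averaged.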
Affiliation(s)
- Jifeng Wang
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
| | - Li Zhang
- School of Life Science, Liaoning University, Shenyang 110036, China
| | - Jianqiang Sun
- School of Information Science and Engineering, Linyi University, Linyi 276000, China
| | - Xin Yang
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
| | - Wei Wu
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
| | - Wei Chen
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China.
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China.
|
19
|
Zhou Q, Wang Y, Xin C, Wei X, Yao Y, Xia L. Identification of telomere-associated gene signatures to predict prognosis and drug sensitivity in glioma. Comput Biol Med 2024; 168:107750. [PMID: 38029531 DOI: 10.1016/j.compbiomed.2023.107750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 11/11/2023] [Accepted: 11/20/2023] [Indexed: 12/01/2023]
Abstract
OBJECTIVE Gliomas are a heterogeneous group of brain tumors with distinct biological and clinical properties, leading to significant mortality and morbidity. Emerging evidence shows that telomere maintenance has been implicated in glioma susceptibility and prognosis. In this study, we comprehensively analyzed gene signatures related to telomere maintenance in glioma and their value for predicting prognosis and drug sensitivity. METHODS We initially identified telomere-related genes differentially expressed between low-grade glioma (LGG) and glioblastoma (GBM) and accordingly developed a risk model by univariate and multivariate Cox analysis to assess the expression of telomere-related genes across the risk groups. Finally, to assess the roles of these genes in immune function and in the response to anti-tumor medications often used in the clinical treatment of glioma, we performed immune cell infiltration analysis and drug sensitivity analysis. RESULTS The consensus clustering analysis identified 20 telomere-related genes which split LGG patients into two distinct subtypes. Patient survival, the expression of key telomere-related differentially expressed genes (DEGs), and immune cell infiltration differed significantly between Cluster 1 and Cluster 2. The LASSO risk model [riskScore=(0.086)*HOXA7+(0.242)*WEE1+(0.247)*IGF2BP3+(0.052)*DUSP10] showed significant differences in 1-, 3-, and 5-year overall survival, immune cell infiltration, and drug sensitivity between high- and low-risk groups. The predictive nomogram constructed to quantify the survival probability of each sample at 1, 3, and 5 years was consistent with the actual patient survival. CONCLUSION Our comprehensive characterization of telomere-associated gene signatures in glioma reveals their possible roles in the development, tumor microenvironment, and prognosis. The study provides some suggestive relationships between four telomere-related genes (HOXA7, WEE1, IGF2BP3, and DUSP10) and glioma prognosis.
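The LASSO risk model quoted in the abstract is a weighted sum of four expression values, which the short sketch below evaluates. The coefficients come from the abstract; the patient identifiers, the expression values, and the median cutoff used to form high- and low-risk groups are assumptions made for illustration, since the abstract does not state how the groups were split.

```python
from statistics import median

# Coefficients as reported in the abstract's riskScore formula.
COEFFICIENTS = {"HOXA7": 0.086, "WEE1": 0.242, "IGF2BP3": 0.247, "DUSP10": 0.052}


def risk_score(expression):
    """Weighted sum of the four gene expression values for one sample."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the cohort median (assumed cutoff)."""
    cutoff = median(scores.values())
    return {sample: ("high" if score > cutoff else "low") for sample, score in scores.items()}


# Hypothetical expression profiles; real inputs would be normalized expression data.
cohort = {
    "P1": {"HOXA7": 1.0, "WEE1": 1.0, "IGF2BP3": 1.0, "DUSP10": 1.0},
    "P2": {"HOXA7": 2.0, "WEE1": 3.0, "IGF2BP3": 1.0, "DUSP10": 0.0},
    "P3": {"HOXA7": 0.0, "WEE1": 0.0, "IGF2BP3": 0.0, "DUSP10": 0.0},
}
scores = {sample: risk_score(expr) for sample, expr in cohort.items()}
print(split_by_median(scores))
```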
Affiliation(s)
- Qingqing Zhou
- Department of Neurosurgery, The First Affiliated Hospital of Yangtze University, Jingzhou First People's Hospital, Jingzhou, 434000, People's Republic of China
| | - Yamei Wang
- Department of Neurology, The First Affiliated Hospital of Yangtze University, Jingzhou First People's Hospital, Jingzhou, 434000, People's Republic of China
| | - Chenqi Xin
- Department of Scientific Research, The First Affiliated Hospital of Yangtze University, Jingzhou First People's Hospital, Jingzhou, 434000, People's Republic of China
| | - XiaoMing Wei
- Department of Neurosurgery, The First Affiliated Hospital of Yangtze University, Jingzhou First People's Hospital, Jingzhou, 434000, People's Republic of China
| | - Yuan Yao
- Department of Neurosurgery, The First Affiliated Hospital of Yangtze University, Jingzhou First People's Hospital, Jingzhou, 434000, People's Republic of China.
| | - Liang Xia
- Department of Neurosurgery, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital) Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, 310022, People's Republic of China.
|
20
|
Gong Y, Ding W, Wang P, Wu Q, Yao X, Yang Q. Evaluating Machine Learning Methods of Analyzing Multiclass Metabolomics. J Chem Inf Model 2023; 63:7628-7641. [PMID: 38079572 DOI: 10.1021/acs.jcim.3c01525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2023]
Abstract
Multiclass metabolomic studies have become popular for revealing the differences in multiple stages of complex diseases, various lifestyles, or the effects of specific treatments. In multiclass metabolomics, there are multiple data manipulation steps for analyzing raw data, which consist of data filtering, the imputation of missing values, data normalization, marker identification, sample separation, classification, and so on. In each step, several to dozens of machine learning methods can be chosen for the given data set, with potentially hundreds or thousands of method combinations in the whole data processing chain. Therefore, a clear understanding of these machine learning methods is helpful for selecting an appropriate method combination for obtaining stable and reliable analytical results of specific data. However, there has rarely been an overall introduction or evaluation of these methods based on multiclass metabolomic data. Herein, detailed descriptions of these machine learning methods in multiple data manipulation steps are reviewed. Moreover, an assessment of these methods was performed using a benchmark data set for multiclass metabolomics. First, 12 imputation methods for imputing missing values were evaluated based on the PSS (Procrustes statistical shape analysis) and NRMSE (normalized root-mean-square error) values. Second, 17 normalization methods for processing multiclass metabolomic data were evaluated by applying the PMAD (pooled median absolute deviation) value. Third, different methods of identifying markers of multiclass metabolomics were evaluated based on the CWrel (relative weighted consistency) value. Fourth, nine classification methods for constructing multiclass models were assessed using the AUC (area under the curve) value. Performance evaluations of machine learning methods are highly recommended to select the most appropriate method combination before performing the final analysis of the given data. 
Overall, detailed descriptions and evaluation of various machine learning methods are expected to improve analyses of multiclass metabolomic data.
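As an illustration of how an imputation method can be scored with NRMSE, the sketch below compares imputed values against known ground truth. Several normalizations are in use; this version divides the mean squared error by the population variance of the true values, which is an assumption and may differ from the definition used in the study. The function name `nrmse` is hypothetical.

```python
from math import sqrt
from statistics import pvariance


def nrmse(true_values, imputed_values):
    """Normalized root-mean-square error: sqrt(MSE / variance of the true values)."""
    if len(true_values) != len(imputed_values):
        raise ValueError("inputs must have the same length")
    mse = sum((t - i) ** 2 for t, i in zip(true_values, imputed_values)) / len(true_values)
    variance = pvariance(true_values)
    if variance == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")
    return sqrt(mse / variance)


# Hypothetical metabolite intensities that were masked and then imputed.
masked_truth = [1.0, 2.0, 3.0, 4.0]
imputed = [2.0, 3.0, 4.0, 5.0]
print(round(nrmse(masked_truth, imputed), 4))
```

Lower values indicate that the imputed entries sit closer to the values that were masked.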
Affiliation(s)
- Yaguo Gong
- State Key Laboratory of Quality Research in Chinese Medicine, School of Pharmacy, Macau University of Science and Technology, Macau 999078, China
| | - Wei Ding
- State Key Laboratory of Quality Research in Chinese Medicine, School of Pharmacy, Macau University of Science and Technology, Macau 999078, China
| | - Panpan Wang
- College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
| | - Qibiao Wu
- State Key Laboratory of Quality Research in Chinese Medicine, School of Pharmacy, Macau University of Science and Technology, Macau 999078, China
| | - Xiaojun Yao
- Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao 999078, China
| | - Qingxia Yang
- Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
|
21
|
Moslemi A, Ahmadian A. Dual regularized subspace learning using adaptive graph learning and rank constraint: Unsupervised feature selection on gene expression microarray datasets. Comput Biol Med 2023; 167:107659. [PMID: 37950946 DOI: 10.1016/j.compbiomed.2023.107659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 10/13/2023] [Accepted: 10/31/2023] [Indexed: 11/13/2023]
Abstract
High-dimensional problems have increasingly drawn attention in gene selection and analysis. To make matters worse, the number of features in a microarray gene dataset usually exceeds the number of samples, which leads to an ill-posed, underdetermined system of equations. Redundant features in high-dimensional data cause poor performance and long computation times for learning algorithms. Feature selection is an important pre-processing method that mitigates the curse of dimensionality, with the aim of preserving information of maximum relevance and minimum redundancy. Unsupervised feature selection is likewise important because collecting labels for data is expensive. In this paper, we develop a novel robust unsupervised feature selection method that selects a discriminative subset of features for unlabeled data based on rank-constrained and dual-regularized nonnegative matrix factorization. The major focus of the proposed technique is to discard redundant features while keeping the informative ones. The proposed technique consists of nonnegative matrix factorization to decompose the data into a feature weight matrix and a representation matrix, an inner product norm as regularization for both matrices, adaptive structure learning to preserve local information, and the Schatten-p norm as a rank constraint. To demonstrate the effectiveness of the proposed method, numerical studies are conducted on six benchmark microarray datasets. The results show that the proposed technique outperforms eight state-of-the-art unsupervised feature selection techniques in terms of clustering accuracy and normalized mutual information.
Affiliation(s)
- Amir Moslemi
- Imaging Research and Physical Sciences, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada.
| | - Arash Ahmadian
- Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada
|
22
|
Wu Z, Wang Z, Wu H, Zheng N, Huang D, Huang Z, Han H, Bao J, Xu H, Zhang R, Du Z, Wu D. The pan-cancer multi-omics landscape of key genes of sialylation combined with RNA-sequencing validation. Comput Biol Med 2023; 166:107556. [PMID: 37801920 DOI: 10.1016/j.compbiomed.2023.107556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 09/12/2023] [Accepted: 09/28/2023] [Indexed: 10/08/2023]
Abstract
BACKGROUND Sialylation, the process of sialic acid glycan synthesis, plays a pivotal role in tumor growth, immune escape, tumor metastasis, and resistance to drugs. However, the association between sialylation and prognosis, tumor microenvironment (TME), and treatment response in a variety of cancers remains unclear. METHODS A comprehensive survey of the expression profile, prognostic value, and genetic and epigenetic alterations of sialylation-related genes was performed in pan-cancer. Subsequently, the single-sample gene set enrichment analysis (ssGSEA) algorithm was used to compute sialylation pathway scores in pan-cancer. Correlations of sialylation pathway scores with clinical features, prognosis, and TME were evaluated using multiple algorithms. Finally, the efficacy of the sialylation pathway score in determining the effect of immunotherapy was evaluated. The expression of sialylation-related genes was verified by RNA-sequencing. RESULTS Significant differences were observed in sialylation-related gene expression between tumors and adjacent normal tissues for most cancer types. Sialylation pathway scores differed according to the type of tumor, and poor prognosis was correlated with high sialylation pathway scores in uveal melanoma (UVM) and pancreatic adenocarcinoma (PAAD). In addition, sialylation pathway scores were positively associated with the ImmuneScore, StromalScore and immune-related pathways. Moreover, the level of immune cell infiltration was higher in tumors with higher sialylation pathway scores. Finally, patients with high sialylation pathway scores were more sensitive to immunotherapy. CONCLUSION Sialylation-related genes are essential in pan-cancer. The sialylation pathway score may be used as a biomarker in oncology patients.
Affiliation(s)
- Zhixuan Wu
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Ziqiong Wang
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Haodong Wu
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Na Zheng
- Department of Hernia and Abdominal Wall Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Dongdong Huang
- Department of Hernia and Abdominal Wall Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Zhipeng Huang
- Department of Hernia and Abdominal Wall Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Hui Han
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Jingxia Bao
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Hongjie Xu
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China
| | - Rongrong Zhang
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China.
| | - Zhou Du
- Department of Hernia and Abdominal Wall Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China.
| | - Dazhou Wu
- Department of Hernia and Abdominal Wall Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325015, Zhejiang, People's Republic of China.
|
23
|
Liang S, Zhao Y, Jin J, Qiao J, Wang D, Wang Y, Wei L. Rm-LR: A long-range-based deep learning model for predicting multiple types of RNA modifications. Comput Biol Med 2023; 164:107238. [PMID: 37515874 DOI: 10.1016/j.compbiomed.2023.107238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 06/16/2023] [Accepted: 07/07/2023] [Indexed: 07/31/2023]
Abstract
Recent research has highlighted the pivotal role of RNA post-transcriptional modifications in the regulation of RNA expression and function. Accurate identification of RNA modification sites is important for understanding RNA function. In this study, we propose a novel RNA modification prediction method, namely Rm-LR, which leverages a long-range-based deep learning approach to accurately predict multiple types of RNA modifications using RNA sequences only. Rm-LR incorporates two large-scale RNA language pre-trained models to capture discriminative sequential information and learn important local features, which are subsequently integrated through a bilinear attention network. Rm-LR supports a total of ten RNA modification types (m6A, m1A, m5C, m5U, m6Am, Ψ, Am, Cm, Gm, and Um) and significantly outperforms the state-of-the-art methods in terms of predictive capability on benchmark datasets. Experimental results show the effectiveness and superiority of Rm-LR in the prediction of various RNA modifications, demonstrating the strong adaptability and robustness of our proposed model. We demonstrate that RNA language pre-trained models can learn dense biological sequential representations from a large-scale long-range RNA corpus while also enhancing the interpretability of the models. This work contributes to the development of accurate and reliable computational models for RNA modification prediction, providing insights into the complex landscape of RNA modifications.
Affiliation(s)
- Sirui Liang
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Yanxi Zhao
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Junru Jin
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Jianbo Qiao
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Ding Wang
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Yu Wang
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
| | - Leyi Wei
- School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China.
|
24
|
Alkady W, ElBahnasy K, Gad W. A diagnostic model for COVID-19 based on proteomics analysis. Comput Biol Med 2023; 162:107109. [PMID: 37276752 PMCID: PMC10232940 DOI: 10.1016/j.compbiomed.2023.107109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Revised: 05/21/2023] [Accepted: 05/30/2023] [Indexed: 06/07/2023]
Abstract
BACKGROUND AND OBJECTIVE Early diagnosis of Coronavirus Disease 2019 (COVID-19) can help save patients' lives before the disease turns severe. This can be achieved through an effective and correct treatment protocol. In this paper, a prediction model is proposed to detect infected cases and determine the severity level of the disease. METHODS The proposed model uses proteins and metabolites as features for each patient, which are then analyzed using feature selection methods such as Principal Component Analysis (PCA), Information Gain (IG), and Analysis of Variance (ANOVA) to select the most significant features. The model employs three classifiers, namely K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest (RF), to predict and classify the severity level of the COVID-19 infection. The proposed model is evaluated using four performance measures: accuracy, sensitivity, specificity, and precision. RESULTS The experimental results show that the accuracy of the proposed model can reach 80% using the RF classifier with PCA. PCA selects 22 proteins and 10 metabolites, while ANOVA selects 9 proteins and 5 metabolites. The accuracy reaches 92% when the RF classifier is applied with ANOVA. Finally, the accuracy reaches 93% using the RF classifier with only ten features: 7 proteins and 3 metabolites. Moreover, the selected features are related to the immune and respiratory systems. CONCLUSION The proposed model uses three classifiers and shows promising results by selecting the important features and maximizing the prediction accuracy.
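The four performance measures named in the abstract follow directly from confusion-matrix counts. The sketch below shows the binary case only; the severity labels in the study may be multi-class, and the function name `binary_metrics` and the example labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity, and precision for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)

    def ratio(numerator, denominator):
        # Report None instead of dividing by zero when a class is absent.
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
    }


# Hypothetical ground truth (1 = severe case) and classifier predictions.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(truth, predicted))
```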
Affiliation(s)
- Walaa Alkady
- Bioinformatics Program, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt.
| | - Khaled ElBahnasy
- Department of Information Systems, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt.
| | - Walaa Gad
- Department of Information Systems, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt.
|
25
|
Xu PH, Chen S, Wang Y, Jin S, Wang J, Ye D, Zhu X, Shen Y. FGFR3 mutation characterization identifies prognostic and immune-related gene signatures in bladder cancer. Comput Biol Med 2023; 162:106976. [PMID: 37301098 DOI: 10.1016/j.compbiomed.2023.106976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 03/31/2023] [Accepted: 04/22/2023] [Indexed: 06/12/2023]
Abstract
BACKGROUND Immunotherapy and FGFR3-targeted therapy play an important role in the management of locally advanced and metastatic bladder cancer (BLCA). Previous studies indicated that FGFR3 mutation (mFGFR3) may be involved in the alterations of immune infiltration, which may affect the priority or combination of these two treatment regimes. However, the specific impact of mFGFR3 on the immunity and how FGFR3 regulates the immune response in BLCA to affect prognosis remain unclear. In this study, we aimed to elucidate the immune landscape associated with mFGFR3 status in BLCA, screen immune-related gene signatures with prognostic value, and construct and validate a prognostic model. METHODS ESTIMATE and TIMER were used to assess the immune infiltration within tumors in the TCGA BLCA cohort based on transcriptome data. Further, the mFGFR3 status and mRNA expression profiles were analyzed to identify immune-related genes that were differentially expressed between patients with BLCA with wild-type FGFR3 or mFGFR3 in the TCGA training cohort. An FGFR3-related immune prognostic score (FIPS) model was established in the TCGA training cohort. Furthermore, we validated the prognostic value of FIPS with microarray data in the GEO database and tissue microarray from our center. Multiple fluorescence immunohistochemical analysis was performed to confirm the relationship between FIPS and immune infiltration. RESULTS mFGFR3 resulted in differential immunity in BLCA. In total, 359 immune-related biological processes were enriched in the wild-type FGFR3 group, whereas none were enriched in the mFGFR3 group. FIPS could effectively distinguish high-risk patients with poor prognosis from low-risk patients. The high-risk group was characterized by a higher abundance of neutrophils; macrophages; and follicular helper, CD4, and CD8 T-cells than the low-risk group. 
In addition, the high-risk group exhibited higher expression of PD-L1, PD-1, CTLA-4, LAG-3, and TIM-3 than the low-risk group, indicating an immune-infiltrated but functionally suppressed immune microenvironment. Furthermore, patients in the high-risk group exhibited a lower mutation rate of FGFR3 than those in the low-risk group. CONCLUSIONS FIPS effectively predicted survival in BLCA. Patients with different FIPS exhibited diverse immune infiltration and mFGFR3 status. FIPS might be a promising tool for selecting targeted therapy and immunotherapy for patients with BLCA.
Affiliation(s)
- Pei-Hang Xu
- Department of Urology, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Siyuan Chen
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Department of Medical Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Yanhao Wang
- Department of Urology, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Shengming Jin
- Department of Urology, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Jun Wang
- Department of Urology, Sun Yat-sen University Cancer Center, Guangzhou, China; State Key Laboratory of Oncology in Southern China, Guangzhou, China; State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China.
| | - Dingwei Ye
- Department of Urology, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China.
| | - Xiaodong Zhu
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Department of Medical Oncology, Fudan University Shanghai Cancer Center, Shanghai, China.
| | - Yijun Shen
- Department of Urology, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China.
|
26
|
Luo X, Wang Y, Zou Q, Xu L. Recall DNA methylation levels at low coverage sites using a CNN model in WGBS. PLoS Comput Biol 2023; 19:e1011205. [PMID: 37315069 DOI: 10.1371/journal.pcbi.1011205] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/30/2022] [Accepted: 05/22/2023] [Indexed: 06/16/2023] Open
Abstract
DNA methylation is an important regulator of gene transcription. WGBS is the gold-standard approach for base-pair resolution quantification of DNA methylation, but it requires high sequencing depth. Many CpG sites have insufficient coverage in WGBS data, resulting in inaccurate DNA methylation levels at individual sites. Many state-of-the-art computational methods have been proposed to predict the missing values. However, many of them require either other omics datasets or cross-sample data, and most predict only the state of DNA methylation. In this study, we proposed RcWGBS, which can impute the missing (or low-coverage) values from the DNA methylation levels at adjacent sites. Deep learning techniques were employed for accurate prediction. The WGBS datasets of H1-hESC and GM12878 were down-sampled. The average difference between the DNA methylation level at 12× depth predicted by RcWGBS and that at >50× depth in the H1-hESC and GM12878 cells is less than 0.03 and 0.01, respectively. RcWGBS performed better than METHimpute even though the sequencing depth was as low as 12×. Our work would help to process methylation data of low sequencing depth. It is beneficial for researchers to save sequencing costs and improve data utilization through computational methods.
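The comparison reported above rests on two simple quantities: the per-site methylation level (methylated reads divided by total reads) and the average difference between a low-depth estimate and a high-depth reference. The sketch below illustrates both; reading "average difference" as a mean absolute difference is an assumption, and the function names and example values are hypothetical rather than taken from RcWGBS.

```python
def methylation_level(methylated_reads, total_reads):
    """Fraction of reads supporting methylation at one CpG site, or None if uncovered."""
    if total_reads == 0:
        return None
    return methylated_reads / total_reads


def mean_abs_difference(predicted, reference):
    """Mean absolute difference over sites that have a value in both lists."""
    pairs = [(p, r) for p, r in zip(predicted, reference) if p is not None and r is not None]
    if not pairs:
        raise ValueError("no sites with values in both inputs")
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


# Hypothetical sites: (methylated, total) read counts at low and high depth.
low_depth = [methylation_level(9, 12), methylation_level(6, 12), methylation_level(0, 0)]
high_depth = [methylation_level(40, 50), methylation_level(30, 60), methylation_level(5, 55)]
print(round(mean_abs_difference(low_depth, high_depth), 3))
```

The uncovered third site is skipped here; an imputation method would instead supply a predicted level for it before the comparison.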
Affiliation(s)
- Ximei Luo
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Yansu Wang
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong, China
| |
|
27
|
Vahabzadeh V, Moattar MH. Robust microarray data feature selection using a correntropy based distance metric learning approach. Comput Biol Med 2023; 161:107056. [PMID: 37235945 DOI: 10.1016/j.compbiomed.2023.107056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 04/18/2023] [Accepted: 05/20/2023] [Indexed: 05/28/2023]
Abstract
Classification of high-dimensional microarray data is a challenge in bioinformatics and genetic data processing. One of the challenging issues of feature selection is the presence of outliers. The Euclidean distance metric is sensitive to outliers. In this study, a distance metric learning based feature selection approach that uses the correntropy function as the discrimination metric is proposed. For this purpose, the metric learning problem is formulated as an optimization problem and solved using the Lagrange method. The output of the approach signifies the most important and robust features. After feature selection, different classification methods such as SVM, decision trees, and NN classifiers are used to investigate the classification accuracy of the proposed method as well as precision, recall, and F-measure. Experiments are carried out on 13 high-dimensional datasets and show that the proposed method outperforms the previous models in terms of accuracy and robustness.
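To illustrate why a correntropy-based measure is less sensitive to outliers than the Euclidean distance, the following minimal sketch compares the two on toy vectors. It assumes an unnormalized Gaussian kernel with bandwidth `sigma` and uses the standard correntropy-induced metric; the function names and data are hypothetical, and this is not the optimization procedure described in the abstract.

```python
import math


def correntropy(x, y, sigma=1.0):
    """Sample correntropy: mean Gaussian kernel of the elementwise differences.

    Assumption: unnormalized Gaussian kernel, so the kernel value at 0 is 1.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(
        math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for a, b in zip(x, y)
    ) / len(x)


def cim(x, y, sigma=1.0):
    """Correntropy-induced metric: sqrt(k(0) - V(x, y)), with k(0) = 1 here."""
    return math.sqrt(max(0.0, 1.0 - correntropy(x, y, sigma)))


def euclidean(x, y):
    """Ordinary Euclidean distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical expression profiles; `outlier` differs from `near` in one entry.
clean = [1.0, 2.0, 3.0, 4.0]
near = [1.1, 2.1, 2.9, 4.0]
outlier = [1.1, 2.1, 2.9, 40.0]

print(euclidean(clean, near), euclidean(clean, outlier))  # grows without bound
print(cim(clean, near), cim(clean, outlier))              # stays below 1
```

Because each kernel term is bounded, a single extreme value can change the correntropy-induced metric only by a limited amount, whereas it dominates the Euclidean distance.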
Affiliation(s)
- Venus Vahabzadeh
- Department of Software Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran.
| | | |
|
28
|
Walakira A, Skubic C, Nadižar N, Rozman D, Režen T, Mraz M, Moškon M. Integrative computational modeling to unravel novel potential biomarkers in hepatocellular carcinoma. Comput Biol Med 2023; 159:106957. [PMID: 37116239 DOI: 10.1016/j.compbiomed.2023.106957] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 03/17/2023] [Accepted: 04/16/2023] [Indexed: 04/30/2023]
Abstract
Hepatocellular carcinoma (HCC) is a major health problem around the world. The management of this disease is complicated by the lack of noninvasive diagnostic tools and the few treatment options available. Better clinical outcomes can be achieved if HCC is detected early, but unfortunately, clinical signs appear only when the disease is in its late stages. We aim to identify novel genes that can be targeted for the diagnosis and therapy of HCC. We performed a meta-analysis of transcriptomics data to identify differentially expressed genes and applied network analysis to identify hub genes. Fatty acid metabolism, complement and coagulation cascade, chemical carcinogenesis and retinol metabolism were identified as key pathways in HCC. Furthermore, we integrated transcriptomics data into a reference human genome-scale metabolic model to identify key reactions and subsystems relevant in HCC. We conclude that fatty acid activation, purine metabolism, and vitamin D and E metabolism are key processes in the development of HCC and therefore need to be further explored for the development of new therapies. We provide the first evidence that the GABRP, HBG1 and DAK (TKFC) genes are important in HCC in humans and warrant further studies.
Affiliation(s)
- Andrew Walakira
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia.
| | - Cene Skubic
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Nejc Nadižar
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Damjana Rozman
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Tadeja Režen
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Miha Mraz
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
| | - Miha Moškon
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.
| |
|
29
|
Zhou Y, Liu C, Zhang Z, Chen J, Zhao D, Li L, Tong M, Zhang G. Identification and validation of diagnostic biomarkers of coronary artery disease progression in type 1 diabetes via integrated computational and bioinformatics strategies. Comput Biol Med 2023; 159:106940. [PMID: 37075605 DOI: 10.1016/j.compbiomed.2023.106940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 04/21/2023]
Abstract
OBJECTIVE: Our study aimed to identify early peripheral blood diagnostic biomarkers and elucidate the immune mechanisms of coronary artery disease (CAD) progression in patients with type 1 diabetes mellitus (T1DM). METHODS: Three transcriptome datasets were retrieved from the Gene Expression Omnibus (GEO) database. Gene modules associated with T1DM were selected with weighted gene co-expression network analysis. Differentially expressed genes (DEGs) between CAD and acute myocardial infarction (AMI) peripheral blood tissues were identified using limma. Candidate biomarkers were selected with functional enrichment analysis, node gene selection from a constructed protein-protein interaction (PPI) network, and 3 machine learning algorithms. Candidate expression was compared, and the receiver operating characteristic (ROC) curve and nomogram were constructed. Immune cell infiltration was assessed with the CIBERSORT algorithm. RESULTS: A total of 1283 genes comprising 2 modules were detected as the most associated with T1DM. In addition, 451 DEGs related to CAD progression were identified. Among them, 182 were common to both diseases and mainly enriched in immune and inflammatory response regulation. The PPI network yielded 30 top node genes, and 6 were selected using the 3 machine learning algorithms. Upon validation, 4 genes (TLR2, CLEC4D, IL1R2, and NLRC4) were recognized as diagnostic biomarkers with an area under the curve (AUC) > 0.7. All 4 genes were positively correlated with neutrophils in patients with AMI. CONCLUSION: We identified 4 peripheral blood biomarkers and provided a nomogram for the early diagnosis of CAD progression to AMI in patients with T1DM. The biomarkers were positively associated with neutrophils, indicating potential therapeutic targets.
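The AUC threshold used to validate the biomarkers can be made concrete with a short sketch. It computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The expression values and labels below are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression of one candidate gene; 1 = AMI, 0 = stable CAD.
expression = [2.1, 3.4, 1.8, 2.5, 2.9, 3.8]
is_ami = [0, 1, 0, 1, 0, 1]

auc = roc_auc(expression, is_ami)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why a cut-off such as AUC > 0.7 is read as moderate diagnostic value.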
Affiliation(s)
- Yufei Zhou
- Shanghai Medical College, Fudan University, Shanghai, 200032, China
| | - Chunjiang Liu
- Department of General Surgery, Division of Vascular Surgery, Shaoxing People's Hospital, Shaoxing, 312000, China
| | - Zhongzheng Zhang
- Department of Rehabilitation, The First Affiliated Hospital of Anhui Medical University, Anhui Public Health Clinical Center, Hefei, Anhui, 230000, China
| | - Jian Chen
- Department of Rehabilitation, The First Affiliated Hospital of Anhui Medical University, Anhui Public Health Clinical Center, Hefei, Anhui, 230000, China
| | - Di Zhao
- Shanghai Medical College, Fudan University, Shanghai, 200032, China
| | - Linnan Li
- Shanghai Medical College, Fudan University, Shanghai, 200032, China
| | - Mingyue Tong
- Department of Rehabilitation, The First Affiliated Hospital of Anhui Medical University, Anhui Public Health Clinical Center, Hefei, Anhui, 230000, China.
| | - Gang Zhang
- Department of Rehabilitation, The First Affiliated Hospital of Anhui Medical University, Anhui Public Health Clinical Center, Hefei, Anhui, 230000, China.
| |
|
30
|
Fajarda O, Almeida JR, Duarte-Pereira S, Silva RM, Oliveira JL. Methodology to identify a gene expression signature by merging microarray datasets. Comput Biol Med 2023; 159:106867. [PMID: 37060770 DOI: 10.1016/j.compbiomed.2023.106867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 03/01/2023] [Accepted: 03/30/2023] [Indexed: 04/17/2023]
Abstract
A vast number of microarray datasets have been produced as a way to identify differentially expressed genes and gene expression signatures. A better understanding of these biological processes can help in the diagnosis and prognosis of diseases, as well as in the therapeutic response to drugs. However, most of the available datasets are composed of a reduced number of samples, leading to low statistical, predictive and generalization power. One way to overcome this problem is by merging several microarray datasets into a single dataset, which is typically a challenging task. Statistical methods or supervised machine learning algorithms are usually used to determine gene expression signatures. Nevertheless, statistical methods require an arbitrary threshold to be defined, and supervised machine learning methods can be ineffective when applied to high-dimensional datasets like microarrays. We propose a methodology to identify gene expression signatures by merging microarray datasets. This methodology uses statistical methods to obtain several sets of differentially expressed genes and uses supervised machine learning algorithms to select the gene expression signature. This methodology was validated using two distinct research applications: one using heart failure and the other using autism spectrum disorder microarray datasets. For the first, we obtained a gene expression signature composed of 117 genes, with a classification accuracy of approximately 98%. For the second use case, we obtained a gene expression signature composed of 79 genes, with a classification accuracy of approximately 82%. This methodology was implemented in R language and is available, under the MIT licence, at https://github.com/bioinformatics-ua/MicroGES.
Affiliation(s)
- Olga Fajarda
- DETI/IEETA, LASI, University of Aveiro, Aveiro, Portugal.
| | - João Rafael Almeida
- DETI/IEETA, LASI, University of Aveiro, Aveiro, Portugal; Department of Computation, University of A Coruña, A Coruña, Spain.
| | - Sara Duarte-Pereira
- DETI/IEETA, LASI, University of Aveiro, Aveiro, Portugal; Department of Medical Sciences and iBiMED-Institute of Biomedicine, University of Aveiro, Aveiro, Portugal.
| | - Raquel M Silva
- Universidade Católica Portuguesa, Faculty of Dental Medicine (FMD), Center for Interdisciplinary Research in Health (CIIS), Viseu, Portugal.
| | | |
|
31
|
Liu Y, Ma J, Wang X, Liu P, Cai C, Han Y, Zeng S, Feng Z, Shen H. Lipophagy-related gene RAB7A is involved in immune regulation and malignant progression in hepatocellular carcinoma. Comput Biol Med 2023; 158:106862. [PMID: 37044053 DOI: 10.1016/j.compbiomed.2023.106862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 02/05/2023] [Accepted: 03/30/2023] [Indexed: 04/08/2023]
Abstract
BACKGROUND: RAB7A (RAS-related in Brain 7A) is an important member of the RAS oncogene family. However, the correlation between RAB7A and the development and immune infiltration of hepatocellular carcinoma (HCC) has rarely been studied. Here, we studied the role of RAB7A in HCC through bioinformatics analysis, real-world cohort validation, and in vitro experimental exploration. MATERIALS AND METHODS: The RAB7A expression level was analyzed through TCGA, HPA and TISIDB databases. TIMER and TISCH were used to analyze the correlation between RAB7A and tumor immune microenvironment. The expression of RAB7A was detected through real-time PCR and western blotting. The cell proliferation was detected by EdU and CCK8. Wound-healing and transwell assays were used to test the invasion and migration ability. Cell cycle distribution and reactive oxygen species (ROS) content were analyzed by flow cytometry. Identification of epithelial-mesenchymal transition (EMT) was performed by immunofluorescence double staining. Immunohistochemistry (IHC) was used to evaluate the correlation between RAB7A and immune checkpoints. RESULTS: RAB7A is upregulated in most of the tumor types, and the upregulation of RAB7A is associated with a poorer prognosis in many cancers. The results showed that RAB7A was significantly positively correlated with the infiltration of macrophages and cancer-associated fibroblasts (CAFs), but negatively correlated with M2-type macrophages in most tumors. The single-cell atlas also revealed the distribution and proportion of RAB7A in immune cells of HCC. The in vitro experiments suggested that RAB7A was increased in HCC tissue and cell lines. The knockdown of RAB7A inhibited the activation of the PIK3CA-AKT pathway and suppressed the expression of CDK4, CDK6 and CCNA2. Knockdown of RAB7A induced G0/G1 arrest and ROS accumulation in HCC. In addition, overexpression of RAB7A enhanced migration and invasion by inducing EMT. The real-world cohort showed that the expression level of RAB7A was positively correlated with the expression levels of TGFBR1 and PD-L1. CONCLUSIONS: RAB7A may serve as a potential tumor prognostic and immune infiltration-related biomarker, predicting immunotherapy efficacy in certain cancer types, especially in HCC. Besides, RAB7A was a multi-pathway target involved in the malignant progression of HCC.
Affiliation(s)
- Yongting Liu
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Jiayao Ma
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Xinwen Wang
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Ping Liu
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Changjing Cai
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Ying Han
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Shan Zeng
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Ziyang Feng
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| | - Hong Shen
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
| |
|
32
|
Pan J, Gao Y, Han H, Pan T, Guo J, Li S, Xu J, Li Y. Multi-omics characterization of RNA binding proteins reveals disease comorbidities and potential drugs in COVID-19. Comput Biol Med 2023; 155:106651. [PMID: 36805221 PMCID: PMC9916187 DOI: 10.1016/j.compbiomed.2023.106651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2022] [Revised: 02/02/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023]
Abstract
COVID-19 has led to a devastating global health crisis, which emphasizes the urgent need to deepen our understanding of its molecular mechanism and to identify potential antiviral drugs. Here, we comprehensively analyzed the transcriptomic and proteomic profiles of 178 COVID-19 patients, ranging from asymptomatic to critically ill. Our analyses found that RNA binding proteins (RBPs) were likely to be perturbed in infection. Interactome analysis revealed that RBPs interact with virus proteins and that the viral interacting RBPs were likely to be located in central regions of the human protein-protein interaction network. Functional enrichment analysis revealed that the viral interacting RBPs were likely to be enriched in RNA transport, apoptosis and viral genome replication-related pathways. Based on network proximity analyses of 299 human complex-disease genes and COVID-19-related RBPs in the human interactome, we revealed significant associations between complex diseases and COVID-19. Network analysis also implicated potential antiviral drugs for the treatment of COVID-19. In summary, our integrative characterization of COVID-19 patients may help provide evidence regarding the pathophysiology of, and potential therapeutic strategies for, COVID-19.
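The idea behind a network proximity analysis can be sketched on a toy graph. The abstract does not state which proximity measure was used, so the example below assumes the commonly used "closest" measure: the mean, over the nodes of one set, of the shortest-path distance to the nearest node of the other set. The graph, node names, and gene sets are hypothetical, and the comparison against randomized node sets that a real analysis would need is omitted.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_proximity(graph, set_a, set_b):
    """Mean distance from each node in set_a to its nearest node in set_b.

    Nodes of set_a that cannot reach any node of set_b are skipped.
    """
    total = 0
    counted = 0
    for a in set_a:
        dist = bfs_distances(graph, a)
        reachable = [dist[b] for b in set_b if b in dist]
        if reachable:
            total += min(reachable)
            counted += 1
    if counted == 0:
        raise ValueError("no node in set_a can reach set_b")
    return total / counted


# Hypothetical undirected interactome: a simple chain A - B - C - D - E.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
disease_genes = {"A", "B"}  # hypothetical complex-disease genes
rbp_genes = {"D", "E"}      # hypothetical COVID-19-related RBPs

print(closest_proximity(graph, disease_genes, rbp_genes))
```

Smaller values indicate that the two gene sets lie closer together in the interactome; significance is usually judged against distances obtained for degree-matched random node sets.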
Affiliation(s)
- Jiwei Pan
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
| | - Yueying Gao
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
| | - Huirui Han
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
| | - Tao Pan
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
| | - Jing Guo
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
| | - Si Li
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
| | - Juan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
| | - Yongsheng Li
- NHC Key Laboratory of Tropical Disease Control, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China.
| |
|
33
|
Nalepa J, Kotowski K, Machura B, Adamski S, Bozek O, Eksner B, Kokoszka B, Pekala T, Radom M, Strzelczak M, Zarudzki L, Krason A, Arcadu F, Tessier J. Deep learning automates bidimensional and volumetric tumor burden measurement from MRI in pre- and post-operative glioblastoma patients. Comput Biol Med 2023; 154:106603. [PMID: 36738710 DOI: 10.1016/j.compbiomed.2023.106603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 01/11/2023] [Accepted: 01/22/2023] [Indexed: 02/05/2023]
Abstract
Tumor burden assessment by magnetic resonance imaging (MRI) is central to the evaluation of treatment response for glioblastoma. This assessment is, however, complex to perform and associated with high variability due to the high heterogeneity and complexity of the disease. In this work, we tackle this issue and propose a deep learning pipeline for the fully automated end-to-end analysis of glioblastoma patients. Our approach simultaneously identifies tumor sub-regions, including the enhancing tumor, peritumoral edema and surgical cavity in the first step, and then calculates the volumetric and bidimensional measurements that follow the current Response Assessment in Neuro-Oncology (RANO) criteria. Also, we introduce a rigorous manual annotation process which was followed to delineate the tumor sub-regions by the human experts, and to capture their segmentation confidences that are later used while training deep learning models. The results of our extensive experimental study performed over 760 pre-operative and 504 post-operative adult patients with glioma obtained from the public database (acquired at 19 sites in years 2021-2020) and from a clinical treatment trial (47 and 69 sites for pre-/post-operative patients, 2009-2011) and backed up with thorough quantitative, qualitative and statistical analysis revealed that our pipeline performs accurate segmentation of pre- and post-operative MRIs in a fraction of the manual delineation time (up to 20 times faster than humans). Volumetric measurements were in strong agreement with experts with the Intraclass Correlation Coefficient (ICC): 0.959, 0.703, 0.960 for ET, ED, and cavity. Similarly, automated RANO compared favorably with experienced readers (ICC: 0.681 and 0.866) producing consistent and accurate results. Additionally, we showed that RANO measurements are not always sufficient to quantify tumor burden. The high performance of the automated tumor burden measurement highlights the potential of the tool for considerably improving and simplifying radiological evaluation of glioblastoma in clinical trials and clinical practice.
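Once a tumor sub-region has been segmented, a volumetric measurement reduces to counting labeled voxels and multiplying by the physical size of one voxel. The sketch below shows this on a tiny hand-made mask; the mask, the voxel spacing, and the function name are hypothetical, and the bidimensional RANO measurement (which requires finding perpendicular diameters on a slice) is not covered.

```python
def mask_volume_mm3(mask, spacing_mm):
    """Volume of a binary 3D mask in cubic millimetres.

    mask: nested lists indexed as mask[slice][row][column]; nonzero = tumor.
    spacing_mm: physical voxel size along the three axes, in millimetres.
    """
    sx, sy, sz = spacing_mm
    voxel_count = sum(
        1 for plane in mask for row in plane for value in row if value
    )
    return voxel_count * sx * sy * sz


# Hypothetical segmentation of one sub-region: 2 slices of 2 x 3 voxels.
toy_mask = [
    [[0, 1, 1],
     [0, 1, 0]],
    [[0, 0, 1],
     [0, 0, 1]],
]
toy_spacing = (1.0, 1.0, 2.0)  # assumed voxel spacing in mm

volume = mask_volume_mm3(toy_mask, toy_spacing)
print(f"{volume} mm^3 = {volume / 1000.0} mL")
```

Because the result scales directly with voxel spacing, the spacing must be read from the image header rather than assumed when working with real MRI data.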
Affiliation(s)
- Jakub Nalepa
- Graylight Imaging, Gliwice, Poland; Department of Algorithmics and Software, Silesian University of Technology, Gliwice, Poland.
| | | | | | | | - Oskar Bozek
- Department of Radiodiagnostics and Invasive Radiology, School of Medicine in Katowice, Medical University of Silesia in Katowice, Katowice, Poland
| | - Bartosz Eksner
- Department of Radiology and Nuclear Medicine, ZSM Chorzów, Chorzów, Poland
| | - Bartosz Kokoszka
- Department of Radiodiagnostics, Interventional Radiology and Nuclear Medicine, University Clinical Centre, Katowice, Poland
| | - Tomasz Pekala
- Department of Radiodiagnostics, Interventional Radiology and Nuclear Medicine, University Clinical Centre, Katowice, Poland
| | - Mateusz Radom
- Department of Radiology and Diagnostic Imaging, Maria Skłodowska-Curie National Research Institute of Oncology, Gliwice Branch, Gliwice, Poland
| | - Marek Strzelczak
- Department of Radiology and Diagnostic Imaging, Maria Skłodowska-Curie National Research Institute of Oncology, Gliwice Branch, Gliwice, Poland
| | - Lukasz Zarudzki
- Department of Radiology and Diagnostic Imaging, Maria Skłodowska-Curie National Research Institute of Oncology, Gliwice Branch, Gliwice, Poland
| | - Agata Krason
- Roche Pharmaceutical Research & Early Development, Early Clinical Development Oncology, Roche Innovation Center Basel, Basel, Switzerland
| | - Filippo Arcadu
- Roche Pharmaceutical Research & Early Development, Early Clinical Development Informatics, Roche Innovation Center Basel, Basel, Switzerland
| | - Jean Tessier
- Roche Pharmaceutical Research & Early Development, Early Clinical Development Oncology, Roche Innovation Center Basel, Basel, Switzerland
| |
|
34
|
Yang Y, Cao Y, Han X, Ma X, Li R, Wang R, Xiao L, Xie L. Revealing EXPH5 as a potential diagnostic gene biomarker of the late stage of COPD based on machine learning analysis. Comput Biol Med 2023; 154:106621. [PMID: 36746116 DOI: 10.1016/j.compbiomed.2023.106621] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/16/2022] [Revised: 01/19/2023] [Accepted: 01/28/2023] [Indexed: 02/01/2023]
Abstract
Chronic obstructive pulmonary disease (COPD) is a chronic lung disease characterized by persistent airflow obstruction, and it was the third leading cause of death in China. The incidence of COPD is steadily increasing, and it has become a severe disease worldwide. Accordingly, there is an urgent need to explore how to diagnose and treat COPD in a timely manner. This study aims to find key genes for diagnosing COPD as early as possible to prevent its progression, and to analyze immune cell infiltration between the early and late stages of COPD. Two GEO datasets were merged for the analyses. A total of 157 DEGs were used for GSEA to find the pathways that differ between the early and late stages of COPD. Above all, the gene EXPH5 stood out from the screen by LASSO and SVM-RFE as the most likely candidate diagnostic biomarker indicating late-stage COPD. ROC curves of EXPH5 were used to represent its discriminatory ability through the area under the curve, which is the gold standard for evaluating the accuracy of diagnosis and survival rate. The CIBERSORT algorithm was used to assess the distribution of tissue-infiltrating immune cells between the two COPD stages. The diagnostic biomarker EXPH5 had a positive correlation with resting NK cells, resting mast cells, and eosinophils, and a negative correlation with gamma delta T cells and M1 macrophages, which underscores the relationship between the gene and immune cell infiltration. To make the results more reliable, we further analyzed EXPH5 expression in single-cell transcriptome data and showed again that EXPH5 was significantly downregulated in the late stage of COPD, especially in the main lung cell types AT1 and AT2. In summary, our study identified EXPH5 as a marker gene, which adds to the knowledge for clinical diagnosis and pharmaceutical design of COPD.
Affiliation(s)
- Yuwei Yang
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| | - Yan Cao
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| | - Xiaobo Han
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| | - Xihui Ma
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| | - Rui Li
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Hebei North University, Zhangjiakou, 075000, China.
| | - Rentao Wang
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| | - Li Xiao
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| | - Lixin Xie
- College of Pulmonary & Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100091, China; Beijing Key Laboratory of OTIR, Beijing, 100091, China.
| |
|
35
|
Baran Y, Doğan B. scMAGS: Marker gene selection from scRNA-seq data for spatial transcriptomics studies. Comput Biol Med 2023; 155:106634. [PMID: 36774895 DOI: 10.1016/j.compbiomed.2023.106634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Revised: 01/28/2023] [Accepted: 02/04/2023] [Indexed: 02/11/2023]
Abstract
Single-Cell RNA sequencing (scRNA-seq) has provided unprecedented opportunities for exploring gene expression and thus uncovering regulatory relationships between genes at the single-cell level. However, scRNA-seq relies on isolating cells from tissues. Therefore, the spatial context of the regulatory processes is lost. A recent technological innovation, spatial transcriptomics, allows for the measurement of gene expression while preserving spatial information. An initial step in the spatial transcriptomic analysis is to identify the cell type, which requires a careful selection of cell-specific marker genes. For this purpose, currently, scRNA-seq data is used to select a limited number of marker genes from among all genes that distinguish cell types from each other. This study proposes scMAGS (single-cell MArker Gene Selection), a novel method for marker gene selection from scRNA-seq data for spatial transcriptomics studies. scMAGS uses a filtering step in which the candidate genes are identified before the marker gene selection step. For the selection of marker genes, cluster validity indices, the Silhouette index, or the Calinski-Harabasz index (for large datasets) are utilized. Experimental results showed that, in comparison to the existing methods, scMAGS is scalable, fast, and accurate. Even for large datasets with millions of cells, scMAGS could find the required number of marker genes in a reasonable amount of time with fewer memory requirements. scMAGS is made freely available at https://github.com/doganlab/scmags and can be downloaded from the Python Package Directory (PyPI) software repository with the command pip install scmags.
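To show what a cluster validity index measures in this setting, the following sketch computes the Calinski-Harabasz index, the ratio of between-cluster to within-cluster dispersion, for a candidate gene under a given cell-type labeling. It is a plain-Python illustration with hypothetical data and function names; it does not reproduce the scMAGS implementation or its API.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: [B / (k - 1)] / [W / (n - k)].

    B is the between-cluster dispersion, W the within-cluster dispersion,
    n the number of points and k the number of clusters.
    """
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")
    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum(
            (centroid[d] - overall[d]) ** 2 for d in range(dim)
        )
        within += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical expression of two candidate genes across four cells.
cell_types = [0, 0, 1, 1]
informative_gene = [[0.0], [0.2], [5.0], [5.2]]    # separates the two types
uninformative_gene = [[1.0], [3.0], [1.2], [3.2]]  # does not

print(calinski_harabasz(informative_gene, cell_types))
print(calinski_harabasz(uninformative_gene, cell_types))
```

A gene whose expression separates the cell types yields a much higher index than one that does not, which is the property a marker gene selection procedure can exploit.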
Affiliation(s)
- Yusuf Baran
- Department of Biomedical Engineering, Inonu University, Malatya, Turkey
| | - Berat Doğan
- Department of Biomedical Engineering, Inonu University, Malatya, Turkey.
| |
|
36
|
Moon S, Kim HJ, Lee Y, Lee YJ, Jung S, Lee JS, Hahn SH, Kim K, Roh JY, Nam S. Oncogenic signaling pathways and hallmarks of cancer in Korean patients with acral melanoma. Comput Biol Med 2023; 154:106602. [PMID: 36716688 DOI: 10.1016/j.compbiomed.2023.106602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 12/23/2022] [Accepted: 01/22/2023] [Indexed: 01/25/2023]
Abstract
Acral melanoma (AM), a rare subtype of cutaneous melanoma, shows higher incidence in Asians, including Koreans, than in Caucasians. However, the genetic modification associated with AM in Koreans is not well known and has not been comprehensively investigated in terms of oncogenic signaling and hallmarks of cancer. We performed whole-exome and RNA sequencing for Korean patients with AM and acquired the genetic alterations and gene expression profiles. KIT alterations (previously known to be recurrent alterations in AM) and CDK4/CCND1 copy number amplifications were identified in the patients. Genetic and transcriptomic alterations in patients with AM functionally converged on the hallmarks of cancer and oncogenic pathways, including 'proliferative signal persistence', 'apoptotic resistance', and 'activation of invasion and metastasis', despite the heterogeneous somatic mutation profiles of Korean patients with AM. This study may provide a molecular understanding that informs therapeutic strategies for AM.
Affiliation(s)
- SeongRyeol Moon
- Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon, 21999, South Korea
| | - Hee Joo Kim
- Department of Dermatology, Gachon University Gil Medical Center, Gachon University College of Medicine, Incheon, 21565, South Korea
| | - Yeeun Lee
- Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon, 21999, South Korea
| | - Yu Joo Lee
- Department of Genome Medicine and Science, Gachon Institute of Genome Medicine and Science, Gachon University Gil Medical Center, Gachon University College of Medicine, Incheon, 21565, South Korea
| | - Sungwon Jung
- Department of Genome Medicine and Science, Gachon Institute of Genome Medicine and Science, Gachon University Gil Medical Center, Gachon University College of Medicine, Incheon, 21565, South Korea
| | - Jin Sook Lee
- Department of Genome Medicine and Science, Gachon Institute of Genome Medicine and Science, Gachon University Gil Medical Center, Gachon University College of Medicine, Incheon, 21565, South Korea; Department of Pediatrics, Seoul National University Hospital Child Cancer and Rare Disease Administration, Seoul National University Children's Hospital, Seoul, 03080, South Korea
| | - Si Houn Hahn
- Division of Genetic Medicine, Department of Pediatrics, University of Washington School of Medicine, Seattle Children's Hospital, Seattle, WA, 98105, USA
| | | | - Joo Young Roh
- Department of Dermatology, Ewha Womans University College of Medicine, Seoul Hospital, Seoul, 07804, South Korea.
| | - Seungyoon Nam
- Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon, 21999, South Korea; Department of Genome Medicine and Science, Gachon Institute of Genome Medicine and Science, Gachon University Gil Medical Center, Gachon University College of Medicine, Incheon, 21565, South Korea; AI Convergence Center for Medical Science, Gachon University College of Medicine, Incheon, 21565, South Korea.
| |
Collapse
|
37
|
Huang H, Cai X, Lin J, Wu Q, Zhang K, Lin Y, Liu B, Lin J. A novel five-gene metabolism-related risk signature for predicting prognosis and immune infiltration in endometrial cancer: A TCGA data mining. Comput Biol Med 2023; 155:106632. [PMID: 36805217 DOI: 10.1016/j.compbiomed.2023.106632] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2022] [Revised: 01/01/2023] [Accepted: 02/04/2023] [Indexed: 02/16/2023]
Abstract
BACKGROUND: Metabolic dysfunction can affect the biological behavior of tumor cells and contribute to carcinogenesis and the development of various cancers. However, few studies have examined the prognostic value of metabolism-related gene signatures in endometrial cancer (EC) or their ability to predict the efficacy of immunotherapy. This research aims to construct a predictive metabolism-related gene signature in EC with prognostic and therapeutic implications. METHODS: We downloaded the RNA profiles and clinical data of 503 EC patients from The Cancer Genome Atlas (TCGA) database and screened for differentially expressed metabolism-related genes that influence EC prognosis. We first established a metabolism-related gene model using univariate and multivariate Cox regression and Lasso regression analysis. To internally validate the predictive model, the 503 samples (entire set) were randomly assigned to a train set and a test set. Then, we applied receiver operating characteristic (ROC) curves to confirm the predictive model and constructed a nomogram integrating the risk score and clinicopathological features. We employed gene set enrichment analysis (GSEA) to explore the biological processes and pathways associated with the model. Afterward, we used ESTIMATE to evaluate the tumor microenvironment (TME), and CIBERSORT and ssGSEA to estimate the fraction of infiltrating immune cells and immune function. Finally, we investigated the relationship between the predictive model and immune checkpoint genes. RESULTS: We constructed a predictive model based on five metabolism-related genes (INPP5K, PLPP2, MBOAT2, DDC, and ITPKA). This model predicted EC patients' prognosis accurately and performed well in the train set, test set, and entire set. We then confirmed that the predictive signature was a novel independent prognostic factor in EC patients. In addition, we drew and validated a nomogram to predict the survival rate of EC patients at 1, 3, and 5 years (1-year ROC = 0.714, 3-year ROC = 0.750, 5-year ROC = 0.767). Furthermore, GSEA revealed that the cell cycle, certain malignant tumors, and cell metabolism were the main biological functions enriched in this model. The five-gene metabolism-related signature was associated with immune infiltrating cells and immune functions. Most importantly, it was linked with specific immune checkpoints (PD-1, CTLA4, and CD40) that could predict the clinical response to immunotherapy. CONCLUSION: The metabolism-related gene signature (INPP5K, PLPP2, MBOAT2, DDC, and ITPKA) is a valuable index for predicting survival outcomes and the efficacy of immunotherapy for EC in clinical settings.
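A gene-signature risk score of this kind is usually a weighted sum of expression values, with the weights taken from the fitted Cox model. The sketch below illustrates that calculation and a median split into risk groups. The coefficient values are hypothetical placeholders (the abstract does not report the fitted ones), and the median cutoff is an assumed, commonly used convention rather than something the abstract states.

```python
from statistics import median

# Hypothetical coefficients for illustration only; not the published values.
COEFFICIENTS = {
    "INPP5K": -0.30,
    "PLPP2": 0.25,
    "MBOAT2": 0.20,
    "DDC": 0.15,
    "ITPKA": 0.35,
}

def risk_score(expression):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

def stratify(patients):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}

# Toy cohort: every signature gene has the same expression value per patient.
patients = {
    "p1": {gene: 1.0 for gene in COEFFICIENTS},
    "p2": {gene: 0.0 for gene in COEFFICIENTS},
    "p3": {gene: 2.0 for gene in COEFFICIENTS},
}
groups = stratify(patients)
```

With these placeholder weights, the patient with the highest weighted expression (`p3`) is the only one above the median and is labeled high risk.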
Affiliation(s)
- Huaqing Huang
- Department of Pain Medicine, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian Province, China; Pain Research Institute of Fujian Medical University, Fuzhou, Fujian Province, China
- Xintong Cai
- Department of Gynecology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian Province, China
- Jiexiang Lin
- Shengli Clinical Medical College, Fujian Medical University, Fuzhou, China
- Qiaoling Wu
- Department of Gynecology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian Province, China
- Kailin Zhang
- Department of Pathology, Fujian Medical University Union Hospital, Fuzhou, China
- Yibin Lin
- Department of Gynecology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian Province, China
- Bin Liu
- Department of Gynecology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian Province, China
- Jie Lin
- Department of Gynecology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian Province, China.
|
38
|
Marjit S, Bhattacharyya T, Chatterjee B, Sarkar R. Simulated annealing aided genetic algorithm for gene selection from microarray data. Comput Biol Med 2023; 158:106854. [PMID: 37023541 DOI: 10.1016/j.compbiomed.2023.106854] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Revised: 02/26/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023]
Abstract
In recent times, microarray gene expression datasets have gained significant popularity because they help identify different types of cancer directly through bio-markers. These datasets have a high gene-to-sample ratio and high dimensionality, with only a few genes functioning as bio-markers. Consequently, a significant amount of the data is redundant, and it is essential to select the important genes carefully. In this paper, we propose the Simulated Annealing aided Genetic Algorithm (SAGA), a meta-heuristic approach to identify informative genes from high-dimensional datasets. SAGA combines a two-way mutation-based Simulated Annealing (SA) with a Genetic Algorithm (GA) to ensure a good trade-off between exploitation and exploration of the search space, respectively. The naive version of GA often gets stuck in a local optimum and depends on the initial population, leading to premature convergence. To address this, we blend a clustering-based population generation with SA to distribute the initial population of GA over the entire feature space. To further enhance performance, we reduce the initial search space with a score-based filter approach called the Mutually Informed Correlation Coefficient (MICC). The proposed method is evaluated on 6 microarray and 6 omics datasets. A comparison with contemporary algorithms shows that SAGA performs much better than its peers. Our code is available at https://github.com/shyammarjit/SAGA.
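As a rough illustration of how simulated annealing searches over gene subsets, the following sketch anneals a boolean selection mask using a single bit-flip mutation and the Metropolis acceptance rule. It is not SAGA itself: the bit-flip move (instead of the paper's two-way mutation), the cooling schedule, and the toy fitness function are assumptions chosen to keep the example self-contained.

```python
import math
import random

def anneal(fitness, n_genes, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Maximize fitness(mask) over boolean gene-selection masks."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_genes)]
    best = current[:]
    temperature = t0
    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(n_genes)
        candidate[i] = not candidate[i]  # flip one gene in or out
        delta = fitness(candidate) - fitness(current)
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(delta / T), which shrinks as T cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
        if fitness(current) > fitness(best):
            best = current[:]
        temperature *= cooling
    return best

# Hypothetical toy objective: reward two "informative" genes, penalize size.
INFORMATIVE = {1, 4}

def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)

selected = anneal(toy_fitness, n_genes=8)
```

In a real gene-selection setting the fitness would typically be a cross-validated classifier score minus a penalty on subset size. The early high-temperature phase explores the search space, and the late low-temperature phase behaves like hill climbing.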
|
39
|
Rather AA, Chachoo MA. Robust correlation estimation and UMAP assisted topological analysis of omics data for disease subtyping. Comput Biol Med 2023; 155:106640. [PMID: 36774889 DOI: 10.1016/j.compbiomed.2023.106640] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2022] [Revised: 01/08/2023] [Accepted: 02/05/2023] [Indexed: 02/10/2023]
Abstract
Deciphering information hidden in gene expression assays to identify disease subtypes is of significant importance in precision medicine. However, computational limitations hinder this process because of the intricacy of biological networks and the curse of dimensionality of gene expression data. Clustering is therefore often the first choice of exploratory data analysis for identifying natural structures and intrinsic patterns in the data. However, the sparse and high-dimensional nature of omics data prevents conventional clustering algorithms from discovering subtypes that are clinically relevant and statistically significant. Hence, coupling non-linear dimensionality reduction techniques with clustering often becomes imperative to improve the clustering results. In this study, we present a robust pipeline to discover disease subtypes with clinical relevance. Specifically, we focus on discovering patient sub-groups whose residual life patterns are remarkably different from those of other sub-groups. This is significant because, by refining prognosis, subtyping can reduce uncertainty in estimating patients' expected outcomes. The methodology presented is based on robust correlation estimation, UMAP (a non-linear dimensionality reduction method), and Mapper (a tool from topology). Notably, we suggest a method for improving the robustness of the correlation matrix of gene expression data in order to improve the clustering results. The performance of the model is evaluated on five cancer datasets obtained through TCGA, and comparisons are made with the state-of-the-art methods NEMO, RSC-OTRI and SNF with regard to the log-rank test and Restricted Life Expectancy Difference. For example, in the GBM dataset, the minimum separation between any two discovered subtypes is 221 days, which is significantly higher than for the other methodologies. We also compared the results obtained without the robust correlation-based estimate and observed that robust correlation significantly improves separability between survival curves. From the results we infer that our methodology separates the survival curves of patient sub-groups better than the other methodologies, despite using single-omics profiles of patients whereas SNF and NEMO use multi-omics profiles. Pathway over-representation analysis is performed on the final clustering results to investigate the biological underpinnings characterizing each subtype.
Affiliation(s)
- Arif Ahmad Rather
- Department of Computer Sciences, University of Kashmir, Srinagar, JK, India.
|
40
|
Zafari N, Bathaei P, Velayati M, Khojasteh-Leylakoohi F, Khazaei M, Fiuji H, Nassiri M, Hassanian SM, Ferns GA, Nazari E, Avan A. Integrated analysis of multi-omics data for the discovery of biomarkers and therapeutic targets for colorectal cancer. Comput Biol Med 2023; 155:106639. [PMID: 36805214 DOI: 10.1016/j.compbiomed.2023.106639] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Revised: 01/14/2023] [Accepted: 02/05/2023] [Indexed: 02/12/2023]
Abstract
The considerable burden of colorectal cancer and its rising incidence in young adults emphasize the necessity of understanding its underlying mechanisms, providing new diagnostic and prognostic markers, and improving therapeutic approaches. Precision medicine is a growing trend worldwide, and the identification of novel biomarkers and therapeutic targets is a step towards it. In this context, multi-omics data and integrated analysis are being investigated to develop personalized medicine in the management of colorectal cancer. Given the large amount of data generated by multi-omics approaches, data integration and analysis are a great challenge. In this Review, we summarize how statistical and machine learning techniques are applied to analyze multi-omics data and how they contribute to the discovery of useful diagnostic and prognostic biomarkers and therapeutic targets. Moreover, we discuss the importance of these biomarkers and therapeutic targets for the future clinical management of colorectal cancer. Taken together, integrated analysis of multi-omics data has great potential for finding novel diagnostic and prognostic biomarkers and therapeutic targets; however, there are still challenges to overcome in future studies.
Affiliation(s)
- Nima Zafari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Parsa Bathaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mahla Velayati
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Fatemeh Khojasteh-Leylakoohi
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Majid Khazaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Hamid Fiuji
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammadreza Nassiri
- Recombinant Proteins Research Group, The Research Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
- Seyed Mahdi Hassanian
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Gordon A Ferns
- Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex, BN1 9PH, UK
- Elham Nazari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
- Amir Avan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
|
41
|
He H, Duo H, Hao Y, Zhang X, Zhou X, Zeng Y, Li Y, Li B. Computational drug repurposing by exploiting large-scale gene expression data: Strategy, methods and applications. Comput Biol Med 2023; 155:106671. [PMID: 36805225 DOI: 10.1016/j.compbiomed.2023.106671] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Revised: 02/05/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023]
Abstract
De novo drug development is an extremely complex, time-consuming and costly task. Urgent needs for therapies for various diseases have greatly accelerated the search for more effective drug development methods. Drug repurposing provides a new and effective perspective on disease treatment. Rapidly growing large-scale transcriptome data paint a detailed picture of gene expression during disease onset and have therefore received wide attention in the field of computational drug repurposing. However, how to efficiently mine transcriptome data and identify new indications for old drugs remains a critical challenge. This review discusses the irreplaceable role of transcriptome data in computational drug repurposing and summarizes some representative databases, tools and strategies. More importantly, it proposes a practical guideline by establishing the correspondence between three gene expression data types and five strategies, which should help researchers adopt appropriate strategies to mine large-scale transcriptome data in depth and discover more effective therapies.
Affiliation(s)
- Hao He
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai, 200032, PR China
- Hongrui Duo
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Xiaoxi Zhang
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Xinyi Zhou
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Yujie Zeng
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Yinghong Li
- The Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, PR China
- Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China.
|
42
|
Yan TC, Yue ZX, Xu HQ, Liu YH, Hong YF, Chen GX, Tao L, Xie T. A systematic review of state-of-the-art strategies for machine learning-based protein function prediction. Comput Biol Med 2023; 154:106446. [PMID: 36680931 DOI: 10.1016/j.compbiomed.2022.106446] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2022] [Revised: 12/07/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
New drug discovery is inseparable from the discovery of drug targets, and the vast majority of the known targets are proteins. At the same time, proteins are essential structural and functional elements of living cells necessary for the maintenance of all forms of life. Therefore, protein functions have become the focus of many pharmacological and biological studies. Traditional experimental techniques are no longer adequate for rapidly growing annotation of protein sequences, and approaches to protein function prediction using computational methods have emerged and flourished. A significant trend has been to use machine learning to achieve this goal. In this review, approaches to protein function prediction based on the sequence, structure, protein-protein interaction (PPI) networks, and fusion of multi-information sources are discussed. The current status of research on protein function prediction using machine learning is considered, and existing challenges and prominent breakthroughs are discussed to provide ideas and methods for future studies.
Affiliation(s)
- Tian-Ci Yan
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Zi-Xuan Yue
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
- Tian Xie
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
|
43
|
Wang T, Sun J, Zhao Q. Investigating cardiotoxicity related with hERG channel blockers using molecular fingerprints and graph attention mechanism. Comput Biol Med 2023; 153:106464. [PMID: 36584603 DOI: 10.1016/j.compbiomed.2022.106464] [Citation(s) in RCA: 110] [Impact Index Per Article: 110.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2022] [Revised: 12/12/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
Human ether-a-go-go-related gene (hERG) channel blockade by small molecules is a major concern during drug development in the pharmaceutical industry. Failure or inhibition of hERG channel activity caused by drug molecules can prolong the QT interval, which can result in serious cardiotoxicity. However, evaluating the hERG blocking activity of all such small-molecule compounds is technically challenging, and the relevant procedures are expensive and time-consuming. In this study, we develop a novel deep learning predictive model named DMFGAM for predicting hERG blockers. To characterize each molecule more comprehensively, we first fuse multiple molecular fingerprints to form the final molecular fingerprint features. Then, we use a multi-head attention mechanism to extract molecular graph features. The molecular fingerprint features and molecular graph features are fused as the final features of each compound to make its feature representation more comprehensive. Finally, the molecules are classified as hERG blockers or hERG non-blockers by a fully connected neural network. We conduct a 5-fold cross-validation experiment to evaluate the performance of DMFGAM and verify its robustness on external validation datasets. We believe DMFGAM can serve as a powerful tool to predict hERG channel blockers in the early stages of drug discovery and development.
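The 5-fold cross-validation protocol mentioned in this abstract can be sketched with the standard library alone. The generator below shuffles sample indices once and yields five train/test partitions in which every sample is tested exactly once. The function name `k_fold_indices` and the fixed seed are illustrative assumptions, and the authors' actual split procedure (for example, whether it was stratified by class) is not stated in the abstract.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Usage: 23 hypothetical compounds split into 5 folds.
splits = list(k_fold_indices(23, k=5))
```

A model would be trained on each `train` list and scored on the matching `test` list, with the five scores averaged to estimate performance.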
Affiliation(s)
- Tianyi Wang
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianqiang Sun
- School of Automation and Electrical Engineering, Linyi University, Linyi, 276000, China
- Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
|
44
|
Cheng N, Liu J, Chen C, Zheng T, Li C, Huang J. Prediction of lung cancer metastasis by gene expression. Comput Biol Med 2023; 153:106490. [PMID: 36638618 DOI: 10.1016/j.compbiomed.2022.106490] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2022] [Revised: 12/14/2022] [Accepted: 12/27/2022] [Indexed: 12/31/2022]
Abstract
Tumor metastasis is the main cause of death in cancer patients. Early prediction of tumor metastasis can allow for timely intervention. At present, research on tumor metastasis mainly focuses on manual diagnosis by imaging or diagnosis by computational methods. As a tumor deteriorates, gene expression levels in blood change greatly, so it is feasible to measure the transcripts of key genes to predict whether a cancer will metastasize. In this paper, we obtained gene expression data for 226 patients from TCGA. These data included 239,322 transcripts. Background screening and LASSO analysis were used to select 31 transcripts as features. Finally, a deep neural network (DNN) was used to predict whether or not lung cancer would metastasize. We compared our method with several other methods and found that it achieved the best precision. In addition, in a previous study, we identified 7 genes that play a vital role in lung cancer. We added the transcripts of those genes to the DNN and found that the AUC and AUPR of the model increased.
Affiliation(s)
- Nitao Cheng
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Junliang Liu
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Chen Chen
- Department of Biological Repositories, Zhongnan Hospital of Wuhan University, China
- Tang Zheng
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China.
|
45
|
Huang P, Yan L, Li Z, Zhao S, Feng Y, Zeng J, Chen L, Huang A, Chen Y, Lei S, Huang X, Deng Y, Xie D, Guan H, Peng W, Yu L, Chen B. Potential shared gene signatures and molecular mechanisms between atherosclerosis and depression: Evidence from transcriptome data. Comput Biol Med 2023; 152:106450. [PMID: 36565484 DOI: 10.1016/j.compbiomed.2022.106450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 12/09/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND: Atherosclerosis and depression contribute to each other; however, the mechanisms linking them at the genetic level remain unexplored. This study aimed to identify shared gene signatures and related pathways between these comorbidities. METHODS: Atherosclerosis-related datasets were downloaded from the Gene Expression Omnibus database. Differential expression and weighted gene co-expression network analyses were employed to identify atherosclerosis-related genes. Depression-related genes were downloaded from the DisGeNET database, and the overlap between atherosclerosis-related genes and depression-related genes was defined as the set of crosstalk genes. Functional enrichment and protein-protein interaction network analyses were performed on these gene sets. Subsequently, the Boruta algorithm and the Recursive Feature Elimination algorithm were used to identify feature-selection genes. A support vector machine was constructed to measure the accuracy of the calculations, and two external validation sets were included to verify the results. RESULTS: Based on two atherosclerosis-related datasets (GSE28829 and GSE43292), 165 genes were determined to be atherosclerosis-related genes. Meanwhile, 1478 depression-related genes were obtained. After intersecting, 24 crosstalk genes were identified, and two pathways, "lipid and atherosclerosis" and "tryptophan metabolism," were revealed as mutual pathways according to the enrichment analysis results. Through the protein-protein interaction network, the Molecular Complex Detection plugin, and the cytoHubba plugin, PTPRC and MMP9 were identified as hub genes. Moreover, SLC22A3, CASP1, AMPD3, and PIK3CG were recognized as feature-selection genes. Based on two external validation sets, CASP1 and MMP9 were finally determined to be the critical crosstalk genes. CONCLUSIONS: "Lipid and atherosclerosis" and "tryptophan metabolism" were possibly the pathways of atherosclerosis secondary to depression and depression due to atherosclerosis, respectively. CASP1 and MMP9 were revealed as the most pivotal candidates linking atherosclerosis and depression by mediating these two pathways. Further experimentation is needed to confirm these conclusions.
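The crosstalk step described in this abstract is, at its core, a set intersection between two disease-related gene lists. A minimal sketch follows. The short input lists are placeholders, not the study's 165 and 1478 genes: only CASP1 and MMP9 are taken from the abstract, and the `GENE_*` names are hypothetical.

```python
def crosstalk_genes(disease_a_genes, disease_b_genes):
    """Return the genes shared by two disease-related lists, sorted by name."""
    return sorted(set(disease_a_genes) & set(disease_b_genes))

# Placeholder inputs for illustration only.
atherosclerosis_related = ["CASP1", "MMP9", "GENE_X1", "GENE_X2"]
depression_related = ["MMP9", "CASP1", "GENE_Y1"]

shared = crosstalk_genes(atherosclerosis_related, depression_related)
```

Converting to sets first removes duplicate symbols, and sorting gives a stable order for downstream enrichment or network analysis.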
Affiliation(s)
- Peiying Huang
- The Second Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou, China
- Li Yan
- Department of Neurosurgery of Shenyang Second Hospital of Traditional Chinese Medicine, Shenyang, China
- Zhishang Li
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Shuai Zhao
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Yuchao Feng
- Guangdong Provincial Key Laboratory of Research on Emergency in Traditional Chinese Medicine, Clinical Research Team of Prevention and Treatment of Cardiac Emergencies with Traditional Chinese Medicine, Guangzhou, China
- Jing Zeng
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Li Chen
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Afang Huang
- Departments of Laboratory Medicine of Foshan Forth People's Hospital, Foshan, China
- Yan Chen
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Sisi Lei
- The Second Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou, China
- Xiaoyan Huang
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Yi Deng
- Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China
- Dan Xie
- The Second Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou, China
- Hansu Guan
- The Third Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Weihang Peng
- The Second Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou, China
- Liyuan Yu
- The Second Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou, China
- Bojun Chen
- The Second Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou, China; Emergency Department of Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China; Guangdong Provincial Key Laboratory of Research on Emergency in Traditional Chinese Medicine, Clinical Research Team of Prevention and Treatment of Cardiac Emergencies with Traditional Chinese Medicine, Guangzhou, China.
|
46
|
Xiang J, Wang X, Wang X, Zhang J, Yang S, Yang W, Han X, Liu Y. Automatic diagnosis and grading of Prostate Cancer with weakly supervised learning on whole slide images. Comput Biol Med 2023; 152:106340. [PMID: 36481762 DOI: 10.1016/j.compbiomed.2022.106340] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2022] [Revised: 11/02/2022] [Accepted: 11/16/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND The workflow of prostate cancer diagnosis and grading is cumbersome, and the results suffer from substantial inter-observer variability. Recent trials have shown the potential of machine learning for developing automated systems to address this challenge. Most automated deep learning systems for prostate cancer Gleason grading have focused on supervised learning, which requires demanding fine-grained pixel-level annotations. METHODS A weakly supervised deep learning model trained with slide-level labels is presented in this study for the diagnosis and grading of prostate cancer from whole slide images (WSIs). WSIs are first cropped into small patches and then processed with a deep learning model to extract patch-level features. A graph convolution network (GCN) is used to aggregate the features for classification. Throughout the training process, noisy labels are progressively filtered out to reduce the effect of inter-observer variation in clinical reports. Finally, multi-center independent test cohorts with 6,174 slides are collected to evaluate the prostate cancer diagnosis and grading performance of the model. RESULTS The cancer diagnosis (2-level classification) results on two external test sets (n = 4,675 and n = 844) show areas under the receiver operating characteristic curve (AUC) of 0.985 and 0.986. The Gleason grading (6-level classification) results reach a quadratic weighted kappa of 0.931 on the internal test set (n = 531). The model generalizes well to the external test dataset (n = 844), with a quadratic weighted kappa of 0.801 against an independently set reference standard. The model enables pathologically meaningful interpretability by visualizing the most attended lesions, which are highly consistent with expert annotations. CONCLUSION The proposed model incorporates a graph network in weakly supervised learning with only slide-level reports. A robust learning strategy is also employed to correct the label noise. It is highly accurate (>0.985 AUC for diagnosis) and also interpretable with intuitive heatmap visualization. It can be unified with a digital pathology pipeline to deliver prostate cancer metrics for a pathology report.
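The grading results above are reported as quadratic weighted kappa. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition (Cohen's kappa with quadratic disagreement weights) in plain Python. It is not the authors' evaluation code; the function name, its arguments, and the integer label encoding (0 to num_classes - 1) are assumptions made for illustration.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Cohen's kappa with quadratic weights.

    rater_a, rater_b: equal-length lists of integer labels in [0, num_classes).
    The label encoding is an assumption of this sketch.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B
    denom = (num_classes - 1) ** 2

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Disagreement weight grows with the squared distance between grades.
            weight = (i - j) ** 2 / denom
            weighted_observed += weight * observed[(i, j)] / n
            # Expected proportion under independence of the two raters.
            weighted_expected += weight * hist_a[i] * hist_b[j] / (n * n)

    if weighted_expected == 0:
        # Both raters assigned the same single class to every item.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

For a 6-level grading task, such a function would be called with `num_classes=6`. Identical label lists give a kappa of 1.0, and larger grade discrepancies are penalized more heavily than adjacent-grade disagreements.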
Affiliation(s)
- Xiyue Wang
- College of Computer Science, Sichuan University, Chengdu, China
- Xinran Wang
- Department of Pathology, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
- Sen Yang
- AI Lab, Tencent, Shenzhen, China
- Wei Yang
- AI Lab, Tencent, Shenzhen, China
- Xiao Han
- AI Lab, Tencent, Shenzhen, China
- Yueping Liu
- Department of Pathology, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
|
47
|
Zhao Y, Zhang L, Hu Q, Zhu D, Xie Z. Identification and analysis of C17orf53 as a prognostic signature for hepatocellular carcinoma. Comput Biol Med 2023; 152:106348. [PMID: 36470143 DOI: 10.1016/j.compbiomed.2022.106348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2022] [Revised: 10/28/2022] [Accepted: 11/22/2022] [Indexed: 11/30/2022]
Abstract
C17orf53 is a novel gene involved in DNA synthesis and homologous recombination. However, the exact role of C17orf53 in hepatocellular carcinoma (HCC) remains unclear. In this study, we analyzed it using a set of public databases and tools: UALCAN, Human Protein Atlas (HPA), Kaplan‒Meier Plotter, Tumor Immune Estimation Resource (TIMER), cBioPortal, GEPIA, GeneMANIA, and LinkedOmics. Functional analysis was conducted in SK-Hep-1 cells by using small interfering RNA (siRNA). C17orf53 was highly expressed and predicted unfavorable survival in HCC patients. Moreover, it showed positive correlations with the abundance of B cells, macrophages and dendritic cells. In addition, we identified 126 genes that were positively correlated with C17orf53 and its coeffector minichromosome maintenance 8 (MCM8). These genes were mainly enriched in the cell cycle, DNA replication and Fanconi anemia pathways. Knockdown of C17orf53 significantly inhibited the proliferation of SK-Hep-1 cells and decreased the expression of MCM8, cyclin D1 and proliferating cell nuclear antigen (PCNA). Overall, C17orf53 is a novel prognostic signature for HCC.
Affiliation(s)
- Yalei Zhao
- Department of Infectious Diseases, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Lingjian Zhang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Qingqing Hu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Danhua Zhu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Zhongyang Xie
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
|
48
|
Liu C, Zhou Y, Zhou Y, Tang X, Tang L, Wang J. Identification of crucial genes for predicting the risk of atherosclerosis with system lupus erythematosus based on comprehensive bioinformatics analysis and machine learning. Comput Biol Med 2023; 152:106388. [PMID: 36470144 DOI: 10.1016/j.compbiomed.2022.106388] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/05/2022] [Revised: 11/22/2022] [Accepted: 11/28/2022] [Indexed: 12/02/2022]
Abstract
BACKGROUND Systemic lupus erythematosus (SLE) has become a major public health problem over the years, and atherosclerosis (AS) is one of the main complications of SLE associated with serious cardiovascular consequences in this patient population. The present study aimed to identify potential biomarkers for SLE patients with AS. METHODS Five microarray datasets (GSE50772, GSE81622, GSE100927, GSE28829, GSE37356) were downloaded from the NCBI Gene Expression Omnibus database. The Limma package was used to identify differentially expressed genes (DEGs) in AS. Weighted gene coexpression network analysis (WGCNA) was used to identify significant module genes associated with SLE. Functional enrichment analysis, protein-protein interaction (PPI) network construction, and machine learning algorithms (least absolute shrinkage and selection operator (Lasso), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and random forest) were applied to identify hub genes. Subsequently, we generated a nomogram and receiver operating characteristic (ROC) curves for predicting the risk of AS in SLE patients. Finally, immune cell infiltrations were analyzed, and Consensus Cluster Analysis was conducted based on Single Sample Gene Set Enrichment Analysis (ssGSEA) scores. RESULTS Five hub genes (SPI1, MMP9, C1QA, CX3CR1, and MNDA) were identified and used to establish a nomogram that yielded a high predictive performance (area under the curve 0.900-0.981). Dysregulated immune cell infiltrations were found in AS, with positive correlations with the five hub genes. Consensus clustering showed that the optimal number of subtypes was 3. Compared to subtypes A and B, subtype C presented higher expression of the five hub genes, immune cell infiltration levels and immune checkpoint expression. CONCLUSION Our study systematically identified five candidate hub genes (SPI1, MMP9, C1QA, CX3CR1, MNDA) and established a nomogram that could predict the risk of AS with SLE using various bioinformatic analyses and machine learning algorithms. Our findings provide a foothold for future studies on potential crucial genes for AS in SLE patients. Additionally, the dysregulated immune cell proportions and immune checkpoint expressions in AS with SLE were identified.
Affiliation(s)
- Chunjiang Liu
- Department of General Surgery, Division of Vascular Surgery, Shaoxing People's Hospital (Shaoxing Hospital of Zhejiang University), Shaoxing, 312000, China
- Yufei Zhou
- Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital and Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Yue Zhou
- Department of General Surgery, Division of Vascular Surgery, Shaoxing People's Hospital (Shaoxing Hospital of Zhejiang University), Shaoxing, 312000, China
- Xiaoqi Tang
- Department of General Surgery, Division of Vascular Surgery, Shaoxing People's Hospital (Shaoxing Hospital of Zhejiang University), Shaoxing, 312000, China
- Liming Tang
- Department of General Surgery, Division of Vascular Surgery, Shaoxing People's Hospital (Shaoxing Hospital of Zhejiang University), Shaoxing, 312000, China
- Jiajia Wang
- Department of Rheumatology, Shaoxing People's Hospital (Shaoxing Hospital of Zhejiang University), Shaoxing, 312000, China
|
49
|
Yue ZX, Yan TC, Xu HQ, Liu YH, Hong YF, Chen GX, Xie T, Tao L. A systematic review on the state-of-the-art strategies for protein representation. Comput Biol Med 2023; 152:106440. [PMID: 36543002 DOI: 10.1016/j.compbiomed.2022.106440] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 12/08/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022]
Abstract
The study of drug-target protein interaction is a key step in drug research. In recent years, machine learning techniques have become attractive for research, including drug research, due to their automated nature, predictive power, and expected efficiency. Protein representation is a key step in the study of drug-target protein interaction by machine learning and plays a fundamental role in achieving accurate results. With the progress of machine learning, protein representation methods have gradually attracted attention and have consequently developed rapidly. Therefore, in this review, we systematically classify current protein representation methods, comprehensively review them, and discuss the latest advances of interest. According to the information extraction methods and information sources, these representation methods are generally divided into structure-based and sequence-based representation methods. Each primary class can be further divided into specific subcategories. The particular representation methods covered include both traditional and the latest approaches. This review contains a comprehensive assessment of the various methods, which researchers can use as a reference for their specific protein-related research requirements, including drug research.
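As a concrete illustration of what a sequence-based representation can look like, the sketch below one-hot encodes a protein sequence over the 20 standard amino acids. One-hot encoding is used here only as a simple, widely known baseline; the abstract does not say which specific methods the review covers. The function name, the alphabet ordering, and the choice to map unknown residues to an all-zero row are assumptions made for illustration.

```python
# The 20 standard amino acids in one-letter code; the ordering is arbitrary
# and is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
RESIDUE_INDEX = {residue: i for i, residue in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return a list of 20-dimensional 0/1 rows, one row per residue.

    Residues outside the standard alphabet (for example 'X') are mapped
    to an all-zero row rather than raising an error.
    """
    encoded = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        index = RESIDUE_INDEX.get(residue)
        if index is not None:
            row[index] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    matrix = one_hot_encode("MKV")
    # One row per residue, one column per amino acid type.
    print(len(matrix), len(matrix[0]))
```

The resulting matrix has one row per residue and can be fed to a downstream model, although it carries no information about residue similarity or structure, which is one motivation for the richer representations such reviews survey.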
Affiliation(s)
- Zi-Xuan Yue
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian-Ci Yan
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian Xie
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
|
50
|
Zhou Y, Zhang Y, Li F, Lian X, Zhu Q, Zhu F, Qiu Y. SISPRO: signature identification for spatial proteomics. J Mol Biol 2023. [DOI: 10.1016/j.jmb.2022.167944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2023]
|