1
Papadimitrakis D, Perdikakis M, Gargalionis AN, Papavassiliou AG. Biomarkers in Cerebrospinal Fluid for the Diagnosis and Monitoring of Gliomas. Biomolecules 2024; 14:801. [PMID: 39062515 PMCID: PMC11274947 DOI: 10.3390/biom14070801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 07/01/2024] [Accepted: 07/03/2024] [Indexed: 07/28/2024] Open
Abstract
Gliomas are the most common type of malignant brain tumor and are characterized by a plethora of heterogeneous molecular alterations. Current treatments require the emergence of reliable biomarkers that will aid personalized treatment decisions and increase life expectancy. Glioma tissues are not as easily accessible as other solid tumors; therefore, detecting prominent biomarkers in biological fluids is necessary. Cerebrospinal fluid (CSF) circulates adjacent to the cerebral parenchyma and holds promise for discovering useful prognostic, diagnostic, and predictive biomarkers. In this review, we summarize extensive research regarding the role of circulating DNA, tumor cells, proteins, microRNAs, metabolites, and extracellular vesicles as potential CSF biomarkers for glioma diagnosis, prognosis, and monitoring. Future studies should address discrepancies and issues of specificity regarding CSF biomarkers, as well as the validation of candidate biomarkers.
Affiliation(s)
- Dimosthenis Papadimitrakis
  - Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Miltiadis Perdikakis
  - Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Antonios N. Gargalionis
  - Laboratory of Clinical Biochemistry, Medical School, ‘Attikon’ University General Hospital, National and Kapodistrian University of Athens, 12462 Athens, Greece
- Athanasios G. Papavassiliou
  - Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece

2
Ding H, Xing F, Zou L, Zhao L. QSAR analysis of VEGFR-2 inhibitors based on machine learning, Topomer CoMFA and molecule docking. BMC Chem 2024; 18:59. [PMID: 38555462 PMCID: PMC10981835 DOI: 10.1186/s13065-024-01165-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 03/12/2024] [Indexed: 04/02/2024] Open
Abstract
VEGFR-2 kinase inhibitors are clinically approved drugs that can effectively target cancer angiogenesis. However, such inhibitors have adverse effects such as skin toxicity, gastrointestinal reactions and hepatic impairment. In this study, machine learning and Topomer CoMFA, which is an alignment-dependent, descriptor-based method, were employed to build structure-activity relationship models of potentially new VEGFR-2 inhibitors. With KNN, the prediction accuracies of the 2D-SAR model on the training and test sets were 82.4% and 80.1%, respectively. The Topomer CoMFA approach was then used for 3D-QSAR modeling of VEGFR-2 inhibitors. The cross-validated coefficient q² of model 1 was greater than 0.5, suggesting that a stable drug activity-prediction model was obtained. Molecular docking was further performed to simulate the interactions between the five most promising compounds and the VEGFR-2 target protein; the Total Scores were all greater than 6, indicating that strong hydrogen bond interactions were present. This study successfully used machine learning to obtain five potentially novel VEGFR-2 inhibitors to increase our arsenal of drugs to combat cancer.
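The q² > 0.5 criterion in this abstract can be illustrated with a minimal sketch. It assumes the common leave-one-out definition q² = 1 − PRESS/SS, where SS is the total sum of squares about the overall mean (some authors use the training-fold mean instead), and it substitutes a hypothetical one-descriptor linear model for the paper's Topomer CoMFA model. The function names and toy values are illustrative only and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS (assumed definition)."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = 0.0
    for i in range(n):
        # Refit the model without sample i, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / total_ss


# Hypothetical descriptor values and activities (toy data, not from the paper).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.0, 3.0, 2.0, 5.0, 4.0]
print(loo_q2(descriptor, activity))
```

Because every prediction is made for a compound the model has not seen, q² is typically lower than the fitted r² and can even be negative for a poor model.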
Affiliation(s)
- Hao Ding
  - Department of Ultrasound, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
- Fei Xing
  - Department of Oncology, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
- Lin Zou
  - Medical College of Guangxi University, Nanning, 530004, Guangxi, China
- Liang Zhao
  - Hepatobiliary and Splenic Surgery Ward, Department of General Surgery, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China

3
Shen Y, Huang J, Jia L, Zhang C, Xu J. Bioinformatics and machine learning driven key genes screening for hepatocellular carcinoma. Biochem Biophys Rep 2024; 37:101587. [PMID: 38107663 PMCID: PMC10724547 DOI: 10.1016/j.bbrep.2023.101587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Revised: 11/01/2023] [Accepted: 11/17/2023] [Indexed: 12/19/2023] Open
Abstract
Liver cancer, a global menace, ranked as the sixth most prevalent and third deadliest cancer in 2020. The challenge of early diagnosis and treatment, especially for hepatocellular carcinoma (HCC), persists due to late-stage detections. Understanding HCC's complex pathogenesis is vital for advancing diagnostics and therapies. This study combines bioinformatics and machine learning, examining HCC comprehensively. Three datasets underwent meticulous scrutiny, employing various analytical tools such as Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, protein interaction assessment, and survival analysis. These rigorous investigations uncovered twelve pivotal genes intricately linked with HCC's pathophysiological intricacies. Among them, CYP2C8, CYP2C9, EPHX2, and ESR1 were significantly positively correlated with overall patient survival, while AKR1B10 and NQO1 displayed a negative correlation. Moreover, the Adaboost prediction model yielded an 86.8% accuracy, showcasing machine learning's potential in deciphering complex dataset patterns for clinically relevant predictions. These findings promise to contribute valuable insights into the elusive mechanisms driving HCC. They hold the potential to guide the development of more precise diagnostic methods and treatment strategies in the future. In the fight against this global health challenge, unraveling HCC's intricacies is of paramount importance.
Affiliation(s)
- Ye Shen
  - Department of Radiology, Wujin Hospital Affiliated with Jiangsu University, Changzhou, 213002, China
- Juanjie Huang
  - Department of General Surgery, Dongguan Qingxi Hospital, Dongguan, 523660, China
- Lei Jia
  - International Health Medicine Innovation Center, Shenzhen University, ShenZhen, 518060, China
- Chi Zhang
  - Huaxia Eye Hospital of Foshan, Huaxia Eye Hospital Group, Foshan, Guangdong, 528000, China
- Jianxing Xu
  - Department of Radiology, Wujin Hospital Affiliated with Jiangsu University, Changzhou, 213002, China
  - Department of Radiology, The Wujin Clinical College of Xuzhou Medical University, Changzhou, 213002, China

4
Godlewski A, Czajkowski M, Mojsak P, Pienkowski T, Gosk W, Lyson T, Mariak Z, Reszec J, Kondraciuk M, Kaminski K, Kretowski M, Moniuszko M, Kretowski A, Ciborowski M. A comparison of different machine-learning techniques for the selection of a panel of metabolites allowing early detection of brain tumors. Sci Rep 2023; 13:11044. [PMID: 37422554 PMCID: PMC10329700 DOI: 10.1038/s41598-023-38243-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Accepted: 07/05/2023] [Indexed: 07/10/2023] Open
Abstract
Metabolomics, combined with machine learning methods (MLMs), is a powerful tool for finding novel diagnostic panels. This study was intended to use targeted plasma metabolomics and advanced MLMs to develop strategies for diagnosing brain tumors. Measurement of 188 metabolites was performed on plasma samples collected from 95 patients with gliomas (grade I-IV), 70 with meningioma, and 71 healthy individuals as a control group. Four predictive models to diagnose glioma were prepared using 10 MLMs and a conventional approach. Based on the cross-validation results of the created models, the F1-scores were calculated, and the obtained values were then compared. Subsequently, the best algorithm was applied to perform five comparisons involving gliomas, meningiomas, and controls. The best results were obtained using the newly developed hybrid evolutionary heterogeneous decision tree (EvoHDTree) algorithm, which was validated using Leave-One-Out Cross-Validation, resulting in F1-scores for all comparisons in the range of 0.476-0.948 and areas under the ROC curve ranging from 0.660 to 0.873. Brain tumor diagnostic panels were constructed with unique metabolites, which reduces the likelihood of misdiagnosis. This study proposes a novel interdisciplinary method for brain tumor diagnosis based on metabolomics and EvoHDTree, exhibiting significant predictive coefficients.
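The two metrics reported in this abstract, the F1-score and the area under the ROC curve, can be sketched as follows. This is a minimal illustration that assumes binary labels (1 = tumor, 0 = control) and that the predictions have already been pooled from the held-out samples of a leave-one-out run. The function names and toy inputs are hypothetical, and the EvoHDTree classifier itself is not reproduced here.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy held-out labels, hard predictions, and scores (hypothetical values).
labels = [1, 1, 0, 0]
predicted = [1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.1]
print(f1_score(labels, predicted), roc_auc(labels, scores))
```

F1 depends on a single decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two ranges reported above need not move together.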
Affiliation(s)
- Adrian Godlewski
  - Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Marcin Czajkowski
  - Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
- Patrycja Mojsak
  - Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Tomasz Pienkowski
  - Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Wioleta Gosk
  - Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Tomasz Lyson
  - Department of Neurosurgery, Medical University of Bialystok, Białystok, Poland
- Zenon Mariak
  - Department of Neurosurgery, Medical University of Bialystok, Białystok, Poland
- Joanna Reszec
  - Department of Medical Pathomorphology, Medical University of Bialystok, Białystok, Poland
- Marcin Kondraciuk
  - Department of Population Medicine and Lifestyle Diseases Prevention, Medical University of Bialystok, Białystok, Poland
- Karol Kaminski
  - Department of Population Medicine and Lifestyle Diseases Prevention, Medical University of Bialystok, Białystok, Poland
- Marek Kretowski
  - Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
- Marcin Moniuszko
  - Department of Regenerative Medicine and Immune Regulation, Medical University of Bialystok, Białystok, Poland
  - Department of Allergology and Internal Medicine, Medical University of Bialystok, Białystok, Poland
- Adam Kretowski
  - Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
  - Department of Endocrinology, Diabetology and Internal Medicine, Medical University of Bialystok, Białystok, Poland
- Michal Ciborowski
  - Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland

5
Lai Q, Liu X, Yang F, Li J, Xie Y, Qin W. Constructing metabolism-protein interaction relationship to identify glioma prognosis using deep learning. Comput Biol Med 2023; 158:106875. [PMID: 37058759 DOI: 10.1016/j.compbiomed.2023.106875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Revised: 03/08/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]
Abstract
Glioma is a heterogeneous disease that requires classification into subtypes with similar clinical phenotypes, prognosis or treatment responses. Metabolic-protein interaction (MPI) can provide meaningful insights into cancer heterogeneity. Moreover, the potential of lipids and lactate for identifying prognostic subtypes of glioma remains relatively unexplored. Therefore, we proposed a method to construct an MPI relationship matrix (MPIRM) based on a triple-layer network (Tri-MPN) combined with mRNA expression, and processed the MPIRM by deep learning to identify glioma prognostic subtypes. Subtypes with significant differences in prognosis were detected in glioma (p-value < 2e-16, 95% CI). These subtypes had a strong correlation in immune infiltration, mutational signatures and pathway signatures. This study demonstrated the effectiveness of node interaction from MPI networks in understanding the heterogeneity of glioma prognosis.
Affiliation(s)
- Qingpei Lai
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Science, 518055, Shenzhen, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, 518055, Shenzhen, China
- Xiang Liu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Science, 518055, Shenzhen, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, 518055, Shenzhen, China
- Fan Yang
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Science, 518055, Shenzhen, China
- Jie Li
  - Department of Infectious Diseases, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, 210008, Nanjing, Jiangsu, China
- Yaoqin Xie
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Science, 518055, Shenzhen, China
- Wenjian Qin
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Science, 518055, Shenzhen, China

6
Ortiz-Vilchis P, De-la-Cruz-García JS, Ramirez-Arellano A. Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach. BIOLOGY 2023; 12:140. [PMID: 36671832 PMCID: PMC9856098 DOI: 10.3390/biology12010140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 01/11/2023] [Accepted: 01/12/2023] [Indexed: 01/18/2023]
Abstract
Protein-protein interactions (PPIs) are the basis for understanding most cellular events in biological systems. Several experimental methods, e.g., biochemical, molecular, and genetic methods, have been used to identify protein-protein associations. However, some of them, such as mass spectrometry, are time-consuming and expensive. Machine learning (ML) techniques have been widely used to characterize PPIs, increasing the number of proteins analyzed simultaneously and optimizing time and resources for identifying and predicting protein-protein functional linkages. Previous ML approaches have focused on well-known networks or specific targets but not on identifying relevant proteins with partial or null knowledge of the interaction networks. The proposed approach aims to generate a relevant protein sequence based on bidirectional Long-Short Term Memory (LSTM) with partial knowledge of interactions. The general framework comprises conducting a scale-free and fractal complex network analysis. The outcome of these analyses is then used to fine-tune the fractal method for the vital protein extraction of PPI networks. The results show that several PPI networks are self-similar or fractal, but that both features cannot coexist. The generated protein sequences (by the bidirectional LSTM) also contain an average of 39.5% of proteins in the original sequence. The average length of the generated sequences was 17% of the original one. Finally, 95% of the generated sequences were true.
Affiliation(s)
- Pilar Ortiz-Vilchis
  - Sección de Estudios de Posgrado e Investigación, Escuela Superior de Medicina, Instituto Politécnico Nacional, Mexico City 11340, Mexico
- Jazmin-Susana De-la-Cruz-García
  - Sección de Estudios de Posgrado e Investigación, Unidad Profesional Interdisciplinaria de Ingeniería y Ciencias Sociales y Administrativas, Instituto Politécnico Nacional, Mexico City 08400, Mexico
- Aldo Ramirez-Arellano
  - Sección de Estudios de Posgrado e Investigación, Unidad Profesional Interdisciplinaria de Ingeniería y Ciencias Sociales y Administrativas, Instituto Politécnico Nacional, Mexico City 08400, Mexico
  - Correspondence: Tel.: +52-552-805-3125

7
Yang L, Zhang YH, Huang F, Li Z, Huang T, Cai YD. Identification of protein–protein interaction associated functions based on gene ontology and KEGG pathway. Front Genet 2022; 13:1011659. [PMID: 36171880 PMCID: PMC9511048 DOI: 10.3389/fgene.2022.1011659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022] Open
Abstract
Protein–protein interactions (PPIs) are extremely important for gaining mechanistic insights into the functional organization of the proteome. The resolution of PPI functions can help in the identification of novel diagnostic and therapeutic targets with medical utility, thus facilitating the development of new medications. However, the traditional methods for resolving PPI functions are mainly experimental methods, such as co-immunoprecipitation, pull-down assays, cross-linking, label transfer, and far-Western blot analysis, that are not only expensive but also time-consuming. In this study, we constructed an integrated feature selection scheme for the large-scale selection of the relevant functions of PPIs by using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations of PPI participants. First, we encoded the proteins in each PPI with their gene ontologies and KEGG pathways. Then, the encoded protein features were refined as features of both positive and negative PPIs. Subsequently, Boruta was used for the initial filtering of features to obtain 5684 features. Three feature ranking algorithms, namely, least absolute shrinkage and selection operator, light gradient boosting machine, and max-relevance and min-redundancy, were applied to evaluate feature importance. Finally, the top-ranked features derived from multiple datasets were comprehensively evaluated, and the intersection of results mined by three feature ranking algorithms was taken to identify the features with high correlation with PPIs. Some functional terms were identified in our study, including cytokine–cytokine receptor interaction (hsa04060), intrinsic component of membrane (GO:0031224), and protein-binding biological process (GO:0005515). Our newly proposed integrated computational approach offers a novel perspective of the large-scale mining of biological functions linked to PPI.
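The final step described in this abstract, taking the intersection of the top-ranked features from three ranking algorithms, can be sketched as below. The rankings are made-up toy lists: the three functional terms are the ones named in the abstract, the `placeholder_*` entries are hypothetical, and neither the orderings nor the cut-off `top_k` reflect the study's actual results.

```python
def consensus_features(rankings, top_k):
    """Return features that appear in the top_k of every ranking.

    `rankings` is a list of feature lists, each ordered from most to least
    important. The result keeps the order of the first ranking.
    """
    if not rankings:
        return []
    top_sets = [set(ranking[:top_k]) for ranking in rankings]
    shared = set.intersection(*top_sets)
    return [feature for feature in rankings[0][:top_k] if feature in shared]


# Hypothetical rankings standing in for LASSO, LightGBM and mRMR outputs.
lasso_rank = ["GO:0005515", "hsa04060", "GO:0031224", "placeholder_a"]
lightgbm_rank = ["hsa04060", "GO:0005515", "placeholder_b", "GO:0031224"]
mrmr_rank = ["GO:0031224", "GO:0005515", "hsa04060", "placeholder_c"]

print(consensus_features([lasso_rank, lightgbm_rank, mrmr_rank], top_k=3))
```

Requiring agreement across all rankers trades recall for robustness: a feature favoured by only one algorithm is dropped even if that algorithm ranks it first.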
Affiliation(s)
- Lili Yang
  - Measurement Biotechnique Research Center, School of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Yu-Hang Zhang
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- FeiMing Huang
  - School of Life Sciences, Shanghai University, Shanghai, China
- ZhanDong Li
  - Measurement Biotechnique Research Center, School of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - *Correspondence: Tao Huang; Yu-Dong Cai
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China

8
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
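As a concrete instance of the network topological properties this review discusses, the sketch below computes degree centrality, one of the simplest centrality measures, on an undirected interaction network. It assumes the usual normalization by n − 1, where n is the number of nodes. The node names are hypothetical, and the review's newer omics-aware centrality measures are not reproduced here.

```python
def degree_centrality(edges):
    """Degree centrality for an undirected graph given as (u, v) edge pairs.

    Each node's score is its number of distinct neighbours divided by n - 1.
    Self-loops are ignored.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set())
        neighbours.setdefault(v, set())
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


# Toy star-shaped network with a hypothetical hub protein "A".
toy_edges = [("A", "B"), ("A", "C"), ("A", "D")]
print(degree_centrality(toy_edges))
```

In gene or protein prioritization, a high score flags a hub candidate, although degree alone ignores edge weights and direction.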
Affiliation(s)
- Gauri Panditrao
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India

9
Zhou L, Wang H. A Combined Feature Screening Approach of Random Forest and Filter-based Methods for Ultra-high Dimensional Data. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220221120618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
Various feature (variable) screening approaches have been proposed in the past decade to mitigate the impact of ultra-high dimensionality in classification and regression problems, including filter-based methods such as sure independence screening and wrapper-based methods such as random forest. However, the former type of method relies heavily on strong modelling assumptions, while the latter requires an adequate sample size to make the data speak for themselves. These requirements can seldom be met in biochemical studies, where we often have access only to ultra-high dimensional data with a complex structure and a small number of observations.
Objective:
In this research, we investigate the possibility of combining filter-based screening methods and random forest based screening methods in the regression context.
Method:
We have combined four state-of-the-art filter approaches from the statistical community, namely sure independence screening (SIS), robust rank correlation based screening (RRCS), high dimensional ordinary least squares projection (HOLP) and a model-free sure independence screening procedure based on the distance correlation (DCSIS), with a random forest based Boruta screening method from the machine learning community for regression problems.
Result:
Among all combined methods, RF-DCSIS performs better than the other methods in terms of screening accuracy and prediction capability on the simulated scenarios and real benchmark datasets.
Conclusion:
Through an empirical study using both extensive simulation and real data, we have shown that filter-based screening and random forest based screening each have their pros and cons, while a combination of both may lead to a better feature screening result and prediction capability.
Keywords:
feature screening, filter-based method, ultra-high dimensional data, variable selection, random forest, RF-DCSIS
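Of the filter approaches listed in this abstract, sure independence screening (SIS) is the simplest to sketch: each feature is scored by the absolute value of its marginal Pearson correlation with the response, and the `d` highest-scoring features are kept. The code below is a minimal illustration under that assumption. The function names and the toy matrix are hypothetical, and the paper's combination with the random forest based Boruta step is not shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(X, y, d):
    """Keep the d features with the largest |marginal correlation| with y.

    X is a list of rows (samples); returns the kept column indices, sorted.
    """
    n_features = len(X[0])
    scored = []
    for j in range(n_features):
        column = [row[j] for row in X]
        scored.append((abs(pearson(column, y)), j))
    # Highest score first; ties are broken by the lower column index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return sorted(j for _, j in scored[:d])


# Toy data: columns 0 and 2 track y (positively and negatively), column 1 does not.
X_toy = [[1, 2, 5], [2, 1, 4], [3, 2, 3], [4, 1, 2], [5, 2, 1]]
y_toy = [1, 2, 3, 4, 5]
print(sis_screen(X_toy, y_toy, d=2))
```

Because each feature is scored in isolation, this filter is cheap in ultra-high dimensions but can miss features that matter only jointly, which is part of the motivation the abstract gives for pairing it with a random forest based screen.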
Affiliation(s)
- Lifeng Zhou
  - School of Economics and Management, Changsha University, China
- Hong Wang
  - School of Mathematics and Statistics, Central South University, China

10
Qin G, Du L, Ma Y, Yin Y, Wang L. Gene biomarker prediction in glioma by integrating scRNA-seq data and gene regulatory network. BMC Med Genomics 2021; 14:287. [PMID: 34863158 PMCID: PMC8643020 DOI: 10.1186/s12920-021-01115-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/27/2020] [Accepted: 11/01/2021] [Indexed: 12/22/2022] Open
Abstract
Background: Although great efforts have been made to study the occurrence and development of glioma, the molecular mechanisms of glioma are still unclear. Single-cell sequencing technology provides a new perspective for researchers to explore the pathogenesis of tumors and to further help make treatment and prognosis decisions for patients with tumors. Methods: In this study, we proposed an algorithm framework to explore the molecular mechanisms of glioma by integrating single-cell gene expression profiles and gene regulatory relations. First, since there were great differences among malignant cells from different glioma samples, we analyzed the expression status of malignant cells for each sample, and then tumor consensus genes were identified by constructing and analyzing cell-specific networks. Second, to comprehensively analyze the characteristics of glioma, we integrated transcriptional regulatory relationships and consensus genes to construct a tumor-specific regulatory network. Third, we performed a hybrid clustering analysis to identify glioma cell types. Finally, candidate tumor gene biomarkers were identified based on cell types and known glioma-related genes. Results: Using the proposed method, we identified six cell types, for which we performed functional and biological pathway enrichment analyses. The candidate tumor gene biomarkers were analyzed through survival analysis and verified using literature from PubMed. Conclusions: The results showed that these candidate tumor gene biomarkers were closely related to glioma and could provide clues for the diagnosis and prognosis of patients with glioma. In addition, we found that four of the candidate tumor gene biomarkers (NDUFS5, NDUFA1, NDUFA13, and NDUFB8) belong to the NADH ubiquinone oxidoreductase subunit gene family, so we inferred that this gene family may be strongly related to glioma.
Affiliation(s)
- Guimin Qin
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Longting Du
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Yuying Ma
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Yu Yin
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Liming Wang
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China

11
Wu Y, Guo Y, Ma J, Sa Y, Li Q, Zhang N. Research Progress of Gliomas in Machine Learning. Cells 2021; 10:3169. [PMID: 34831392 PMCID: PMC8622230 DOI: 10.3390/cells10113169] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/04/2021] [Revised: 11/04/2021] [Accepted: 11/05/2021] [Indexed: 12/29/2022] Open
Abstract
In the field of glioma research, the broad availability of genetic and image information generated by computer technologies and the boom in biomedical publications have led to the advent of the big-data era. Machine learning methods have been applied as possible approaches to speed up data mining. In this article, we reviewed the present situation and future directions of machine learning applications in gliomas within the context of workflows that integrate analysis for precision cancer care. Publicly available tools and algorithms for key machine learning technologies used in literature mining for glioma clinical research were reviewed and compared. Further, the existing machine learning solutions and their limitations in glioma prediction and diagnostics, such as overfitting and class imbalance, were critically analyzed.

12
Abstract
Amongst the several types of brain cancers known to humankind, glioma is one of the most severe and life-threatening, comprising 40% of all primary brain tumors. Recent reports have shown the incidence rate of gliomas to be 6 per 100,000 individuals per year globally. Despite the various therapeutics used in the treatment of glioma, median patient survival remains at 15 months after first-line treatment including surgery, radiation, and chemotherapy with temozolomide. As such, the discovery of newer and more effective therapeutic agents is imperative for improving patient survival. Computer-aided drug design has emerged in drug discovery as a powerful means to identify potential hit compounds with distinctively high therapeutic effectiveness against glioma. This review encompasses the recent advances in bio-computational in-silico modeling that have led to the discovery of small molecule inhibitors and/or drugs against various therapeutic targets in glioma. The relevant information provided in this report will assist researchers, especially in the drug design domains, to develop more effective therapeutics against this global disease.

13
Zhang J, Peng H, Wang YL, Xiao HF, Cui YY, Bian XB, Zhang DK, Ma L. Predictive Role of the Apparent Diffusion Coefficient and MRI Morphologic Features on IDH Status in Patients With Diffuse Glioma: A Retrospective Cross-Sectional Study. Front Oncol 2021; 11:640738. [PMID: 34055608 PMCID: PMC8155475 DOI: 10.3389/fonc.2021.640738] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/12/2020] [Accepted: 04/26/2021] [Indexed: 11/13/2022] Open
Abstract
Purpose: To evaluate isocitrate dehydrogenase (IDH) status in clinically diagnosed grade II–IV glioma patients using the 2016 World Health Organization (WHO) classification based on MRI parameters. Materials and Methods: One hundred and seventy-six patients with confirmed WHO grade II–IV glioma were retrospectively investigated as the study set, including lower-grade glioma (WHO grade II, n = 64; WHO grade III, n = 38) and glioblastoma (WHO grade IV, n = 74). The minimum apparent diffusion coefficient (ADCmin) in the tumor and the contralateral normal-appearing white matter (ADCn) and the rADC (ADCmin to ADCn ratio) were defined and calculated. Intraclass correlation coefficient (ICC) analysis was carried out to evaluate interobserver and intraobserver agreement for the ADC measurements. Interobserver agreement for the morphologic categories was evaluated by Cohen’s kappa analysis. The nonparametric Kruskal-Wallis test was used to determine whether the ADC measurements and glioma subtypes were related. By univariable analysis, if the differences in a variable were significant (P < 0.05) or an image feature had high consistency (ICC > 0.8; κ > 0.6), then it was chosen as a predictor variable. The performance of the area under the receiver operating characteristic curve (AUC) was evaluated using several machine learning models, including logistic regression, support vector machine, Naive Bayes and Ensemble. Five evaluation indicators were adopted to compare the models. The optimal model was developed as the final model to predict IDH status in 40 patients with glioma as the subsequent test set. DeLong analysis was used to compare significant differences in the AUCs. Results: In the study set, six measured variables (rADC, age, enhancement, calcification, hemorrhage, and cystic change) were selected for the machine learning model. Logistic regression had better performance than other models. Two predictive models, model 1 (including all predictor variables) and model 2 (excluding calcification), correctly classified IDH status with an AUC of 0.897 and 0.890, respectively. The test set performed equally well in prediction, indicating the effectiveness of the trained classifier. The subgroup analysis revealed that the model predicted IDH status of LGG and GBM with accuracy of 84.3% (AUC = 0.873) and 85.1% (AUC = 0.862) in the study set, and with accuracy of 70.0% (AUC = 0.762) and 70.0% (AUC = 0.833) in the test set, respectively. Conclusion: Through the use of machine-learning algorithms, the accurate prediction of IDH-mutant versus IDH-wildtype was achieved for adult diffuse gliomas via noninvasive MR imaging characteristics, including ADC values and tumor morphologic features, which are considered widely available in most clinical workstations.
Affiliation(s)
- Jun Zhang
  - The Medical School of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China; Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China; Department of Radiology, The Sixth Medical Center, Chinese PLA General Hospital, Beijing, China
- Hong Peng
  - The Medical School of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China; Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Yu-Lin Wang
  - Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Hua-Feng Xiao
  - Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Yuan-Yuan Cui
  - The Medical School of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China; Department of Radiology, Qingdao Special Servicemen Recuperation Center of PLA Navy, Qingdao, China
- Xiang-Bing Bian
  - Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- De-Kang Zhang
  - Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Lin Ma
  - Department of Radiology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
|
14
|
Baptiste M, Moinuddeen SS, Soliz CL, Ehsan H, Kaneko G. Making Sense of Genetic Information: The Promising Evolution of Clinical Stratification and Precision Oncology Using Machine Learning. Genes (Basel) 2021; 12:722. [PMID: 34065872 PMCID: PMC8151328 DOI: 10.3390/genes12050722] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/16/2021] [Revised: 05/07/2021] [Accepted: 05/08/2021] [Indexed: 12/16/2022]
Abstract
Precision medicine is a medical approach that gives patients a tailored dose of treatment by taking into consideration each person's variability in genes, environment, and lifestyle. The accumulation of omics big sequence data has led to the development of various genetic databases on which clinical stratification of high-risk populations may be conducted. In addition, because cancers are generally caused by tumor-specific mutations, large-scale systematic identification of single nucleotide polymorphisms (SNPs) in various tumors has propelled significant progress in tailored treatments of tumors (i.e., precision oncology). Machine learning (ML), a subfield of artificial intelligence in which computers learn through experience, has great potential in precision oncology, chiefly to help physicians make diagnostic decisions based on tumor images. A promising avenue for ML in precision oncology is the integration of all available data, from images to multi-omics big data, for the holistic care of patients and high-risk healthy subjects. In this review, we provide a focused overview of precision oncology and ML with attention to breast cancer and glioma, as well as to Bayesian networks, which have the flexibility and the ability to work with incomplete information. We also introduce some state-of-the-art attempts to use and incorporate ML and genetic information in precision oncology.
Affiliation(s)
- Gen Kaneko
  - School of Arts & Sciences, University of Houston-Victoria, Victoria, TX 77901, USA; (M.B.); (S.S.M.); (C.L.S.); (H.E.)
|
15
|
Wang Y, Zhou M, Zou Q, Xu L. Machine learning for phytopathology: from the molecular scale towards the network scale. Brief Bioinform 2021; 22:6204793. [PMID: 33787847 DOI: 10.1093/bib/bbab037] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/15/2020] [Revised: 01/09/2021] [Accepted: 01/26/2021] [Indexed: 01/16/2023]
Abstract
With the increasing volume of high-throughput sequencing data from a variety of omics techniques in the field of plant-pathogen interactions, sorting, retrieving, processing and visualizing biological information have become a great challenge. Within the explosion of data, machine learning offers powerful tools to process these complex omics data by various algorithms, such as Bayesian reasoning, support vector machine and random forest. Here, we introduce the basic frameworks of machine learning in dissecting plant-pathogen interactions and discuss the applications and advances of machine learning in plant-pathogen interactions from molecular to network biology, including the prediction of pathogen effectors, plant disease resistance protein monitoring and the discovery of protein-protein networks. The aim of this review is to provide a summary of advances in plant defense and pathogen infection and to indicate the important developments of machine learning in phytopathology.
Affiliation(s)
- Yansu Wang
  - Postdoctoral Innovation Practice Base, Shenzhen Polytechnic, China
- Quan Zou
  - University of Electronic Science and Technology of China
- Lei Xu
  - Shenzhen Polytechnic, China
|
16
|
A novel entropy-based mapping method for determining the protein-protein interactions in viral genomes by using coevolution analysis. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2020.102359] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
|
17
|
Kumar R, Dhanda SK. Bird Eye View of Protein Subcellular Localization Prediction. Life (Basel) 2020; 10:E347. [PMID: 33327400 PMCID: PMC7764902 DOI: 10.3390/life10120347] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/24/2020] [Revised: 12/11/2020] [Accepted: 12/11/2020] [Indexed: 12/12/2022]
Abstract
Proteins are made up of long chains of amino acids and perform a variety of functions in different organisms. The activity of a protein is determined by the nucleotide sequence of its gene and by its 3D structure. In addition, proteins must be targeted to their specific locations or compartments to perform their functions. The challenge of computationally predicting the subcellular localization of proteins has been addressed by various in silico methods. In this review, we survey the progress in this field and offer a bird's-eye view comprising a comprehensive listing of tools, types of input features explored, machine learning approaches employed, and evaluation metrics applied. We hope the review will be useful for researchers working in the field of protein localization prediction.
Affiliation(s)
- Ravindra Kumar
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, 9609 Medical Center Drive, Rockville, MD 20850, USA
- Sandeep Kumar Dhanda
  - Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
|
18
|
Saurabh R, Nandi S, Sinha N, Shukla M, Sarkar RR. Prediction of survival rate and effect of drugs on cancer patients with somatic mutations of genes: An AI‐based approach. Chem Biol Drug Des 2020; 96:1005-1019. [DOI: 10.1111/cbdd.13668] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/31/2019] [Revised: 01/24/2020] [Accepted: 02/02/2020] [Indexed: 01/03/2023]
Affiliation(s)
- Rochi Saurabh
  - Chemical Engineering and Process Development Division, CSIR‐National Chemical Laboratory, Pune, India
- Sutanu Nandi
  - Chemical Engineering and Process Development Division, CSIR‐National Chemical Laboratory, Pune, India
  - Academy of Scientific & Innovative Research (AcSIR), Ghaziabad, India
- Noopur Sinha
  - Chemical Engineering and Process Development Division, CSIR‐National Chemical Laboratory, Pune, India
  - Academy of Scientific & Innovative Research (AcSIR), Ghaziabad, India
- Mudita Shukla
  - Chemical Engineering and Process Development Division, CSIR‐National Chemical Laboratory, Pune, India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR‐National Chemical Laboratory, Pune, India
  - Academy of Scientific & Innovative Research (AcSIR), Ghaziabad, India
|
19
|
Torkamanian-Afshar M, Lanjanian H, Nematzadeh S, Tabarzad M, Najafi A, Kiani F, Masoudi-Nejad A. RPINBASE: An online toolbox to extract features for predicting RNA-protein interactions. Genomics 2020; 112:2623-2632. [DOI: 10.1016/j.ygeno.2020.02.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/05/2019] [Revised: 01/04/2020] [Accepted: 02/13/2020] [Indexed: 12/12/2022]
|
20
|
Chou KC. Impacts of Pseudo Amino Acid Components and 5-steps Rule to Proteomics and Proteome Analysis. Curr Top Med Chem 2019; 19:2283-2300. [DOI: 10.2174/1568026619666191018100141] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/07/2019] [Revised: 08/18/2019] [Accepted: 08/26/2019] [Indexed: 01/27/2023]
Abstract
Stimulated by the 5-steps rule, computational proteomics has achieved remarkable progress during the last decade or so in the following three areas: (1) protein structural class prediction; (2) protein subcellular location prediction; (3) post-translational modification (PTM) site prediction. The results obtained by these predictions are very useful not only for an in-depth study of the functions of proteins and their biological processes in a cell, but also for developing novel drugs against major diseases such as cancers, Alzheimer’s, and Parkinson’s. Moreover, since the targets to be predicted may have the multi-label feature, two sets of metrics are introduced: one for inspecting the global prediction quality, the other for the local prediction quality. All the predictors covered in this review have a user-friendly web server, through which the majority of experimental scientists can easily obtain their desired data without the need to go through the complicated mathematics.
Affiliation(s)
- Kuo-Chen Chou
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
|