1
Yang L, Wang T, Zhang J, Kang S, Xu S, Wang K. Deep learning-based automatic segmentation of meningioma from T1-weighted contrast-enhanced MRI for preoperative meningioma differentiation using radiomic features. BMC Med Imaging 2024; 24:56. [PMID: 38443817 PMCID: PMC10916038 DOI: 10.1186/s12880-024-01218-3] [Received: 10/17/2023] [Accepted: 01/21/2024] [Indexed: 03/07/2024]
Abstract
BACKGROUND: This study aimed to establish a dedicated deep-learning model (DLM) on routine magnetic resonance imaging (MRI) data and to compare its performance in automated detection and segmentation of meningiomas with manual segmentation. A second aim was to develop a radiomics model, based on radiomics features extracted from the automatic segmentations, to differentiate low- and high-grade meningiomas before surgery.
MATERIALS: A total of 326 patients with pathologically confirmed meningiomas were enrolled. Samples were randomly split in a 6:2:2 ratio into a training set, a validation set, and a test set. Volumetric regions of interest (VOIs) were manually drawn on each slice using the ITK-SNAP software. An automatic segmentation model based on SegResNet was developed for meningioma segmentation. Segmentation performance was evaluated by the Dice coefficient and the 95% Hausdorff distance. Intraclass correlation coefficient (ICC) analysis was applied to assess the agreement between radiomic features from manual and automatic segmentations. Radiomics features derived from automatic segmentation were extracted with pyradiomics. After feature selection, a model for meningioma grading was built.
RESULTS: The DLM detected meningiomas in all cases. For automatic segmentation, the mean Dice coefficient and 95% Hausdorff distance in the test set were 0.881 (95% CI: 0.851-0.981) and 2.016 (95% CI: 1.439-3.158), respectively. Features extracted from manual and automatic segmentations were comparable: the average ICC value was 0.804 (range, 0.636-0.933). For meningioma classification, the radiomics model based on automatic segmentation performed well in grading meningiomas, yielding a sensitivity, specificity, accuracy, and area under the curve (AUC) of 0.778 (95% CI: 0.701-0.856), 0.860 (95% CI: 0.722-0.908), 0.848 (95% CI: 0.715-0.903) and 0.842 (95% CI: 0.807-0.895) in the test set, respectively.
CONCLUSIONS: The DLM yielded favorable automated detection and segmentation of meningioma and can help deploy radiomics for preoperative meningioma differentiation in clinical practice.
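The abstract reports segmentation quality as a Dice coefficient. As a rough illustration of what that number measures, the sketch below computes Dice for two flattened binary masks. It is not the authors' evaluation code: the function name `dice_coefficient`, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as tumour in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total tumour voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks (1 = tumour voxel), not data from the study.
manual = [0, 1, 1, 1, 0, 0]
automatic = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(automatic, manual)
print(f"Dice: {score:.3f}")
```

A value of 1.0 means the automatic and manual masks overlap exactly, and 0.0 means they share no voxels.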
Affiliation(s)
- Liping Yang: Department of PET-CT, Harbin Medical University Cancer Hospital, Harbin, 150001, China
- Tianzuo Wang: Medical Imaging Department, Changzheng Hospital of Harbin City, Harbin, China
- Jinling Zhang: Medical Imaging Department, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Shi Kang: Medical Imaging Department, The Second Hospital of Heilongjiang Province, Harbin, China
- Shichuan Xu: Department of Medical Instruments, Second Hospital of Harbin, Harbin, 150001, China
- Kezheng Wang: Department of PET-CT, Harbin Medical University Cancer Hospital, Harbin, 150001, China
2
Bian J, Liu X, Dong G, Hou C, Huang S, Zhang D. ACP-ML: A sequence-based method for anticancer peptide prediction. Comput Biol Med 2024; 170:108063. [PMID: 38301519 DOI: 10.1016/j.compbiomed.2024.108063] [Received: 11/14/2023] [Revised: 01/08/2024] [Accepted: 01/27/2024] [Indexed: 02/03/2024]
Abstract
Cancer is a serious disease that is difficult to cure. Chemotherapy, a primary treatment for cancer, causes significant harm to normal cells in the body and is often accompanied by serious side effects. Recently, anti-cancer peptides (ACPs), a type of protein for treating cancers, have dominated research into the development of new anti-tumor drugs because of their ability to specifically target and destroy cancer cells. Screening proteins with cancer-inhibiting properties from a large pool of proteins is key to the development of anti-tumor drugs. However, because of their complex structure, it is expensive and inefficient to accurately identify protein functions through biological experiments alone. Therefore, we propose a new prediction model, ACP-ML, to effectively predict ACPs. For feature extraction, DPC, PseAAC, CTDC, CTDT and CS-Pse-PSSM features were used, and the optimal feature set was selected by comparing combinations of these features. Then, a two-step feature selection process using the MRMD and RFE algorithms was performed to determine the most crucial features from the optimal feature set for identifying ACPs. Furthermore, we assessed the classification accuracy of single learning models and of ensemble models based on different strategies through ten-fold cross-validation. Ultimately, a voting-based ensemble learning method was developed to predict ACPs. To validate its effectiveness, two independent test sets were used, achieving accuracies of 90.891% and 92.578%, respectively. Compared with existing anticancer peptide prediction algorithms, the proposed feature processing method is more effective, and the proposed ensemble model ACP-ML exhibits stronger generalization capability and higher accuracy.
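Of the feature types named in the abstract, DPC (dipeptide composition) is the simplest to show concretely: it describes a peptide by the relative frequency of each of the 400 possible amino-acid pairs. The sketch below is an illustration of that general idea, not the ACP-ML implementation. The function name `dipeptide_composition`, the example sequence, and the choice to skip windows containing non-standard residues are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, fixing the order of the feature vector.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence: str) -> list[float]:
    """Return a 400-dimensional vector of dipeptide frequencies."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    windows = 0
    # Slide a window of length 2 over the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        # Assumed handling: ignore pairs with non-standard residues.
        if pair in counts:
            counts[pair] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(DIPEPTIDES)
    # Normalise by the number of counted windows so the vector sums to 1.
    return [counts[d] / windows for d in DIPEPTIDES]


# Hypothetical toy peptide, not a sequence from the paper's datasets.
features = dipeptide_composition("ACAC")
print(len(features), features[DIPEPTIDES.index("AC")])
```

For a sequence made only of standard residues, the number of counted windows equals the sequence length minus one, which matches the usual DPC normalisation.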
Affiliation(s)
- Jilong Bian: Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Xuan Liu: Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Guanghui Dong: Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Chang Hou: Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Shan Huang: Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China
- Dandan Zhang: Department of Obstetrics and Gynecology, The First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China
3
Lin LS, Kao CH, Li YJ, Chen HH, Chen HY. Improved support vector machine classification for imbalanced medical datasets by novel hybrid sampling combining modified mega-trend-diffusion and bagging extreme learning machine model. Mathematical Biosciences and Engineering: MBE 2023; 20:17672-17701. [PMID: 38052532 DOI: 10.3934/mbe.2023786] [Indexed: 12/07/2023]
Abstract
To handle imbalanced datasets in machine learning or deep learning models, some studies suggest sampling techniques that generate virtual examples of minority classes to improve the models' prediction accuracy. However, for kernel-based support vector machines (SVM), some sampling methods generate synthetic examples in the original data space rather than in the high-dimensional feature space. This may be ineffective in improving SVM classification for imbalanced datasets. To address this problem, we propose a novel hybrid sampling technique termed modified mega-trend-diffusion-extreme learning machine (MMTD-ELM) to effectively move the SVM decision boundary toward a region of the majority class. This movement improves the SVM's prediction of minority class examples. The proposed method combines the α-cut fuzzy number method for screening representative examples of the majority class with the MMTD method for creating new examples of the minority class. Furthermore, we construct a bagging ELM model to monitor the similarity between new examples and the original data. In this paper, four datasets are used to test the efficiency of the proposed MMTD-ELM method in imbalanced data prediction. Additionally, we deployed two SVM models to compare the prediction performance of the proposed MMTD-ELM method with three state-of-the-art sampling techniques in terms of geometric mean (G-mean), F-measure (F1), index of balanced accuracy (IBA) and area under curve (AUC) metrics. A paired t-test is used to determine whether the suggested method differs statistically significantly from the other sampling techniques in terms of the four evaluation metrics. The experimental results demonstrated that the proposed method achieves the best average values in terms of G-mean, F1, IBA and AUC. Overall, the suggested MMTD-ELM method outperforms these sampling methods for imbalanced datasets.
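Two of the metrics named in the abstract, G-mean and F1, follow directly from a binary confusion matrix, and a small example shows why they are preferred over plain accuracy on imbalanced data. The sketch below uses the standard definitions and is not taken from the paper. The function name `imbalance_metrics` and the confusion-matrix counts are hypothetical, and the minority class is assumed to be the positive class.

```python
from math import sqrt


def imbalance_metrics(tp: int, fn: int, fp: int, tn: int) -> dict[str, float]:
    """G-mean and F1 from binary confusion-matrix counts."""
    # Recall on the positive (assumed minority) class.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Recall on the negative (assumed majority) class.
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    # G-mean is the geometric mean of the two per-class recalls.
    g_mean = sqrt(sensitivity * specificity)
    # F1 is the harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f1 = 2.0 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"g_mean": g_mean, "f1": f1, "accuracy": accuracy}


# Hypothetical counts: 10 minority and 90 majority examples.
metrics = imbalance_metrics(tp=8, fn=2, fp=10, tn=80)
print(metrics)
```

With these counts the accuracy is 0.88 while F1 is noticeably lower, because F1 penalises the ten false positives relative to only eight true positives.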
Affiliation(s)
- Liang-Sian Lin: Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei 112303, Taiwan
- Chen-Huan Kao: Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei 112303, Taiwan
- Yi-Jie Li: Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei 112303, Taiwan
- Hao-Hsuan Chen: Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei 112303, Taiwan
- Hung-Yu Chen: Department of Information Management, National Chin-Yi University of Technology, Taichung 411030, Taiwan
4
Zhou J, Li X, Ma Y, Wu Z, Xie Z, Zhang Y, Wei Y. Optimal modeling of anti-breast cancer candidate drugs screening based on multi-model ensemble learning with imbalanced data. Mathematical Biosciences and Engineering: MBE 2023; 20:5117-5134. [PMID: 36896538 DOI: 10.3934/mbe.2023237] [Indexed: 06/18/2023]
Abstract
Imbalanced data seriously bias machine learning models, which leads to false positives in the screening of therapeutic drugs for breast cancer. To deal with this problem, a multi-model ensemble framework based on a tree model, a linear model and a deep-learning model is proposed. Using the methodology constructed in this study, we screened the 20 most critical molecular descriptors from 729 molecular descriptors of 1974 anti-breast cancer drug candidates. To measure the pharmacokinetic properties and safety of the drug candidates, the screened molecular descriptors were then used for subsequent bioactivity, absorption, distribution, metabolism, excretion, toxicity, and other prediction tasks. The results show that the method constructed in this study is superior to and more stable than the individual models used in the ensemble approach.
Affiliation(s)
- Juan Zhou: School of Software, East China Jiaotong University, Nanchang 330013, China
- Xiong Li: School of Software, East China Jiaotong University, Nanchang 330013, China
- Yuanting Ma: School of Economics and Management, East China Jiaotong University, Nanchang 330013, China
- Zejiu Wu: School of Science, East China Jiaotong University, Nanchang 330013, China
- Ziruo Xie: School of Software, East China Jiaotong University, Nanchang 330013, China
- Yuqi Zhang: School of Foreign Languages, East China Jiaotong University, Nanchang 330013, China
- Yiming Wei: School of Software, East China Jiaotong University, Nanchang 330013, China
5
Jiang L, Chen S, Wu Y, Zhou D, Duan L. Prediction of coronary heart disease in gout patients using machine learning models. Mathematical Biosciences and Engineering: MBE 2023; 20:4574-4591. [PMID: 36896513 DOI: 10.3934/mbe.2023212] [Indexed: 06/18/2023]
Abstract
Growing evidence shows that gout patients are at increased risk of cardiovascular diseases, especially coronary heart disease (CHD). Screening for CHD in gout patients based on simple clinical factors is still challenging. Here we aim to build a diagnostic model based on machine learning so as to avoid missed diagnoses and excessive examinations as much as possible. Over 300 patient samples collected from Jiangxi Provincial People's Hospital were divided into two groups (gout and gout+CHD). The prediction of CHD in gout patients was thus modeled as a binary classification problem. A total of eight clinical indicators were selected as features for the machine learning classifiers. A combined sampling technique was used to overcome the class imbalance in the training dataset. Eight machine learning models were used, including logistic regression, decision tree, ensemble learning models (random forest, XGBoost, LightGBM, GBDT), support vector machine (SVM) and neural networks. Our results showed that stepwise logistic regression and SVM achieved higher AUC values, while the random forest and XGBoost models performed better in terms of recall and accuracy. Furthermore, several high-risk factors were found to be effective indices for predicting CHD in gout patients, which provides insights into clinical diagnosis.
Affiliation(s)
- Lili Jiang: Department of Rheumatology and Clinical Immunology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Sirong Chen: School of Mathematical Sciences, Soochow University, Suzhou, China
- Yuanhui Wu: Department of Rheumatology and Clinical Immunology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Da Zhou: School of Mathematical Sciences, Xiamen University, Xiamen, China
- Lihua Duan: Department of Rheumatology and Clinical Immunology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
6
A Methylation Diagnostic Model Based on Random Forests and Neural Networks for Asthma Identification. Computational and Mathematical Methods in Medicine 2022; 2022:2679050. [PMID: 36213574 PMCID: PMC9534672 DOI: 10.1155/2022/2679050] [Received: 04/27/2022] [Revised: 09/11/2022] [Accepted: 09/12/2022] [Indexed: 11/17/2022]
Abstract
Background: Asthma is a chronic disease that significantly impacts human life and health. Traditional treatments for asthma have several limitations. Artificial intelligence aids in cancer treatment and may also accelerate our understanding of asthma mechanisms. We aimed to develop a new clinical diagnosis model for asthma using artificial neural networks (ANN).
Methods: Datasets (GSE85566, GSE40576, and GSE13716) were downloaded from Gene Expression Omnibus (GEO), and differentially expressed CpGs (DECs) were identified and subjected to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. Random forest (RF) and ANN algorithms were then used to identify gene characteristics and build clinical models. In addition, two external validation datasets (GSE40576 and GSE137716) were used to validate the diagnostic ability of the model.
Results: The methylation analysis tool ChAMP identified DECs that were up-regulated (n = 121) and down-regulated (n = 20). GO results showed enrichment of actin cytoskeleton organization and cell-substrate adhesion, shigellosis, and serotonergic synapses. RF analysis identified 10 crucial DECs (cg05075579, cg20434422, cg03907390, cg00712106, cg05696969, cg22862094, cg11733958, cg00328720, and cg13570822). The ANN clinical model was constructed from the 10 DECs. In the two external validation datasets (GSE40576 and GSE137716), the area under the curve (AUC) was 1.000 for GSE137716 and 0.950 for GSE40576, confirming the reliability of the model.
Conclusion: Our findings provide new methylation markers and clinical diagnostic models for asthma diagnosis and treatment.