1
Optimisation and Calibration of Bayesian Neural Network for Probabilistic Prediction of Biogas Performance in an Anaerobic Lagoon. SENSORS (BASEL, SWITZERLAND) 2024; 24:2537. [PMID: 38676155 PMCID: PMC11053646 DOI: 10.3390/s24082537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/12/2024] [Accepted: 04/12/2024] [Indexed: 04/28/2024]
Abstract
This study aims to enhance diagnostic capabilities for optimising the performance of the anaerobic sewage treatment lagoon at Melbourne Water's Western Treatment Plant (WTP) through a novel machine learning (ML)-based monitoring strategy. This strategy employs ML to make accurate probabilistic predictions of biogas performance by leveraging diverse real-life operational and inspection sensor and other measurement data for asset management, decision making, and structural health monitoring (SHM). The paper commences with data analysis and preprocessing of complex irregular datasets to facilitate efficient learning in an artificial neural network. Subsequently, a Bayesian mixture density neural network model incorporating an attention-based mechanism in bidirectional long short-term memory (BiLSTM) was developed. This probabilistic approach uses a distribution output layer based on the Gaussian mixture model and Monte Carlo (MC) dropout technique in estimating data and model uncertainties, respectively. Furthermore, systematic hyperparameter optimisation revealed that the optimised model achieved a negative log-likelihood (NLL) of 0.074, significantly outperforming other configurations. It achieved an accuracy approximately 9 times greater than the average model performance (NLL = 0.753) and 22 times greater than the worst performing model (NLL = 1.677). Key factors influencing the model's accuracy, such as the input window size and the number of hidden units in the BiLSTM layer, were identified, while the number of neurons in the fully connected layer was found to have no significant impact on accuracy. Moreover, model calibration using the expected calibration error was performed to correct the model's predictive uncertainty. The findings suggest that the inherent data significantly contribute to the overall uncertainty of the model, highlighting the need for more high-quality data to enhance learning. 
This study lays the groundwork for applying ML in transforming high-value assets into intelligent structures and has broader implications for ML in asset management, SHM applications, and renewable energy sectors.
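The negative log-likelihood reported for the Gaussian mixture output layer can be illustrated with a minimal NumPy sketch. The function name `gmm_nll` and the array layout (one row of mixture parameters per sample) are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def gmm_nll(y, weights, means, sigmas):
    """Mean negative log-likelihood of targets under per-sample Gaussian mixtures.

    y: shape (N,); weights, means, sigmas: shape (N, K).
    Each row of `weights` is assumed to sum to 1 and `sigmas` to be positive.
    """
    y = np.asarray(y, dtype=float)[:, None]
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Log of (mixture weight * Gaussian density) for every component
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi)
        - np.log(sigmas)
        - 0.5 * ((y - means) / sigmas) ** 2
    )
    # Log-sum-exp over components for numerical stability
    m = log_comp.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return float(-log_lik.mean())
```

A lower value means the predicted mixture places more probability density on the observed targets, which is the sense in which the abstract compares configurations.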
2
Clinical assessment of deep learning-based uncertainty maps in lung cancer segmentation. Phys Med Biol 2024; 69:035007. [PMID: 38171012 DOI: 10.1088/1361-6560/ad1a26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 01/03/2024] [Indexed: 01/05/2024]
Abstract
Objective. Prior to radiation therapy planning, accurate delineation of gross tumour volumes (GTVs) and organs at risk (OARs) is crucial. In the current clinical practice, tumour delineation is performed manually by radiation oncologists, which is time-consuming and prone to large inter-observer variability. With the advent of deep learning (DL) models, automated contouring has become possible, speeding up procedures and assisting clinicians. However, these tools are currently used in the clinic mostly for contouring OARs, since these systems are not reliable yet for contouring GTVs. To improve the reliability of these systems, researchers have started exploring the topic of probabilistic neural networks. However, there is still limited knowledge of the practical implementation of such networks in real clinical settings. Approach. In this work, we developed a 3D probabilistic system that generates DL-based uncertainty maps for lung cancer CT segmentations. We employed the Monte Carlo (MC) dropout technique to generate probabilistic and uncertainty maps, while the model calibration was evaluated by using reliability diagrams. A clinical validation was conducted in collaboration with a radiation oncologist to qualitatively assess the value of the uncertainty estimates. We also proposed two novel metrics, namely mean uncertainty (MU) and relative uncertainty volume (RUV), as potential indicators for clinicians to assess the need for independent visual checks of the DL-based segmentation. Main results. Our study showed that uncertainty mapping effectively identified cases of under- or over-contouring. Despite the overconfidence of the model, a strong correlation was observed between the clinical opinion and the MU metric. Moreover, both MU and RUV revealed high AUC values in discriminating between low and high uncertainty cases. Significance. Our study is one of the first attempts to clinically validate uncertainty estimates in DL-based contouring. 
The two proposed metrics exhibited promising potential as indicators for clinicians to independently assess the quality of tumour delineation.
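A reliability diagram compares, bin by bin, the model's mean confidence with its observed accuracy. The sketch below computes those bins and summarises the gaps as an expected calibration error (ECE). The abstract mentions only reliability diagrams, so the ECE summary, the function name, and the choice of ten equal-width bins are assumptions for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index for each prediction; each bin includes its right edge
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)
```

An overconfident model, as described in the abstract, shows bins where mean confidence exceeds accuracy, which raises this value.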
3
Uncertainty Evaluation of a Gas Turbine Model Based on a Nonlinear Autoregressive Exogenous Model and Monte Carlo Dropout. SENSORS (BASEL, SWITZERLAND) 2024; 24:465. [PMID: 38257558 PMCID: PMC10818747 DOI: 10.3390/s24020465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/29/2023] [Accepted: 12/01/2023] [Indexed: 01/24/2024]
Abstract
Gas turbines are thermoelectric plants with various applications, such as large-scale electricity production, petrochemical industry, and steam generation. In order to optimize the operation of a gas turbine, it is necessary to develop system identification models that allow for the development of studies and analyses to increase the system's reliability. Current strategies for modeling complex and non-linear systems can be based on artificial intelligence techniques, using autoregressive neural networks of the NARX and LSTM type. In this context, this work aims to develop a model of a gas turbine capable of estimating the rotation speed of the turbine and simultaneously estimating the uncertainty associated with the estimation. These methodologies are based on artificial neural networks and the Monte Carlo dropout simulation method. The results were obtained from experimental data from a 215 MW gas turbine, with the best model achieving a MAPE of 0.02% and an uncertainty associated with the turbine rotation speed of 2.2 RPM.
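Monte Carlo dropout for a regression output can be sketched as follows: dropout stays active at inference, the network is evaluated many times, and the spread of the outputs serves as the uncertainty estimate. The one-hidden-layer network, the function name, and all parameter values are hypothetical; the NARX and LSTM models from the paper are not reproduced here.

```python
import numpy as np

def mc_dropout_predict(x, w1, b1, w2, b2, p_drop=0.1, n_samples=100, seed=0):
    """Mean and standard deviation over stochastic forward passes of a small MLP."""
    rng = np.random.default_rng(seed)
    h = np.maximum(x @ w1 + b1, 0.0)            # hidden ReLU activations, shape (N, H)
    preds = []
    for _ in range(n_samples):
        # Dropout remains active at inference: draw a fresh mask on every pass
        mask = rng.random(h.shape) >= p_drop
        h_drop = h * mask / (1.0 - p_drop)      # inverted-dropout scaling
        preds.append(h_drop @ w2 + b2)
    preds = np.stack(preds)                     # shape (n_samples, N, 1)
    return preds.mean(axis=0), preds.std(axis=0)
```

The standard deviation returned here plays the role of the per-prediction uncertainty that the abstract reports in RPM.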
4
A Hybrid Method for Performance Degradation Probability Prediction of Proton Exchange Membrane Fuel Cell. MEMBRANES 2023; 13:426. [PMID: 37103853 PMCID: PMC10142057 DOI: 10.3390/membranes13040426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 04/08/2023] [Accepted: 04/10/2023] [Indexed: 06/19/2023]
Abstract
The proton exchange membrane fuel cell (PEMFC) is a promising power source, but the short lifespan and high maintenance cost restrict its development and widespread application. Performance degradation prediction is an effective technique to extend the lifespan and reduce the maintenance cost of PEMFC. This paper proposed a novel hybrid method for the performance degradation prediction of PEMFC. Firstly, considering the randomness of PEMFC degradation, a Wiener process model is established to describe the degradation of the aging factor. Secondly, the unscented Kalman filter algorithm is used to estimate the degradation state of the aging factor from monitoring voltage. Then, in order to predict the degradation state of PEMFC, the transformer structure is used to capture the data characteristics and fluctuations of the aging factor. To quantify the uncertainty of the predicted results, we also add the Monte Carlo dropout technology to the transformer to obtain the confidence interval of the predicted result. Finally, the effectiveness and superiority of the proposed method are verified on the experimental datasets.
5
Exploring uncertainty measures in convolutional neural network for semantic segmentation of oral cancer images. JOURNAL OF BIOMEDICAL OPTICS 2022; 27:115001. [PMID: 36329004 PMCID: PMC9630461 DOI: 10.1117/1.jbo.27.11.115001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2022] [Accepted: 10/13/2022] [Indexed: 06/16/2023]
Abstract
SIGNIFICANCE Oral cancer is one of the most prevalent cancers, especially in middle- and low-income countries such as India. Automatic segmentation of oral cancer images can improve the diagnostic workflow, which is a significant task in oral cancer image analysis. Despite the remarkable success of deep-learning networks in medical segmentation, they rarely provide uncertainty quantification for their output. AIM We aim to estimate uncertainty in a deep-learning approach to semantic segmentation of oral cancer images and to improve the accuracy and reliability of predictions. APPROACH This work introduced a UNet-based Bayesian deep-learning (BDL) model to segment potentially malignant and malignant lesion areas in the oral cavity. The model can quantify uncertainty in predictions. We also developed an efficient model that is almost six times smaller and two times faster at inference than the original UNet. The dataset in this study was collected using our customized screening platform and was annotated by oral oncology specialists. RESULTS The proposed approach achieved good segmentation performance as well as good uncertainty estimation performance. In the experiments, we observed an improvement in pixel accuracy and mean intersection over union by removing uncertain pixels. This result reflects that the model provided less accurate predictions in uncertain areas that may need more attention and further inspection. The experiments also showed that with some performance compromises, the efficient model reduced computation time and model size, which expands the potential for implementation on portable devices used in resource-limited settings. CONCLUSIONS Our study demonstrates that the UNet-based BDL model can not only perform potentially malignant and malignant oral lesion segmentation, but also provide informative pixel-level uncertainty estimation. 
With this extra uncertainty information, the accuracy and reliability of the model’s prediction can be improved.
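The effect of removing uncertain pixels before scoring can be sketched as below. Predictive entropy of the averaged class probabilities is used as the pixel-level uncertainty, which is an assumption: the abstract does not say which uncertainty measure or threshold was used, and the function name is hypothetical.

```python
import numpy as np

def accuracy_after_filtering(prob_samples, labels, max_entropy):
    """Pixel accuracy before and after discarding high-entropy pixels.

    prob_samples: (T, H, W, C) class probabilities from T stochastic passes.
    labels: (H, W) integer ground-truth classes.
    """
    mean_prob = prob_samples.mean(axis=0)                         # (H, W, C)
    entropy = -(mean_prob * np.log(mean_prob + 1e-12)).sum(-1)    # (H, W)
    pred = mean_prob.argmax(-1)
    keep = entropy <= max_entropy                                 # confident pixels only
    acc_all = float((pred == labels).mean())
    acc_kept = float((pred[keep] == labels[keep]).mean()) if keep.any() else float("nan")
    return acc_all, acc_kept, float(keep.mean())
```

If errors concentrate in uncertain regions, the accuracy on the retained pixels exceeds the accuracy on all pixels, which matches the trend the abstract describes.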
6
Bayesian deep learning-based 1H-MRS of the brain: Metabolite quantification with uncertainty estimation using Monte Carlo dropout. Magn Reson Med 2022; 88:38-52. [PMID: 35344604 DOI: 10.1002/mrm.29214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2021] [Revised: 01/14/2022] [Accepted: 02/11/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE To develop a Bayesian convolutional neural network (BCNN) with Monte Carlo dropout sampling for metabolite quantification with simultaneous uncertainty estimation in deep learning-based proton MRS of the brain. METHODS Human brain spectra were simulated using basis spectra for 17 metabolites and macromolecules (N = 100 000) at 3.0 Tesla. In addition, actual in vivo spectra (N = 5) were modified by adjusting SNR and linewidth with increasing severity of spectral degradation (N = 50). A BCNN was trained on the simulated spectra to generate a noise-free, line-narrowed, macromolecule signal-removed, metabolite-only spectrum from a typical human brain spectrum. At inference, each input spectrum was Monte Carlo dropout sampled (50 times), and the resulting mean spectrum and variance spectrum were used for metabolite quantification and uncertainty estimation, respectively. RESULTS Using the simulated spectra, the mean absolute percent errors of the BCNN-predicted metabolite content were < 10% for Cr, Glu, Gln, mI, NAA, and Tau (< 5% for Glu, NAA, and mI). For all metabolites, the correlations (r's) between the ground-truth error and BCNN-predicted uncertainty ranged 0.72-0.94 (0.83 ± 0.06; p < 0.001). Using the modified in vivo spectra, the extent of variation in the estimated metabolite content against the increasing severity of spectral degradation tended to be smaller with BCNN than with linear combination of model spectra (LCModel). Overall, the variation in metabolite content tended to be more highly correlated with the uncertainty from BCNN than with the Cramér-Rao lower-bounds from LCModel (0.938 ± 0.019 vs. 0.881 ± 0.057 [p = 0.115]). CONCLUSION The BCNN with Monte Carlo dropout sampling may be used in deep learning-based MRS for the estimation of uncertainty in the machine-predicted metabolite content, which is important in the clinical application of deep learning-based MRS.
7
Automatic brain segmentation in preterm infants with post-hemorrhagic hydrocephalus using 3D Bayesian U-Net. Hum Brain Mapp 2022; 43:1895-1916. [PMID: 35023255 PMCID: PMC8933325 DOI: 10.1002/hbm.25762] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/15/2021] [Revised: 12/08/2021] [Accepted: 12/11/2021] [Indexed: 12/17/2022]
Abstract
Post‐hemorrhagic hydrocephalus (PHH) is a severe complication of intraventricular hemorrhage (IVH) in very preterm infants. PHH monitoring and treatment decisions rely heavily on manual and subjective two‐dimensional measurements of the ventricles. Automatic and reliable three‐dimensional (3D) measurements of the ventricles may provide a more accurate assessment of PHH, and lead to improved monitoring and treatment decisions. To accurately and efficiently obtain these 3D measurements, automatic segmentation of the ventricles can be explored. However, this segmentation is challenging due to the large ventricular anatomical shape variability in preterm infants diagnosed with PHH. This study aims to (a) propose a Bayesian U‐Net method using 3D spatial concrete dropout for automatic brain segmentation (with uncertainty assessment) of preterm infants with PHH; and (b) compare the Bayesian method to three reference methods: DenseNet, U‐Net, and ensemble learning using DenseNets and U‐Nets. A total of 41 T2‐weighted MRIs from 27 preterm infants were manually segmented into lateral ventricles, external CSF, white and cortical gray matter, brainstem, and cerebellum. These segmentations were used as ground truth for model evaluation. All methods were trained and evaluated using 4‐fold cross‐validation and segmentation endpoints, with additional uncertainty endpoints for the Bayesian method. In the lateral ventricles, segmentation endpoint values for the DenseNet, U‐Net, ensemble learning, and Bayesian U‐Net methods were mean Dice score = 0.814 ± 0.213, 0.944 ± 0.041, 0.942 ± 0.042, and 0.948 ± 0.034 respectively. Uncertainty endpoint values for the Bayesian U‐Net were mean recall = 0.953 ± 0.037, mean negative predictive value = 0.998 ± 0.005, mean accuracy = 0.906 ± 0.032, and mean AUC = 0.949 ± 0.031. To conclude, the Bayesian U‐Net showed the best segmentation results across all methods and provided accurate uncertainty maps. 
This method may be used in clinical practice for automatic brain segmentation of preterm infants with PHH, and lead to better PHH monitoring and more informed treatment decisions.
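The Dice score used as the segmentation endpoint measures overlap between a predicted mask and the ground truth. A minimal sketch for binary masks follows; the function name and the convention for two empty masks are assumptions.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)
```

For a multi-class segmentation such as the one described, this would be applied per structure (for example, the lateral ventricles) by comparing `prediction == k` with `truth == k`.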
8
Towards targeted ultrasound-guided prostate biopsy by incorporating model and label uncertainty in cancer detection. Int J Comput Assist Radiol Surg 2021; 17:121-128. [PMID: 34783976 DOI: 10.1007/s11548-021-02485-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Accepted: 08/16/2021] [Indexed: 10/19/2022]
Abstract
PURPOSE Systematic prostate biopsy is widely used for cancer diagnosis. The procedure is blind to underlying prostate tissue micro-structure; hence, it can lead to a high rate of false negatives. Development of a machine-learning model that can reliably identify suspicious cancer regions is highly desirable. However, the models proposed to date do not consider the uncertainty present in their output or the data to benefit clinical decision making for targeting biopsy. METHODS We propose a deep network for improved detection of prostate cancer in systematic biopsy considering both the label and model uncertainty. The architecture of our model is based on U-Net, trained with temporal enhanced ultrasound (TeUS) data. We estimate cancer detection uncertainty using test-time augmentation and test-time dropout. We then use uncertainty metrics to report the cancer probability for regions with high confidence to help the clinical decision making during the biopsy procedure. RESULTS Experiments for prostate cancer classification include data from 183 prostate biopsy cores of 41 patients. We achieve an area under the curve, sensitivity, specificity and balanced accuracy of 0.79, 0.78, 0.71 and 0.75, respectively. CONCLUSION Our key contribution is to automatically estimate model and label uncertainty towards enabling targeted ultrasound-guided prostate biopsy. We anticipate that such information about uncertainty can decrease the number of unnecessary biopsies with a higher rate of cancer yield.
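Balanced accuracy is the mean of sensitivity and specificity, so the reported figures can be cross-checked with a short calculation. The function name is illustrative, and the inputs are the rounded values quoted in the abstract.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy as the arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0

# Rounded values from the abstract: sensitivity 0.78, specificity 0.71
print(round(balanced_accuracy(0.78, 0.71), 3))  # 0.745
```

The result of 0.745 agrees with the reported balanced accuracy of 0.75 once rounding of the published inputs is taken into account.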
9
Monte Carlo Dropout for Uncertainty Estimation and Motor Imagery Classification. SENSORS 2021; 21:s21217241. [PMID: 34770553 PMCID: PMC8588128 DOI: 10.3390/s21217241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 10/24/2021] [Accepted: 10/28/2021] [Indexed: 11/16/2022]
Abstract
Motor Imagery (MI)-based Brain-Computer Interfaces (BCIs) have been widely used as an alternative communication channel to patients with severe motor disabilities, achieving high classification accuracy through machine learning techniques. Recently, deep learning techniques have spotlighted the state-of-the-art of MI-based BCIs. These techniques still lack strategies to quantify predictive uncertainty and may produce overconfident predictions. In this work, methods to enhance the performance of existing MI-based BCIs are proposed in order to obtain a more reliable system for real application scenarios. First, the Monte Carlo dropout (MCD) method is proposed on MI deep neural models to improve classification and provide uncertainty estimation. This approach was implemented using Shallow Convolutional Neural Network (SCNN-MCD) and with an ensemble model (E-SCNN-MCD). As another contribution, to discriminate MI task predictions of high uncertainty, a threshold approach is introduced and tested for both SCNN-MCD and E-SCNN-MCD approaches. The BCI Competition IV Databases 2a and 2b were used to evaluate the proposed methods for both subject-specific and non-subject-specific strategies, obtaining encouraging results for MI recognition.
10
Uncertainty quantification in skin cancer classification using three-way decision-based Bayesian deep learning. Comput Biol Med 2021; 135:104418. [PMID: 34052016 DOI: 10.1016/j.compbiomed.2021.104418] [Citation(s) in RCA: 68] [Impact Index Per Article: 22.7] [Received: 01/03/2021] [Revised: 04/01/2021] [Accepted: 04/17/2021] [Indexed: 12/18/2022]
Abstract
Accurate automated medical image recognition, including classification and segmentation, is one of the most challenging tasks in medical image analysis. Recently, deep learning methods have achieved remarkable success in medical image classification and segmentation, clearly becoming the state-of-the-art methods. However, most of these methods are unable to provide uncertainty quantification (UQ) for their output, often being overconfident, which can lead to disastrous consequences. Bayesian Deep Learning (BDL) methods can be used to quantify uncertainty of traditional deep learning methods, and thus address this issue. We apply three uncertainty quantification methods to deal with uncertainty during skin cancer image classification. They are as follows: Monte Carlo (MC) dropout, Ensemble MC (EMC) dropout and Deep Ensemble (DE). To further resolve the remaining uncertainty after applying the MC, EMC and DE methods, we describe a novel hybrid dynamic BDL model, taking into account uncertainty, based on the Three-Way Decision (TWD) theory. The proposed dynamic model enables us to use different UQ methods and different deep neural networks in distinct classification phases. So, the elements of each phase can be adjusted according to the dataset under consideration. In this study, two best UQ methods (i.e., DE and EMC) are applied in two classification phases (the first and second phases) to analyze two well-known skin cancer datasets, preventing one from making overconfident decisions when it comes to diagnosing the disease. The accuracy and the F1-score of our final solution are, respectively, 88.95% and 89.00% for the first dataset, and 90.96% and 91.00% for the second dataset. Our results suggest that the proposed TWDBDL model can be used effectively at different stages of medical image analysis.
11
Compensating for visibility artefacts in photoacoustic imaging with a deep learning approach providing prediction uncertainties. PHOTOACOUSTICS 2021; 21:100218. [PMID: 33364161 PMCID: PMC7750172 DOI: 10.1016/j.pacs.2020.100218] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/24/2020] [Revised: 10/15/2020] [Accepted: 10/17/2020] [Indexed: 05/04/2023]
Abstract
Conventional photoacoustic imaging may suffer from the limited view and bandwidth of ultrasound transducers. A deep learning approach is proposed to handle these problems and is demonstrated both in simulations and in experiments on a multi-scale model of leaf skeleton. We employed an experimental approach to build the training and the test sets using photographs of the samples as ground truth images. Reconstructions produced by the neural network show a greatly improved image quality as compared to conventional approaches. In addition, this work aimed at quantifying the reliability of the neural network predictions. To achieve this, the Monte Carlo dropout procedure is applied to estimate a pixel-wise degree of confidence on each predicted picture. Lastly, we address the possibility of using transfer learning with simulated data in order to drastically limit the size of the experimental dataset.
12
Automated Detection of Presymptomatic Conditions in Spinocerebellar Ataxia Type 2 Using Monte Carlo Dropout and Deep Neural Network Techniques with Electrooculogram Signals. SENSORS (BASEL, SWITZERLAND) 2020; 20:E3032. [PMID: 32471077 PMCID: PMC7309035 DOI: 10.3390/s20113032] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/21/2020] [Revised: 05/25/2020] [Accepted: 05/25/2020] [Indexed: 12/21/2022]
Abstract
Application of deep learning (DL) to the field of healthcare is aiding clinicians to make an accurate diagnosis. DL provides reliable results for image processing and sensor interpretation problems most of the time. However, model uncertainty should also be thoroughly quantified. This paper therefore addresses the employment of Monte Carlo dropout within the DL structure to automatically discriminate presymptomatic signs of spinocerebellar ataxia type 2 in saccadic samples obtained from electrooculograms. The current work goes beyond the common incorporation of this special type of dropout into deep neural networks and uses the uncertainty derived from the validation samples to construct a decision tree at the register level of the patients. The decision tree built from the uncertainty estimates obtained a classification accuracy of 81.18% in automatically discriminating control, presymptomatic and sick classes. This paper proposes a novel method to address both uncertainty quantification and explainability to develop reliable healthcare support systems.
13
Accuracy, uncertainty, and adaptability of automatic myocardial ASL segmentation using deep CNN. Magn Reson Med 2019; 83:1863-1874. [PMID: 31729078 DOI: 10.1002/mrm.28043] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/14/2019] [Revised: 09/23/2019] [Accepted: 09/24/2019] [Indexed: 01/25/2023]
Abstract
PURPOSE To apply deep convolution neural network to the segmentation task in myocardial arterial spin labeled perfusion imaging and to develop methods that measure uncertainty and that adapt the convolution neural network model to a specific false-positive versus false-negative tradeoff. METHODS The Monte Carlo dropout U-Net was trained on data from 22 subjects and tested on data from 6 heart transplant recipients. Manual segmentation and regional myocardial blood flow were available for comparison. We consider 2 global uncertainty measures, named "Dice uncertainty" and "Monte Carlo dropout uncertainty," which were calculated with and without the use of manual segmentation, respectively. Tversky loss function with a hyperparameter β was used to adapt the model to a specific false-positive versus false-negative tradeoff. RESULTS The Monte Carlo dropout U-Net achieved a Dice coefficient of 0.91 ± 0.04 on the test set. Myocardial blood flow measured using automatic segmentations was highly correlated to that measured using the manual segmentation (R² = 0.96). Dice uncertainty and Monte Carlo dropout uncertainty were in good agreement (R² = 0.64). As β increased, the false-positive rate systematically decreased and false-negative rate systematically increased. CONCLUSION We demonstrate the feasibility of deep convolution neural network for automatic segmentation of myocardial arterial spin labeling, with good accuracy. We also introduce 2 simple methods for assessing model uncertainty. Finally, we demonstrate the ability to adapt the convolution neural network model to a specific false-positive versus false-negative tradeoff. These findings are directly relevant to automatic segmentation in quantitative cardiac MRI and are broadly applicable to automatic segmentation problems in diagnostic imaging.
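A Tversky loss with a single hyperparameter β can be sketched as below. The abstract does not give the exact parameterisation, so weighting false positives by β and false negatives by 1 − β is an assumption, chosen because it reproduces the reported trend (larger β, fewer false positives). The function name and smoothing constant are also illustrative.

```python
import numpy as np

def tversky_loss(pred, truth, beta=0.5, eps=1e-7):
    """1 - Tversky index, with FP weighted by beta and FN by (1 - beta).

    pred: predicted foreground probabilities in [0, 1]; truth: binary mask.
    """
    pred = np.asarray(pred, dtype=float).ravel()
    truth = np.asarray(truth, dtype=float).ravel()
    tp = (pred * truth).sum()            # soft true positives
    fp = (pred * (1.0 - truth)).sum()    # soft false positives
    fn = ((1.0 - pred) * truth).sum()    # soft false negatives
    index = (tp + eps) / (tp + beta * fp + (1.0 - beta) * fn + eps)
    return float(1.0 - index)
```

With β = 0.5 the index reduces to the Dice coefficient; raising β makes each false positive cost more than each false negative, which pushes a trained model toward more conservative masks.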
14
Risk-Aware Machine Learning Classifier for Skin Lesion Diagnosis. J Clin Med 2019; 8:E1241. [PMID: 31426482 PMCID: PMC6723257 DOI: 10.3390/jcm8081241] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 06/16/2019] [Revised: 08/12/2019] [Accepted: 08/15/2019] [Indexed: 01/01/2023]
Abstract
Knowing when a machine learning system is not confident about its prediction is crucial in medical domains where safety is critical. Ideally, a machine learning algorithm should make a prediction only when it is highly certain about its competency, and refer the case to physicians otherwise. In this paper, we investigate how Bayesian deep learning can improve the performance of the machine-physician team in the skin lesion classification task. We used the publicly available HAM10000 dataset, which includes samples from seven common skin lesion categories: Melanoma (MEL), Melanocytic Nevi (NV), Basal Cell Carcinoma (BCC), Actinic Keratoses and Intraepithelial Carcinoma (AKIEC), Benign Keratosis (BKL), Dermatofibroma (DF), and Vascular (VASC) lesions. Our experimental results show that Bayesian deep networks can boost the diagnostic performance of the standard DenseNet-169 model from 81.35% to 83.59% without incurring additional parameters or heavy computation. More importantly, a hybrid physician-machine workflow reaches a classification accuracy of 90% while only referring 35% of the cases to physicians. The findings are expected to generalize to other medical diagnosis applications. We believe that the availability of risk-aware machine learning methods will enable a wider adoption of machine learning technology in clinical settings.
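The hybrid workflow, in which the most uncertain cases are referred to physicians and the rest are decided by the model, can be sketched as follows. The function name, the ranking by a single uncertainty score, and the `physician_accuracy` parameter are assumptions for illustration; the abstract does not state how referred cases were scored.

```python
import numpy as np

def hybrid_accuracy(uncertainty, model_correct, refer_fraction, physician_accuracy=1.0):
    """Overall accuracy when the most uncertain cases are referred to physicians.

    physician_accuracy is a hypothetical parameter: the expected fraction of
    referred cases that physicians classify correctly.
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    model_correct = np.asarray(model_correct, dtype=float)
    n = len(uncertainty)
    n_refer = int(round(refer_fraction * n))
    order = np.argsort(-uncertainty)       # most uncertain cases first
    referred = order[:n_refer]
    kept = order[n_refer:]                 # cases the model decides on its own
    total_correct = physician_accuracy * len(referred) + model_correct[kept].sum()
    return float(total_correct / n)
```

Sweeping `refer_fraction` from 0 to 1 traces the trade-off between physician workload and team accuracy that the abstract summarises with its 35% referral operating point.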