1
Balendran A, Benchoufi M, Evgeniou T, Ravaud P. Algorithmovigilance, lessons from pharmacovigilance. NPJ Digit Med 2024;7:270. PMID: 39358559; PMCID: PMC11447237; DOI: 10.1038/s41746-024-01237-y.
Abstract
Artificial Intelligence (AI) systems are increasingly being deployed across various high-risk applications, especially in healthcare. Despite significant attention to evaluating these systems, post-deployment incidents are not uncommon, and effective mitigation strategies remain challenging. Drug safety has a well-established history of assessing, monitoring, understanding, and preventing adverse effects in real-world usage, known as pharmacovigilance. Drawing inspiration from pharmacovigilance methods, we discuss concepts that can be adapted for monitoring AI systems in healthcare. This discussion aims to improve responses to adverse effects and to potential incidents and risks associated with AI deployment, in healthcare and beyond.
Affiliation(s)
- Alan Balendran
- Université Paris Cité and Université Sorbonne Paris Nord, Inserm, INRAE, Center for Research in Epidemiology and StatisticS (CRESS), Paris, France
- Mehdi Benchoufi
- Université Paris Cité and Université Sorbonne Paris Nord, Inserm, INRAE, Center for Research in Epidemiology and StatisticS (CRESS), Paris, France
- Philippe Ravaud
- Université Paris Cité and Université Sorbonne Paris Nord, Inserm, INRAE, Center for Research in Epidemiology and StatisticS (CRESS), Paris, France
- Centre d'Epidémiologie Clinique, AP-HP, Hôpital Hôtel-Dieu, Paris, France
- Columbia University Mailman School of Public Health, Department of Epidemiology, New York, NY, USA
2
Angelini M, Blasilli G, Lenti S, Santucci G. A Visual Analytics Conceptual Framework for Explorable and Steerable Partial Dependence Analysis. IEEE Trans Vis Comput Graph 2024;30:4497-4513. PMID: 37027262; DOI: 10.1109/tvcg.2023.3263739.
Abstract
Machine learning techniques are a driving force for research in various fields, from credit card fraud detection to stock analysis. Recently, a growing interest in increasing human involvement has emerged, with the primary goal of improving the interpretability of machine learning models. Among different techniques, Partial Dependence Plots (PDPs) represent one of the main model-agnostic approaches for interpreting how features influence the prediction of a machine learning model. However, their limitations (i.e., visual interpretation, aggregation of heterogeneous effects, inaccuracy, and computability) can complicate or misdirect the analysis. Moreover, the resulting combinatorial space can be challenging to explore both computationally and cognitively when analyzing the effects of several features at the same time. This article proposes a conceptual framework that enables effective analysis workflows, mitigating these state-of-the-art limitations. The proposed framework allows for exploring and refining computed partial dependences, observing incrementally accurate results, and steering the computation of new partial dependences on user-selected subspaces of the combinatorial, otherwise intractable, space. With this approach, the user can save both computational and cognitive costs, in contrast with the standard monolithic approach that computes all possible combinations of features on all their domains in batch. The framework is the result of a careful design process, validated with experts' knowledge, and it informed the development of a prototype, W4SP, that demonstrates its applicability across the framework's different paths. A case study shows the advantages of the proposed approach.
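The marginal-average computation behind a partial dependence curve, whose batch cost motivates the steerable approach above, can be sketched in a few lines. This is an illustrative re-implementation, not the paper's framework; the dataset and model below are stand-ins.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor

# Toy data and model (assumptions for illustration only).
X, y = make_friedman1(n_samples=300, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

def partial_dependence_1d(model, X, feature, grid_size=20):
    """Sweep one feature over a grid and average predictions over the data.

    This marginal average is exactly what a PDP plots; averaging away
    heterogeneous per-instance effects is one of the limitations discussed.
    """
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_size)
    averages = []
    for value in grid:
        X_clamped = X.copy()
        X_clamped[:, feature] = value      # clamp the feature of interest
        averages.append(model.predict(X_clamped).mean())
    return grid, np.array(averages)

grid, pdp = partial_dependence_1d(model, X, feature=3)
```

A steerable workflow like the one proposed would compute such curves incrementally and only on user-selected feature subspaces, instead of all feature combinations in batch.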
3
Lee S, Kang M. A Data-Driven Approach to Predicting Recreational Activity Participation Using Machine Learning. Res Q Exerc Sport 2024:1-13. PMID: 38875156; DOI: 10.1080/02701367.2024.2343815.
Abstract
Purpose: With the growing popularity of recreational activities, this study aimed to develop prediction models for recreational activity participation and to explore the key factors affecting participation. Methods: A total of 12,712 participants aged 20 years and older were selected from the National Health and Nutrition Examination Survey (NHANES) 2011-2018. The mean age of the sample was 46.86 years (±16.97), with a gender distribution of 6,721 males and 5,991 females. The variables included demographic, physical, and lifestyle variables. The study developed 42 prediction models using six machine learning methods: logistic regression, Support Vector Machine (SVM), decision tree, random forest, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). The relative importance of each variable was evaluated by permutation feature importance. Results: LightGBM was the most effective algorithm for predicting recreational activity participation (accuracy: .838, precision: .783, recall: .967, F1-score: .865, AUC: .826). In particular, prediction performance increased when the demographic and lifestyle datasets were used together. The permutation feature importance analysis based on the top models found education level and moderate-to-vigorous physical activity (MVPA) to be essential variables. Conclusion: These findings demonstrate the potential of a data-driven approach utilizing machine learning in the recreational discipline. Furthermore, the study interpreted the prediction model through feature importance analysis to mitigate the interpretability limitations of machine learning.
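Permutation feature importance, the ranking method the study used, can be sketched with scikit-learn. The synthetic dataset and random-forest model below are stand-ins for the NHANES data and LightGBM, chosen so the example is self-contained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for survey data: the first 3 of 6 features are informative.
X, y = make_classification(n_samples=600, n_features=6, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Permutation importance: shuffle one column at a time on held-out data and
# measure the drop in score; larger drops indicate more important features.
result = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
```

Because the importance is measured on held-out predictions, the same procedure applies to any fitted model, which is why it works for all six algorithms compared in the study.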
4
Gökmen Inan N, Kocadağlı O, Yıldırım D, Meşe İ, Kovan Ö. Multi-class classification of thyroid nodules from automatic segmented ultrasound images: Hybrid ResNet based UNet convolutional neural network approach. Comput Methods Programs Biomed 2024;243:107921. PMID: 37950926; DOI: 10.1016/j.cmpb.2023.107921.
Abstract
BACKGROUND AND OBJECTIVES: Early detection and diagnosis of thyroid nodule types are important because nodules can be treated more effectively in their early stages. Thyroid nodule types are generally stated as atypia of undetermined significance/follicular lesion of undetermined significance (AUS/FLUS), benign follicular, and papillary follicular. The risk of malignancy for AUS/FLUS is typically stated to be between 5% and 15%, while some studies indicate a risk as high as 25%. Without complete histology, it is difficult to classify nodules, and these diagnostic procedures are costly and risky. To minimize laborious workload and misdiagnosis, various AI-based decision support systems have recently been developed. METHODS: In this study, a novel AI-based decision support system was developed for the automated segmentation and classification of thyroid nodule types. The system is based on a hybrid deep-learning procedure that performs automatic thyroid nodule segmentation and classification in sequence. Segmentation is executed with U-Net architectures such as ResUNet and ResUNet++, which integrate feature extraction and upsampling with dropout operations to prevent overfitting. The nodule classification task is carried out by several deep network architectures, namely VGG-16, DenseNet121, ResNet-50, and Inception-ResNet-v2, evaluated with criteria such as Intersection over Union (IoU), Dice coefficient, accuracy, precision, and recall. RESULTS: A total of 880 patients aged 10 to 90 years were included, with their ultrasound images and demographics. The experimental evaluations showed that ResUNet++ demonstrated excellent segmentation outcomes, attaining a Dice coefficient of 92.4% and a mean IoU of 89.7%. ResNet-50 and Inception-ResNet-v2, trained on the images segmented with the U-Nets, achieved the highest classification accuracies of 96.6% and 95.0%, respectively. In addition, ResNet-50 and Inception-ResNet-v2 classified AUS/FLUS from the segmented images with AUCs of 97.0% and 96.0%, respectively. CONCLUSIONS: The proposed AI-based decision support system improves the automatic segmentation of AUS/FLUS and outperforms available approaches in the literature with respect to accuracy, Jaccard index, and Dice loss. The system has great potential for clinical use by both radiologists and surgeons.
Affiliation(s)
- Neslihan Gökmen Inan
- College of Engineering, Computer Engineering Department, Koç University, Türkiye
- Ozan Kocadağlı
- Department of Statistics, Faculty of Science and Letters, Mimar Sinan Fine Arts University, Silahsör Cad. No. 81, 34380 Bomonti/Sisli, Istanbul, Türkiye
- İsmail Meşe
- Department of Radiology, Erenkoy Mental Health and Neurology Training and Research Hospital, Health Sciences University, Türkiye
- Özge Kovan
- Vocational School of Health Services, Medical Imaging Techniques, Acıbadem University, Türkiye
5
Delaforge A, Aze J, Bringay S, Mollevi C, Sallaberry A, Servajean M. EBBE-Text: Explaining Neural Networks by Exploring Text Classification Decision Boundaries. IEEE Trans Vis Comput Graph 2023;29:4154-4171. PMID: 35724275; DOI: 10.1109/tvcg.2022.3184247.
Abstract
While neural networks (NNs) have been successfully applied to many NLP tasks, the way they function is often difficult to interpret. In this article, we focus on binary text classification via NNs and propose a new tool that includes a visualization of the decision boundary and of the distances of data elements to this boundary, thereby increasing the interpretability of NNs. Our approach uses two innovative views: (1) an overview of the text representation space and (2) a local view allowing data exploration around the decision boundary in various localities of this representation space. These views are integrated into a visual platform, EBBE-Text, which also contains state-of-the-art visualizations of NN representation spaces and several kinds of information obtained from the classification process. The views are linked through numerous interactive functionalities that enable easy exploration of texts and classification results. A user study shows the effectiveness of the visual encoding, and a case study illustrates the benefits of using our tool for the analysis of the classifications obtained with several recent NNs and two datasets.
6
Ma H, Prosperino D, Räth C. A novel approach to minimal reservoir computing. Sci Rep 2023;13:12970. PMID: 37563235; PMCID: PMC10415382; DOI: 10.1038/s41598-023-39886-w.
Abstract
Reservoir computers are powerful machine learning algorithms for predicting nonlinear systems. Unlike traditional feedforward neural networks, they work on small training data sets, operate with linear optimization, and therefore require minimal computational resources. However, the traditional reservoir computer uses random matrices to define the underlying recurrent neural network and has a large number of hyperparameters that need to be optimized. Recent approaches show that the randomness can be removed by running regressions on a large library of linear and nonlinear combinations constructed from the input data, their time lags, and polynomials thereof. However, for high-dimensional and nonlinear data, the number of these combinations explodes. Here, we show that a few simple changes to the traditional reservoir computer architecture, which further minimize computational resources, lead to significant and robust improvements in short- and long-term predictive performance compared to similar models, while requiring minimal training data set sizes.
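The regression-on-a-library idea the abstract alludes to (linear and nonlinear combinations of time-lagged inputs, fitted with a linear readout) can be sketched as follows. The Euler-integrated Lorenz data, lag count, and ridge strength are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def lorenz(n, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Euler-integrated Lorenz trajectory (illustrative data source)."""
    u = np.array([1.0, 1.0, 1.0])
    out = np.empty((n, 3))
    for i in range(n):
        x, y, z = u
        u = u + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
        out[i] = u
    return out

U = lorenz(2000)
lags = 2

# Library of combinations: time-lagged states plus all their pairwise products.
Y = U[lags:]                                               # one-step-ahead targets
lin = np.hstack([U[lags - 1 - k : len(U) - 1 - k] for k in range(lags)])
i_idx, j_idx = np.triu_indices(lin.shape[1])
quad = lin[:, i_idx] * lin[:, j_idx]
Phi = np.hstack([np.ones((len(lin), 1)), lin, quad])       # constant + linear + quadratic

# Ridge readout -- the only trained part -- solved stably as an
# augmented least-squares problem.
lam = 1e-8
A = np.vstack([Phi, np.sqrt(lam) * np.eye(Phi.shape[1])])
b = np.vstack([Y, np.zeros((Phi.shape[1], Y.shape[1]))])
W = np.linalg.lstsq(A, b, rcond=None)[0]
mse = np.mean((Phi @ W - Y) ** 2)
```

For three-dimensional data with two lags the library has only 28 terms, but, as the abstract notes, the count explodes combinatorially as dimension and polynomial order grow.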
Affiliation(s)
- Haochun Ma
- Department of Physics, Ludwig-Maximilians-Universität, Schellingstraße 4, 80799 Munich, Germany
- Davide Prosperino
- Department of Physics, Ludwig-Maximilians-Universität, Schellingstraße 4, 80799 Munich, Germany
- Christoph Räth
- Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für KI Sicherheit, Wilhelm-Runge-Straße 10, 89081 Ulm, Germany
7
Ma H, Prosperino D, Haluszczynski A, Räth C. Efficient forecasting of chaotic systems with block-diagonal and binary reservoir computing. Chaos 2023;33:2895979. PMID: 37307160; DOI: 10.1063/5.0151290.
Abstract
The prediction of complex nonlinear dynamical systems with the help of machine learning has become increasingly popular in different areas of science. In particular, reservoir computers, also known as echo-state networks, have turned out to be a very powerful approach, especially for the reproduction of nonlinear systems. The reservoir, the key component of this method, is usually constructed as a sparse, random network that serves as a memory for the system. In this work, we introduce block-diagonal reservoirs, meaning that a reservoir can be composed of multiple smaller reservoirs, each with its own dynamics. Furthermore, we remove the randomness of the reservoir by using matrices of ones for the individual blocks. This breaks with the widespread interpretation of the reservoir as a single network. Using the Lorenz and Halvorsen systems as examples, we analyze the performance of block-diagonal reservoirs and their sensitivity to hyperparameters. We find that the performance is comparable to that of sparse random networks and discuss the implications with regard to scalability, explainability, and hardware realizations of reservoir computers.
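A block-diagonal reservoir built from matrices of ones, as described in the abstract, can be constructed in a few lines; the block count, block size, and spectral-radius rescaling below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def block_diagonal_reservoir(n_blocks=4, block_size=50, spectral_radius=0.9):
    """Deterministic reservoir: ones-blocks on the diagonal, zeros elsewhere."""
    n = n_blocks * block_size
    A = np.zeros((n, n))
    for b in range(n_blocks):
        s = b * block_size
        A[s:s + block_size, s:s + block_size] = 1.0   # one sub-reservoir per block
    # Rescale so the largest eigenvalue magnitude equals the target radius,
    # a common conditioning step for echo-state networks.
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    return A * (spectral_radius / radius)

A = block_diagonal_reservoir()
```

Because the blocks never couple, each sub-reservoir evolves independently, which is what makes this construction attractive for parallel and hardware realizations.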
Affiliation(s)
- Haochun Ma
- Department of Physics, Ludwig-Maximilians-Universität, Schellingstraße 4, 80799 Munich, Germany
- Allianz Global Investors, risklab, Seidlstraße 24, 80335 Munich, Germany
- Davide Prosperino
- Department of Physics, Ludwig-Maximilians-Universität, Schellingstraße 4, 80799 Munich, Germany
- Allianz Global Investors, risklab, Seidlstraße 24, 80335 Munich, Germany
- Christoph Räth
- Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für KI Sicherheit, Wilhelm-Runge-Straße 10, 89081 Ulm, Germany
8
Tyagi A, Xie C, Mueller K. NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural Network Synthesis. IEEE Trans Vis Comput Graph 2023;29:299-309. PMID: 36166525; DOI: 10.1109/tvcg.2022.3209361.
Abstract
The success of deep learning (DL) can be attributed to hours of parameter and architecture tuning by human experts. Neural Architecture Search (NAS) techniques aim to solve this problem by automating the search procedure for DNN architectures, making it possible for non-experts to work with DNNs. In particular, One-Shot NAS techniques have recently gained popularity as they reduce the search time of NAS. One-Shot NAS works by training a large template network, which includes all candidate networks, through parameter sharing; its components are then ranked by evaluating randomly chosen candidate architectures. However, as these search models become increasingly powerful and diverse, they become harder to understand. Consequently, even when the search results work well, it is hard to identify search biases and control the search progression; hence the need for explainability and human-in-the-loop (HIL) One-Shot NAS. To alleviate these problems, we present NAS-Navigator, a visual analytics (VA) system aiming to address three problems with One-Shot NAS: explainability, HIL design, and performance improvements over existing state-of-the-art (SOTA) techniques. NAS-Navigator puts full control of NAS back in the hands of users while keeping the benefits of automated search, thus assisting non-expert users. Analysts can use their domain knowledge, aided by cues from the interface, to guide the search. Evaluation results confirm that the performance of our improved One-Shot NAS algorithm is comparable to other SOTA techniques, and adding visual analytics through NAS-Navigator further improves search time and performance. We designed our interface in collaboration with several deep learning researchers and evaluated NAS-Navigator through a controlled experiment and expert interviews.
9
Leveraging Deep Learning for Designing Healthcare Analytics Heuristic for Diagnostics. Neural Process Lett 2023;55:53-79. PMID: 33551665; PMCID: PMC7852051; DOI: 10.1007/s11063-021-10425-w.
Abstract
Healthcare informatics has been widely discussed since the early 21st century. With the evolution of new computing technologies, huge amounts of data are produced in healthcare, opening several research areas. Managing this massive data while extracting knowledge for decision making is the main concern today. For this task, researchers are exploring big data analytics, deep learning (an advanced form of machine learning based on deep neural networks), predictive analytics, and various other algorithms to bring innovation to healthcare. Given all these innovations, it is fair to say that disease prediction, together with anticipation of its cure, is no longer unrealistic. Dengue fever (DF) and, more recently, COVID-19 are outbreaks of lethal infectious diseases, and diagnosis at all stages is crucial to decrease mortality rates. In the case of diabetes, clinicians and experts find timely diagnosis, and analysis of the chances of developing underlying diseases, challenging. In this paper, Louvain Mani-Hierarchical Fold Learning healthcare analytics, a hybrid deep learning technique, is proposed for medical diagnostics and is tested and validated using a real-time dataset of 104 instances of patients with dengue fever, made available by Holy Family Hospital, Pakistan, and 810 instances found on GitHub for infectious diseases, including prognoses of COVID-19, SARS, ARDS, Pneumocystis, Streptococcus, Chlamydophila, Klebsiella, Legionella, Lipoid, etc. The technique showed a maximum correlation of 0.952 (Spearman) between two clusters when applied to 240 instances extracted from a comorbidities diagnostic data model derived from 15,696 endocrine records of multiple visits of 100 patients identified by a unique ID. Accuracy for the induced rules, evaluated by Laplace (Fig. 8), is 0.727, 0.701, and 0.203 for 41, 18, and 24 rules, respectively. The endocrine diagnostic data were made available by Shifa International Hospital, Islamabad, Pakistan. Our results show that this algorithm may in future be tested for diagnostics on healthcare big data.
10
Wang X, Chen W, Xia J, Wen Z, Zhu R, Schreck T. HetVis: A Visual Analysis Approach for Identifying Data Heterogeneity in Horizontal Federated Learning. IEEE Trans Vis Comput Graph 2023;29:310-319. PMID: 36197857; DOI: 10.1109/tvcg.2022.3209347.
Abstract
Horizontal federated learning (HFL) enables distributed clients to train a shared model while preserving their data privacy. In training high-quality HFL models, data heterogeneity among clients is one of the major concerns. However, due to security constraints and the complexity of deep learning models, it is challenging to investigate data heterogeneity across different clients. To address this issue, based on a requirement analysis, we developed a visual analytics tool, HetVis, for participating clients to explore data heterogeneity. We identify data heterogeneity by comparing the prediction behaviors of the global federated model and the stand-alone model trained with local data. A context-aware clustering of the inconsistent records is then performed to provide a summary of data heterogeneity. Combined with the proposed comparison techniques, we developed a novel set of visualizations to identify heterogeneity issues in HFL. We designed three case studies to show how HetVis can assist client analysts in understanding different types of heterogeneity issues. Expert reviews and a comparative study demonstrate the effectiveness of HetVis.
11
Zhang X, Ono JP, Song H, Gou L, Ma KL, Ren L. SliceTeller: A Data Slice-Driven Approach for Machine Learning Model Validation. IEEE Trans Vis Comput Graph 2023;29:842-852. PMID: 36179005; DOI: 10.1109/tvcg.2022.3209465.
Abstract
Real-world machine learning applications need to be thoroughly evaluated to meet critical product requirements for model release, to ensure fairness for different groups or individuals, and to achieve consistent performance in various scenarios. For example, in autonomous driving, an object classification model should achieve high detection rates under different conditions of weather, distance, etc. Similarly, in the financial setting, credit-scoring models must not discriminate against minority groups. These conditions or groups are called "data slices". In product MLOps cycles, product developers must identify such critical data slices and adapt models to mitigate data slice problems. Discovering where models fail, understanding why they fail, and mitigating these problems are therefore essential tasks in the MLOps life-cycle. In this paper, we present SliceTeller, a novel tool that allows users to debug, compare, and improve machine learning models driven by critical data slices. SliceTeller automatically discovers problematic slices in the data and helps the user understand why models fail. More importantly, we present an efficient algorithm, SliceBoosting, to estimate trade-offs when prioritizing the optimization of certain slices. Furthermore, our system empowers model developers to compare and analyze different model versions during model iterations, allowing them to choose the version best suited for their applications. We evaluate our system with three use cases, including two real-world use cases of product development, to demonstrate the power of SliceTeller in the debugging and improvement of product-quality ML models.
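The core of slice-driven validation, computing a metric per data slice and flagging the weak ones, can be sketched as follows; the column names and threshold are hypothetical illustrations, not SliceTeller's actual interface.

```python
import pandas as pd

# Held-out predictions joined with slice metadata. "weather" and "correct"
# are invented columns; a real slice could be any condition or subgroup.
results = pd.DataFrame({
    "weather": ["sunny", "sunny", "rain", "rain", "fog", "fog", "fog"],
    "correct": [1, 1, 1, 0, 0, 0, 1],
})

# Per-slice accuracy, then flag slices below a (hypothetical) release bar.
slice_accuracy = results.groupby("weather")["correct"].mean()
weak_slices = slice_accuracy[slice_accuracy < 0.75]
```

Automatic slice discovery goes further: instead of grouping by one predefined column, it searches combinations of conditions for underperforming subsets.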
12
Fraternali P, Milani F, Torres RN, Zangrando N. Black-box error diagnosis in Deep Neural Networks for computer vision: a survey of tools. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-08100-9.
13
Wang J, Zhang W, Yang H, Yeh CCM, Wang L. Visual Analytics for RNN-Based Deep Reinforcement Learning. IEEE Trans Vis Comput Graph 2022;28:4141-4155. PMID: 33929961; DOI: 10.1109/tvcg.2021.3076749.
Abstract
Deep reinforcement learning (DRL) aims to train an autonomous agent to interact with a pre-defined environment and strives to achieve specific goals through deep neural networks (DNNs). Recurrent neural network (RNN) based DRL has demonstrated superior performance, as RNNs can effectively capture the temporal evolution of the environment and respond with proper agent actions. However, apart from the outstanding performance, little is known about how RNNs understand the environment internally and what has been memorized over time. Revealing these details is extremely important for deep learning experts to understand and improve DRL models, yet it is challenging due to the complicated data transformations inside these models. In this article, we propose the Deep Reinforcement Learning Interactive Visual Explorer (DRLIVE), a visual analytics system to effectively explore, interpret, and diagnose RNN-based DRL. Focusing on DRL agents trained for different Atari games, DRLIVE accomplishes three tasks: game episode exploration, RNN hidden/cell state examination, and interactive model perturbation. Using the system, one can flexibly explore a DRL agent through interactive visualizations, discover interpretable RNN cells by prioritizing RNN hidden/cell states with a set of metrics, and further diagnose the DRL model by interactively perturbing its inputs. Through concrete studies with multiple deep learning experts, we validated the efficacy of DRLIVE.
14
Yan F, Wen S, Nepal S, Paris C, Xiang Y. Explainable machine learning in cybersecurity: A survey. Int J Intell Syst 2022. DOI: 10.1002/int.23088.
Affiliation(s)
- Feixue Yan
- School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Melbourne, Victoria, Australia
- Distributed Systems Security, CSIRO's Data61, Sydney, New South Wales, Australia
- Sheng Wen
- School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Melbourne, Victoria, Australia
- Surya Nepal
- Distributed Systems Security, CSIRO's Data61, Sydney, New South Wales, Australia
- Cecile Paris
- Knowledge Discovery and Management, CSIRO's Data61, Sydney, New South Wales, Australia
- Yang Xiang
- School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Melbourne, Victoria, Australia
15
Eldrandaly KA, Abdel-Basset M, Ibrahim M, Abdel-Aziz NM. Explainable and secure artificial intelligence: taxonomy, cases of study, learned lessons, challenges and future directions. Enterp Inf Syst 2022. DOI: 10.1080/17517575.2022.2098537.
Affiliation(s)
- Mahmoud Ibrahim
- Faculty of Computers and Informatics, Zagazig University, Zagazig, Egypt
16
SDA-Vis: A Visualization System for Student Dropout Analysis Based on Counterfactual Exploration. Appl Sci (Basel) 2022. DOI: 10.3390/app12125785.
Abstract
High and persistent dropout rates represent one of the biggest challenges for improving the efficiency of the educational system, particularly in underdeveloped countries. A range of features influence college dropout, some belonging to the educational field and others to non-educational fields. Understanding the interplay of these variables in identifying a student as a potential dropout could help decision makers interpret the situation and decide which corrective actions to take to reduce dropout rates. This paper presents SDA-Vis, a visualization system that supports counterfactual explanations for student dropout dynamics, considering various academic, social, and economic variables. In contrast to conventional systems, our approach provides information about feature-perturbed versions of a student using counterfactual explanations. SDA-Vis comprises a set of linked views that allow users to identify the variable alterations needed to change predefined student situations; this involves perturbing the variables of a dropout student to obtain synthetic non-dropout students. SDA-Vis has been developed under the guidance and supervision of domain experts, in line with a set of analytical objectives. We demonstrate the usefulness of SDA-Vis through case studies run in collaboration with domain experts, using a real data set from a Latin American university. The analysis reveals the effectiveness of SDA-Vis in identifying students at risk of dropping out and in proposing corrective actions, even for particular cases that have not been flagged as at risk by the traditional tools that experts use.
17
Xuan X, Zhang X, Kwon OH, Ma KL. VAC-CNN: A Visual Analytics System for Comparative Studies of Deep Convolutional Neural Networks. IEEE Trans Vis Comput Graph 2022;28:2326-2337. PMID: 35389868; DOI: 10.1109/tvcg.2022.3165347.
Abstract
The rapid development of Convolutional Neural Networks (CNNs) in recent years has triggered significant breakthroughs in many machine learning (ML) applications. The ability to understand and compare the various CNN models available is thus essential. The conventional approach of visualizing each model's quantitative features, such as classification accuracy and computational complexity, is not sufficient for a deeper understanding and comparison of the behaviors of different models. Moreover, most existing tools for assessing CNN behaviors only support comparison between two models and lack the flexibility to customize the analysis tasks according to user needs. This paper presents a visual analytics system, VAC-CNN (Visual Analytics for Comparing CNNs), that supports the in-depth inspection of a single CNN model as well as comparative studies of two or more models. The ability to compare a larger number of (e.g., tens of) models especially distinguishes our system from previous ones. With carefully designed model visualization and explanation support, VAC-CNN facilitates a highly interactive workflow that promptly presents both quantitative and qualitative information at each analysis stage. We demonstrate VAC-CNN's effectiveness in assisting novice ML practitioners in evaluating and comparing multiple CNN models through two use cases and a preliminary evaluation study using image classification tasks on the ImageNet dataset.
Collapse
|
18
|
ConfusionVis: Comparative evaluation and selection of multi-class classifiers based on confusion matrices. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108651] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
19
|
Hinterreiter A, Ruch P, Stitz H, Ennemoser M, Bernard J, Strobelt H, Streit M. ConfusionFlow: A Model-Agnostic Visualization for Temporal Analysis of Classifier Confusion. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1222-1236. [PMID: 32746284 DOI: 10.1109/tvcg.2020.3012063] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifiers' performances, evaluate their learning behavior over time, and compare different models. Typically, this analysis is based on single-number performance measures such as accuracy. A more detailed evaluation of classifiers is possible by inspecting class errors. The confusion matrix is an established way for visualizing these class errors, but it was not designed with temporal or comparative analysis in mind. More generally, established performance analysis systems do not allow a combined temporal and comparative analysis of class-level information. To address this issue, we propose ConfusionFlow, an interactive, comparative visualization tool that combines the benefits of class confusion matrices with the visualization of performance characteristics over time. ConfusionFlow is model-agnostic and can be used to compare performances for different model types, model architectures, and/or training and test datasets. We demonstrate the usefulness of ConfusionFlow in a case study on instance selection strategies in active learning. We further assess the scalability of ConfusionFlow and present a use case in the context of neural network pruning.
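The core data structure this abstract describes, one confusion matrix per training epoch so that any (true class, predicted class) cell can be traced over time, can be sketched as follows; the logged predictions below are invented for illustration.

```python
# Toy version of ConfusionFlow's underlying data: a confusion matrix per
# epoch, enabling a temporal view of any single class-confusion cell.

def confusion_matrix(y_true, y_pred, n_classes):
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

y_true = [0, 0, 1, 1, 2, 2]
preds_per_epoch = [
    [0, 1, 0, 1, 1, 2],   # epoch 1: several class confusions
    [0, 0, 1, 1, 1, 2],   # epoch 2: class 2 still confused once
    [0, 0, 1, 1, 2, 2],   # epoch 3: all correct
]
flow = [confusion_matrix(y_true, p, 3) for p in preds_per_epoch]

# Temporal trace of one cell: class-1 items mispredicted as class 0.
cell_1_0 = [m[1][0] for m in flow]
# Single-number accuracy per epoch, recovered from the matrix diagonal.
accuracy = [sum(m[i][i] for i in range(3)) / len(y_true) for m in flow]
```

Plotting `cell_1_0` over epochs is exactly the kind of class-level temporal view that a single accuracy curve cannot provide.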
Collapse
|
20
|
Zytek A, Liu D, Vaithianathan R, Veeramachaneni K. Sibyl: Understanding and Addressing the Usability Challenges of Machine Learning In High-Stakes Decision Making. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1161-1171. [PMID: 34587081 DOI: 10.1109/tvcg.2021.3114864] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Machine learning (ML) is being applied to a diverse and ever-growing set of domains. In many cases, domain experts - who often have no expertise in ML or data science - are asked to use ML predictions to make high-stakes decisions. Multiple ML usability challenges can arise as a result, such as lack of user trust in the model, inability to reconcile human-ML disagreement, and ethical concerns about oversimplification of complex problems to a single algorithm output. In this paper, we investigate the ML usability challenges that arise in the domain of child welfare screening through a series of collaborations with child welfare screeners. Following the iterative design process among the ML scientists, visualization researchers, and domain experts (child screeners), we first identified four key ML challenges and honed in on one promising explainable ML technique to address them (local factor contributions). Then we implemented and evaluated our visual analytics tool, Sibyl, to increase the interpretability and interactivity of local factor contributions. The effectiveness of our tool is demonstrated by two formal user studies with 12 non-expert participants and 13 expert participants, respectively. Valuable feedback was collected, from which we composed a list of design implications as a useful guideline for researchers who aim to develop an interpretable and interactive visualization tool for ML prediction models deployed for child welfare screeners and other similar domain experts.
Collapse
|
21
|
Oppermann M, Munzner T. VizSnippets: Compressing Visualization Bundles Into Representative Previews for Browsing Visualization Collections. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:747-757. [PMID: 34596545 DOI: 10.1109/tvcg.2021.3114841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Visualization collections, accessed by platforms such as Tableau Online or Power BI, are used by millions of people to share and access diverse analytical knowledge in the form of interactive visualization bundles. Result snippets, compact previews of these bundles, are presented to users to help them identify relevant content when browsing collections. Our engagement with Tableau product teams and review of existing snippet designs on five platforms showed us that current practices fail to help people judge the relevance of bundles because they include only the title and one image. Users frequently need to undertake the time-consuming endeavour of opening a bundle within its visualization system to examine its many views and dashboards. In response, we contribute the first systematic approach to visualization snippet design. We propose a framework for snippet design that addresses eight key challenges that we identify. We present a computational pipeline to compress the visual and textual content of bundles into representative previews that is adaptive to a provided pixel budget and provides high information density with multiple images and carefully chosen keywords. We also reflect on the method of visual inspection through random sampling to gain confidence in model and parameter choices.
Collapse
|
22
|
Williams RD, Reps JM, Kors JA, Ryan PB, Steyerberg E, Verhamme KM, Rijnbeek PR. Using Iterative Pairwise External Validation to Contextualize Prediction Model Performance: A Use Case Predicting 1-Year Heart Failure Risk in Patients with Diabetes Across Five Data Sources. Drug Saf 2022; 45:563-570. [PMID: 35579818 PMCID: PMC9114056 DOI: 10.1007/s40264-022-01161-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 02/09/2022] [Indexed: 01/28/2023]
Abstract
INTRODUCTION External validation of prediction models is increasingly being seen as a minimum requirement for acceptance in clinical practice. However, the lack of interoperability of healthcare databases has been the biggest barrier to this occurring on a large scale. Recent improvements in database interoperability enable a standardized analytical framework for model development and external validation. External validation of a model in a new database lacks context unless it can be compared with a benchmark in that database. Iterative pairwise external validation (IPEV) is a framework that uses a rotating model development and validation approach to contextualize the assessment of performance across a network of databases. As a use case, we predicted 1-year risk of heart failure in patients with type 2 diabetes mellitus. METHODS The method follows a two-step process involving (1) development of baseline and data-driven models in each database according to best practices and (2) validation of these models across the remaining databases. We introduce a heatmap visualization that supports the assessment of the internal and external model performance in all available databases. As a use case, we developed and validated models to predict 1-year risk of heart failure in patients initializing a second pharmacological intervention for type 2 diabetes mellitus. We leveraged the power of the Observational Medical Outcomes Partnership common data model to create an open-source software package to increase the consistency, speed, and transparency of this process. RESULTS A total of 403,187 patients from five databases were included in the study. We developed five models that, when assessed internally, had a discriminative performance ranging from 0.73 to 0.81 area under the receiver operating characteristic curve with acceptable calibration. When we externally validated these models in a new database, three models achieved consistent performance and in context often performed similarly to models developed in the database itself. The visualization of IPEV provided valuable insights. From this, we identified the model developed in the Commercial Claims and Encounters (CCAE) database as the best performing model overall. CONCLUSION Using IPEV lends weight to the model development process. The rotation of development through multiple databases provides context to model assessment, leading to improved understanding of transportability and generalizability. The inclusion of a baseline model in all modelling steps provides further context to the performance gains of increasing model complexity. The CCAE model was identified as a candidate for clinical use. The use case demonstrates that IPEV provides a huge opportunity in a new era of standardised data and analytics to improve insight into and trust in prediction models at an unprecedented scale.
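The rotation at the heart of IPEV can be sketched as a small skeleton: develop a model in each database, then validate it on every database, yielding the matrix behind the paper's heatmap view. Only CCAE is named in the abstract; the other database names and both stand-in functions are placeholders, not the study's actual data sources or models.

```python
# Skeleton of iterative pairwise external validation (IPEV): every database
# takes a turn as the development site, and each resulting model is scored
# on all databases. Diagonal entries are internal validation.
from itertools import product

databases = ["CCAE", "DB_2", "DB_3", "DB_4", "DB_5"]  # CCAE from the paper; rest placeholders

def develop(db):
    """Stand-in for fitting a prediction model on one database."""
    return {"trained_on": db}

def validate(model, db):
    """Stand-in for computing the model's AUC on a database."""
    return 0.75  # a real implementation returns a measured AUC

auc = {(dev, val): validate(develop(dev), val)
       for dev, val in product(databases, repeat=2)}
# auc[("CCAE", "DB_2")] is an external validation; auc[("CCAE", "CCAE")] is internal.
```

Rendering `auc` as a grid, development database on one axis and validation database on the other, reproduces the heatmap structure the abstract describes.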
Collapse
Affiliation(s)
- Ross D. Williams
- Department of Medical Informatics, Erasmus MC, University Medical Center Rotterdam, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
| | - Jenna M. Reps
- Janssen Research and Development, Titusville, NJ USA
| | - Jan A. Kors
- Department of Medical Informatics, Erasmus MC, University Medical Center Rotterdam, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
| | | | - Ewout Steyerberg
- Department of Public Health, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Katia M. Verhamme
- Department of Medical Informatics, Erasmus MC, University Medical Center Rotterdam, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
| | - Peter R. Rijnbeek
- Department of Medical Informatics, Erasmus MC, University Medical Center Rotterdam, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
| |
Collapse
|
23
|
Wang X, He J, Jin Z, Yang M, Wang Y, Qu H. M2Lens: Visualizing and Explaining Multimodal Models for Sentiment Analysis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:802-812. [PMID: 34587037 DOI: 10.1109/tvcg.2021.3114794] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Multimodal sentiment analysis aims to recognize people's attitudes from multiple communication channels such as verbal content (i.e., text), voice, and facial expressions. It has become a vibrant and important research topic in natural language processing. Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels. However, current multimodal models with strong performance are often deep-learning-based techniques and work like black boxes. It is not clear how models utilize multimodal information for sentiment predictions. Despite recent advances in techniques for enhancing the explainability of machine learning models, these techniques often target unimodal scenarios (e.g., images, sentences), and little research has been done on explaining multimodal models. In this paper, we present an interactive visual analytics system, M2Lens, to visualize and explain multimodal models for sentiment analysis. M2Lens provides explanations on intra- and inter-modal interactions at the global, subset, and local levels. Specifically, it summarizes the influence of three typical interaction types (i.e., dominance, complement, and conflict) on the model predictions. Moreover, M2Lens identifies frequent and influential multimodal features and supports the multi-faceted exploration of model behaviors from language, acoustic, and visual modalities. Through two case studies and expert interviews, we demonstrate that our system can help users gain deep insights into the multimodal models for sentiment analysis.
Collapse
|
24
|
Meng L, Wei Y, Pan R, Zhou S, Zhang J, Chen W. VADAF: Visualization for Abnormal Client Detection and Analysis in Federated Learning. ACM T INTERACT INTEL 2021. [DOI: 10.1145/3426866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
Federated Learning (FL) provides a powerful solution to distributed machine learning on a large corpus of decentralized data. It ensures privacy and security by performing computation on devices (which we refer to as clients) based on local data to improve the shared global model. However, the inaccessibility of the data and the invisibility of the computation make it challenging to interpret and analyze the training process, especially to distinguish potential client anomalies. Identifying these anomalies can help experts diagnose and improve FL models. For this reason, we propose a visual analytics system, VADAF, to depict the training dynamics and facilitate analyzing potential client anomalies. Specifically, we design a visualization scheme that supports massive training dynamics in the FL environment. Moreover, we introduce an anomaly detection method to detect potential client anomalies, which are further analyzed through both visual inspection and objective estimation of the client models. Three case studies have demonstrated the effectiveness of our system in understanding the FL training process and supporting abnormal client detection and analysis.
Collapse
Affiliation(s)
- Linhao Meng
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Yating Wei
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Rusheng Pan
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Shuyue Zhou
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Jianwei Zhang
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Wei Chen
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| |
Collapse
|
25
|
|
26
|
Gärtler M, Khaydarov V, Klöpper B, Urbas L. The Machine Learning Life Cycle in Chemical Operations – Status and Open Challenges. CHEM-ING-TECH 2021. [DOI: 10.1002/cite.202100134] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Affiliation(s)
- Marco Gärtler
- ABB Corporate Research Center Wallstadter Straße 59 68526 Ladenburg Germany
| | - Valentin Khaydarov
- Technische Universität Dresden Professur für Prozessleittechnik 01062 Dresden Germany
| | - Benjamin Klöpper
- ABB Corporate Research Center Wallstadter Straße 59 68526 Ladenburg Germany
| | - Leon Urbas
- Technische Universität Dresden Professur für Prozessleittechnik 01062 Dresden Germany
| |
Collapse
|
27
|
Stetson PD, Cantor MN, Gonen M. When Predictive Models Collide. JCO Clin Cancer Inform 2021; 4:547-550. [PMID: 32543898 DOI: 10.1200/cci.20.00024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Affiliation(s)
- Peter D Stetson
- Department of Medicine, Digital Informatics and Technology Solutions, Memorial Sloan Kettering Cancer Center, New York, NY
| | - Michael N Cantor
- Department of Medicine, Digital Informatics and Technology Solutions, Memorial Sloan Kettering Cancer Center, New York, NY
| | - Mithat Gonen
- Department of Medicine, Digital Informatics and Technology Solutions, Memorial Sloan Kettering Cancer Center, New York, NY
| |
Collapse
|
28
|
Classification of Explainable Artificial Intelligence Methods through Their Output Formats. MACHINE LEARNING AND KNOWLEDGE EXTRACTION 2021. [DOI: 10.3390/make3030032] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Machine and deep learning have proven their utility to generate data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret. Consequently, many scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension—the output formats. The reviewed scientific papers were retrieved by conducting an initial search on Google Scholar with the keywords “explainable artificial intelligence”; “explainable machine learning”; and “interpretable machine learning”. A subsequent iterative search was carried out by checking the bibliography of these articles. The addition of the dimension of the explanation format makes the proposed classification system a practical tool for scholars, supporting them to select the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, the existing XAI methods provide several solutions to meet the requirements that differ considerably between the users, problems and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, thus the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the formats of explanations and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields, and the new regulations.
Collapse
|
29
|
Kumar N, Narayan Das N, Gupta D, Gupta K, Bindra J. Efficient Automated Disease Diagnosis Using Machine Learning Models. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:9983652. [PMID: 34035886 PMCID: PMC8101482 DOI: 10.1155/2021/9983652] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/24/2021] [Revised: 04/07/2021] [Accepted: 04/24/2021] [Indexed: 01/01/2023]
Abstract
Recently, many researchers have designed various automated diagnosis models using supervised learning models. An early diagnosis of disease may reduce the death rate from these diseases. In this paper, an efficient automated disease diagnosis model is designed using machine learning models, targeting three critical diseases: coronavirus, heart disease, and diabetes. In the proposed model, the data are entered into an Android app, the analysis is then performed in a real-time database using a pretrained machine learning model which was trained on the same dataset and deployed in Firebase, and finally, the disease detection result is shown in the Android app. Logistic regression is used to carry out the computation for prediction. Early detection can help in identifying the risk of coronavirus, heart disease, and diabetes. Comparative analysis indicates that the proposed model can help doctors give timely medications for treatment.
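The prediction step described above, a pretrained logistic-regression model applied to user-entered values, reduces to a weighted sum passed through a sigmoid. The weights, bias, and feature names below are invented for illustration and are not the model from the paper.

```python
# Minimal sketch of logistic-regression inference as used for risk scoring:
# probability = sigmoid(bias + sum of weight * feature value).
import math

weights = {"age": 0.04, "bmi": 0.09, "glucose": 0.03}  # illustrative coefficients
bias = -8.0

def predict_risk(patient):
    z = bias + sum(weights[f] * patient[f] for f in weights)
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid -> probability of disease

p = predict_risk({"age": 55, "bmi": 31, "glucose": 140})
```

In the deployed pipeline the abstract describes, the coefficients would come from the trained model hosted in the real-time database, and `p` would be thresholded to produce the result shown in the app.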
Collapse
Affiliation(s)
- Naresh Kumar
- Department of Computer Science & Engineering, Maharaja Surajmal Institute of Technology, C-4, Janakpuri, New Delhi 110058, India
| | - Nripendra Narayan Das
- Department of Information Technology, School of Computing and Information Technology, Manipal University Jaipur, Jaipur, Rajasthan 303007, India
| | - Deepali Gupta
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
| | - Kamali Gupta
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
| | - Jatin Bindra
- Department of Computer Science & Engineering, Maharaja Surajmal Institute of Technology, C-4, Janakpuri, New Delhi 110058, India
| |
Collapse
|
30
|
Chatzimparmpas A, Martins RM, Kucher K, Kerren A. StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1547-1557. [PMID: 33048687 DOI: 10.1109/tvcg.2020.3030352] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In machine learning (ML), ensemble methods-such as bagging, boosting, and stacking-are widely established approaches that regularly achieve top-notch predictive performance. Stacking (also called "stacked generalization") is an ensemble method that combines heterogeneous base models, arranged in at least one layer, and then employs another metamodel to summarize the predictions of those models. Although it may be a highly effective approach for increasing the predictive performance of ML, generating a stack of models from scratch can be a cumbersome trial-and-error process. This challenge stems from the enormous space of available solutions, with different sets of data instances and features that could be used for training, several algorithms to choose from, and instantiations of these algorithms using diverse parameters (i.e., models) that perform differently according to various metrics. In this work, we present a knowledge generation model, which supports ensemble learning with the use of visualization, and a visual analytics system for stacked generalization. Our system, StackGenVis, assists users in dynamically adapting performance metrics, managing data instances, selecting the most important features for a given data set, choosing a set of top-performing and diverse algorithms, and measuring the predictive performance. Consequently, our proposed tool helps users to decide between distinct models and to reduce the complexity of the resulting stack by removing overpromising and underperforming models. The applicability and effectiveness of StackGenVis are demonstrated with two use cases: a real-world healthcare data set and a collection of data related to sentiment/stance detection in texts. Finally, the tool has been evaluated through interviews with three ML experts.
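Stacked generalization as summarized here can be sketched in a few lines: base-model predictions form the feature vector consumed by a metamodel. The base models below are trivial hand-written rules standing in for trained learners, and a majority vote stands in for a trained metamodel; none of this is StackGenVis itself.

```python
# Hand-rolled sketch of stacked generalization: a layer of heterogeneous base
# models feeds its predictions to a metamodel that produces the final output.

base_models = [
    lambda x: 1 if x["income"] < 30 else 0,  # rule-of-thumb learner A
    lambda x: 1 if x["debt"] > 50 else 0,    # rule-of-thumb learner B
    lambda x: 1 if x["age"] < 25 else 0,     # rule-of-thumb learner C
]

def metamodel(base_preds):
    # A trained metamodel would weight base predictions; majority vote here.
    return 1 if sum(base_preds) >= 2 else 0

def stacked_predict(x):
    return metamodel([m(x) for m in base_models])

high_risk = stacked_predict({"income": 20, "debt": 60, "age": 30})
low_risk = stacked_predict({"income": 80, "debt": 10, "age": 40})
```

The trial-and-error the abstract mentions lives in choosing `base_models` and the metamodel; StackGenVis's contribution is making that search space navigable visually.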
Collapse
|
31
|
Li G, Wang J, Shen HW, Chen K, Shan G, Lu Z. CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1364-1373. [PMID: 33048744 DOI: 10.1109/tvcg.2020.3030461] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Convolutional neural networks (CNNs) have demonstrated extraordinarily good performance in many computer vision tasks. The increasing size of CNN models, however, prevents them from being widely deployed to devices with limited computational resources, e.g., mobile/embedded devices. The emerging topic of model pruning strives to address this problem by removing less important neurons and fine-tuning the pruned networks to minimize the accuracy loss. Nevertheless, existing automated pruning solutions often rely on a numerical threshold of the pruning criteria, lacking the flexibility to optimally balance the trade-off between efficiency and accuracy. Moreover, the complicated interplay between the stages of neuron pruning and model fine-tuning makes this process opaque and therefore difficult to optimize. In this paper, we address these challenges through a visual analytics approach, named CNNPruner. It considers the importance of convolutional filters through both instability and sensitivity, and allows users to interactively create pruning plans according to a desired goal on model size or accuracy. Also, CNNPruner integrates state-of-the-art filter visualization techniques to help users understand the roles that different filters played and refine their pruning plans. Through comprehensive case studies on CNNs with real-world sizes, we validate the effectiveness of CNNPruner.
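Threshold-based filter pruning, the kind of automated baseline the abstract contrasts CNNPruner against, can be sketched as follows. The L1 importance score, filter weights, and keep ratio are illustrative; the paper's own criteria (instability and sensitivity) are more involved.

```python
# Toy magnitude-based filter pruning: rank convolutional filters by L1 norm
# of their weights and keep only the strongest fraction. This is the kind of
# fixed numerical criterion that interactive tools aim to improve upon.

filters = {                       # filter name -> flattened weights (made up)
    "f0": [0.9, -0.8, 0.7],
    "f1": [0.01, 0.02, -0.01],
    "f2": [0.5, 0.4, -0.6],
    "f3": [0.03, -0.02, 0.04],
}

def l1(weights):
    return sum(abs(w) for w in weights)

def prune(filters, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of filters by L1 importance."""
    ranked = sorted(filters, key=lambda f: l1(filters[f]), reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_ratio))]
    return {f: filters[f] for f in kept}

pruned = prune(filters)
```

A pruned model would then be fine-tuned to recover accuracy; the opacity of that prune-then-fine-tune loop is exactly what the visual analytics approach targets.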
Collapse
|
32
|
Ma Y, Fan A, He J, Nelakurthi AR, Maciejewski R. A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1385-1395. [PMID: 33035164 DOI: 10.1109/tvcg.2020.3028888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
Collapse
|
33
|
Ebrahimi-Khusfi Z, Taghizadeh-Mehrjardi R, Nafarzadegan AR. Accuracy, uncertainty, and interpretability assessments of ANFIS models to predict dust concentration in semi-arid regions. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2021; 28:6796-6810. [PMID: 33011943 DOI: 10.1007/s11356-020-10957-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/03/2020] [Accepted: 09/20/2020] [Indexed: 06/11/2023]
Abstract
Accurate prediction of the dust concentration (DC) is necessary to reduce its undesirable environmental effects in different geographical areas. Although the adaptive neuro-fuzzy inference system (ANFIS) is a powerful model for predicting dust events, no attempt has been made to investigate its uncertainty and interpretability. In this study, therefore, the uncertainty of the ANFIS model was quantified using uncertainty estimation based on local errors and clustering methods. Furthermore, we used a model-agnostic interpretation to make the ANFIS model interpretable. In addition, we used the bat optimization algorithm (BAT) to increase the prediction accuracy of the ANFIS model. Seven explanatory variables were chosen for predicting DC in the cold and warm months across semi-arid regions of Iran. The results showed that the ANFIS+BAT model increased the correlation coefficient by 10% and 16% for predicting DC in the cold and warm months, respectively, compared with the ANFIS model. Furthermore, the uncertainty analysis indicated a lower prediction interval (i.e., lower uncertainty) for the ANFIS+BAT model compared with the ANFIS model for predicting DC in the cold and warm months. In addition, the model-agnostic interpretation tool findings indicated the highest contributions of air temperature and maximum wind speed for predicting DC in the cold and warm months, respectively. Prediction of DC using the proposed model will allow decision-makers to better plan for measures to mitigate the risks of wind erosion and air pollution.
Collapse
Affiliation(s)
- Zohre Ebrahimi-Khusfi
- Department of Natural Science, Faculty of Natural Resources, University of Jiroft, Jiroft, Iran.
| | - Ruhollah Taghizadeh-Mehrjardi
- Department of Geosciences, Soil Science and Geomorphology, University of Tübingen, Tubingen, Germany.
- Faculty of Agriculture and Natural Resources, Ardakan University, Ardakan, Iran.
| | - Ali Reza Nafarzadegan
- Department of Natural Resources Engineering, University of Hormozgan, Bandar-Abbas, Hormozgan, Iran
| |
Collapse
|
34
|
Wang Q, Alexander W, Pegg J, Qu H, Chen M. HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1417-1426. [PMID: 33048739 DOI: 10.1109/tvcg.2020.3030449] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this paper, we present a visual analytics tool for enabling hypothesis-based evaluation of machine learning (ML) models. We describe a novel ML-testing framework that combines the traditional statistical hypothesis testing (commonly used in empirical research) with logical reasoning about the conclusions of multiple hypotheses. The framework defines a controlled configuration for testing a number of hypotheses as to whether and how some extra information about a "concept" or "feature" may benefit or hinder an ML model. Because reasoning about multiple hypotheses is not always straightforward, we provide HypoML as a visual analysis tool, with which the multi-thread testing results are first transformed into analytical results using statistical and logical inferences, and then into a visual representation for rapid observation of the conclusions and the logical flow between the testing results and hypotheses. We have applied HypoML to a number of hypothesized concepts, demonstrating the intuitive and explainable nature of the visual analysis.
Collapse
|
35
|
Rahman P, Nandi A, Hebert C. Amplifying Domain Expertise in Clinical Data Pipelines. JMIR Med Inform 2020; 8:e19612. [PMID: 33151150 PMCID: PMC7677017 DOI: 10.2196/19612] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2020] [Revised: 07/07/2020] [Accepted: 07/22/2020] [Indexed: 11/28/2022] Open
Abstract
Digitization of health records has allowed the health care domain to adopt data-driven algorithms for decision support. There are multiple people involved in this process: a data engineer who processes and restructures the data, a data scientist who develops statistical models, and a domain expert who informs the design of the data pipeline and consumes its results for decision support. Although there are multiple data interaction tools for data scientists, few exist to allow domain experts to interact with data meaningfully. Designing systems for domain experts requires careful thought because they have different needs and characteristics from other end users. There should be an increased emphasis on the system to optimize the experts' interaction by directing them to high-impact data tasks and reducing the total task completion time. We refer to this optimization as amplifying domain expertise. Although there is active research in making machine learning models more explainable and usable, it focuses on the final outputs of the model. However, in the clinical domain, expert involvement is needed at every pipeline step: curation, cleaning, and analysis. To this end, we review literature from the database, human-computer interaction, and visualization communities to demonstrate the challenges and solutions at each of the data pipeline stages. Next, we present a taxonomy of expertise amplification, which can be applied when building systems for domain experts. This includes summarization, guidance, interaction, and acceleration. Finally, we demonstrate the use of our taxonomy with a case study.
Collapse
Affiliation(s)
| | - Arnab Nandi
- The Ohio State University, Columbus, OH, United States
| | | |
Collapse
|
36
|
Hazarika S, Li H, Wang KC, Shen HW, Chou CS. NNVA: Neural Network Assisted Visual Analysis of Yeast Cell Polarization Simulation. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:34-44. [PMID: 31425114 DOI: 10.1109/tvcg.2019.2934591] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Complex computational models are often designed to simulate real-world physical phenomena across many scientific disciplines. However, these simulation models tend to be computationally very expensive and involve a large number of simulation input parameters, which need to be analyzed and properly calibrated before the models can be applied in real scientific studies. We propose a visual analysis system to facilitate interactive exploratory analysis of the high-dimensional input parameter space of a complex yeast cell polarization simulation. The proposed system can assist the computational biologists who designed the simulation model in visually calibrating the input parameters: they modify parameter values and immediately visualize the predicted simulation outcome, without needing to run the original expensive simulation for every instance. Our visual analysis system is driven by a trained neural-network-based surrogate model as the backend analysis framework. In this work, we demonstrate the advantage of using neural networks as surrogate models for visual analysis by incorporating recent advances in uncertainty quantification, interpretability, and explainability of neural-network-based models. We utilize the trained network to perform interactive parameter sensitivity analysis of the original simulation and to recommend optimal parameter configurations using the activation maximization framework of neural networks. We also facilitate detailed analysis of the trained network to extract useful insights about the simulation model that the network learned during training. We performed two case studies and discovered multiple new parameter configurations that trigger high cell polarization in the original simulation model. We evaluated our results by comparing them with the original simulation outcomes as well as with findings from previous parameter analyses performed by our experts.
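In miniature, the surrogate-driven loop reads: spend a small budget of expensive simulation runs, fit a cheap surrogate, then optimize on the surrogate instead of the simulation. The sketch below assumes a one-parameter toy simulation and a quadratic least-squares surrogate in place of the paper's neural network; gradient ascent on the surrogate stands in for activation maximization.

```python
def expensive_simulation(p):
    # Stand-in for a costly simulation run: polarization peaks near p = 0.7.
    return 1.0 - (p - 0.7) ** 2

# 1. Spend a small budget of real simulation runs to build a training set.
samples = [(i / 20, expensive_simulation(i / 20)) for i in range(21)]

# 2. Fit a cheap surrogate y ~ a*p^2 + b*p + c by least squares
#    (a stand-in for the paper's neural-network surrogate).
def fit_quadratic(data):
    X = [[p * p, p, 1.0] for p, _ in data]
    y = [v for _, v in data]
    A = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * v for r, v in zip(X, y)) for i in range(3)]
    for i in range(3):                      # Gaussian elimination
        for j in range(i + 1, 3):
            f = A[j][i] / A[i][i]
            A[j] = [ajk - f * aik for ajk, aik in zip(A[j], A[i])]
            b[j] -= f * b[i]
    c = [0.0] * 3                           # back substitution
    for i in (2, 1, 0):
        c[i] = (b[i] - sum(A[i][k] * c[k] for k in range(i + 1, 3))) / A[i][i]
    return c

ca, cb, cc = fit_quadratic(samples)
surrogate = lambda p: ca * p * p + cb * p + cc

# 3. Gradient ascent on the *surrogate* (cf. activation maximization) to
#    recommend a parameter setting, never touching the expensive simulation.
p = 0.1
for _ in range(300):
    grad = (surrogate(p + 1e-4) - surrogate(p - 1e-4)) / 2e-4
    p += 0.05 * grad
```

Validating the recommended parameter against a real simulation run, as the paper does for its discovered configurations, would close the loop.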
Collapse
|
37
|
Spinner T, Schlegel U, Schafer H, El-Assady M. explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1064-1074. [PMID: 31442998 DOI: 10.1109/tvcg.2019.2934629] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
We propose a framework for interactive and explainable machine learning that enables users to (1) understand machine learning models; (2) diagnose model limitations using different explainable AI methods; and (3) refine and optimize the models. Our framework combines an iterative XAI pipeline with eight global monitoring and steering mechanisms, including quality monitoring, provenance tracking, model comparison, and trust building. To operationalize the framework, we present explAIner, a visual analytics system for interactive and explainable machine learning that instantiates all phases of the suggested pipeline within the commonly used TensorBoard environment. We performed a user study with nine participants across different expertise levels to examine their perception of our workflow and to collect suggestions for filling the gap between our system and framework. The evaluation confirms that our tightly integrated system leads to an informed machine learning process while disclosing opportunities for further extensions.
Collapse
|
38
|
Ma Y, Xie T, Li J, Maciejewski R. Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1075-1085. [PMID: 31478859 DOI: 10.1109/tvcg.2019.2934631] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Machine learning models are currently being deployed in a variety of real-world applications where model predictions are used to make decisions about healthcare, bank loans, and numerous other critical tasks. As the deployment of artificial intelligence technologies becomes ubiquitous, it is unsurprising that adversaries have begun developing methods to manipulate machine learning models to their advantage. While the visual analytics community has developed methods for opening the black box of machine learning models, little work has focused on helping the user understand their model vulnerabilities in the context of adversarial attacks. In this paper, we present a visual analytics framework for explaining and exploring model vulnerabilities to adversarial attacks. Our framework employs a multi-faceted visualization scheme designed to support the analysis of data poisoning attacks from the perspective of models, data instances, features, and local structures. We demonstrate our framework through two case studies on binary classifiers and illustrate model vulnerabilities with respect to varying attack strategies.
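A toy illustration of the data-poisoning attacks such a framework visualizes (the classifier, data, and attack below are assumptions for illustration, not the paper's setup): injecting a few mislabeled points drags a nearest-centroid decision boundary far enough to misclassify a clean test point.

```python
def centroid(points):
    return [sum(p[i] for p in points) / len(points) for i in range(2)]

def train(data):
    # data: list of ([x, y], label) pairs with labels 0/1
    return (centroid([x for x, lab in data if lab == 0]),
            centroid([x for x, lab in data if lab == 1]))

def predict(model, x):
    c0, c1 = model
    d = [sum((x[i] - c[i]) ** 2 for i in range(2)) for c in (c0, c1)]
    return 0 if d[0] <= d[1] else 1

clean_train = ([([i * 0.1, 0.0], 0) for i in range(10)] +       # class 0 near x = 0.45
               [([i * 0.1 + 3.0, 0.0], 1) for i in range(10)])  # class 1 near x = 3.45
test_set = [([0.5, 0.0], 0), ([2.5, 0.0], 1), ([3.5, 0.0], 1)]

def accuracy(model):
    return sum(predict(model, x) == lab for x, lab in test_set) / len(test_set)

# The attack: inject a few class-1-looking points mislabeled as class 0,
# dragging the class-0 centroid (and hence the boundary) to the right.
poison = [([4.0, 0.0], 0)] * 5

acc_clean = accuracy(train(clean_train))
acc_poisoned = accuracy(train(clean_train + poison))
```

A visual comparison of the two centroids before and after the injection would surface exactly this boundary shift, which is the kind of instance- and model-level view the framework provides.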
Collapse
|
39
|
Ahn Y, Lin YR. FairSight: Visual Analytics for Fairness in Decision Making. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1086-1095. [PMID: 31425083 DOI: 10.1109/tvcg.2019.2934262] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Data-driven decision making about individuals has become increasingly pervasive, but recent studies have raised concerns about its potential for discrimination. In response, researchers have proposed and implemented fairness measures and algorithms, but those efforts have not been translated into the real-world practice of data-driven decision making. As such, there is still an urgent need for a viable tool that facilitates fair decision making. We propose FairSight, a visual analytic system designed to address this need; it supports different notions of fairness in ranking decisions by identifying the required actions - understanding, measuring, diagnosing and mitigating biases - that together lead to fairer decision making. Through a case study and a user study, we demonstrate that the proposed visual analytic and diagnostic modules are effective in understanding the fairness-aware decision pipeline and obtaining fairer outcomes.
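One fairness notion such a system can measure and mitigate in rankings is demographic parity in the top-k. The sketch below uses hypothetical candidates and a deliberately naive greedy re-ranker, not FairSight's actual modules, to show the measure-then-mitigate loop:

```python
# Hypothetical applicants: (name, score, group).
candidates = [
    ("a", 0.95, "M"), ("b", 0.90, "M"), ("c", 0.88, "M"),
    ("d", 0.85, "F"), ("e", 0.80, "M"), ("f", 0.75, "F"),
]

def top_k_share(ranking, k, group):
    # Fraction of the top-k slots occupied by the given group.
    return sum(1 for _, _, g in ranking[:k] if g == group) / k

# Understand/measure: rank purely by score, then measure top-3 representation.
ranking = sorted(candidates, key=lambda c: -c[1])
overall_share = sum(1 for c in candidates if c[2] == "F") / len(candidates)
biased_share = top_k_share(ranking, 3, "F")  # no "F" candidate makes the top 3

# Mitigate: greedily promote the best-scoring candidate from the
# under-represented group into the top-k until it reaches the overall share.
def rerank(ranking, k, group, target):
    ranking = list(ranking)
    while top_k_share(ranking, k, group) < target:
        idx = next(i for i in range(k, len(ranking)) if ranking[i][2] == group)
        ranking.insert(k - 1, ranking.pop(idx))
    return ranking

fair = rerank(ranking, 3, "F", overall_share)
```

Real ranking-fairness measures (e.g., exposure-based ones) weight positions rather than a hard top-k cutoff, but the diagnose-and-mitigate structure is the same.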
Collapse
|
40
|
Wexler J, Pushkarna M, Bolukbasi T, Wattenberg M, Viegas F, Wilson J. The What-If Tool: Interactive Probing of Machine Learning Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:56-65. [PMID: 31442996 DOI: 10.1109/tvcg.2019.2934619] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
A key challenge in developing and deploying Machine Learning (ML) systems is understanding their performance across a wide range of inputs. To address this challenge, we created the What-If Tool, an open-source application that allows practitioners to probe, visualize, and analyze ML systems, with minimal coding. The What-If Tool lets practitioners test performance in hypothetical situations, analyze the importance of different data features, and visualize model behavior across multiple models and subsets of input data. It also lets practitioners measure systems according to multiple ML fairness metrics. We describe the design of the tool, and report on real-life usage at different organizations.
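The hypothetical-situation probing described above can be sketched without any tooling: edit one feature of a datapoint, re-run the model, and compare decisions. The loan model, feature names, and threshold below are illustrative assumptions, not the What-If Tool's API.

```python
def model(applicant):
    # Toy black-box loan model: approve when a weighted score clears 0.5.
    score = 0.6 * applicant["income"] + 0.4 * applicant["credit"]
    return "approve" if score >= 0.5 else "deny"

applicant = {"income": 0.4, "credit": 0.3}
baseline = model(applicant)  # score 0.24 + 0.12 = 0.36, so "deny"

def counterfactuals(applicant, feature, values):
    # Re-run the model on edited copies of the datapoint, one value at a time.
    out = {}
    for v in values:
        probe = dict(applicant, **{feature: v})
        out[v] = model(probe)
    return out

# Probe a hypothetical situation: how much credit history flips the decision?
cf = counterfactuals(applicant, "credit", [0.3, 0.6, 0.9])
```

Sweeping the same edit across a whole dataset slice, instead of one applicant, is what turns this probe into the tool's fairness-metric comparisons.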
Collapse
|
41
|
Gehrmann S, Strobelt H, Kruger R, Pfister H, Rush AM. Visual Interaction with Deep Learning Models through Collaborative Semantic Inference. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:884-894. [PMID: 31425116 DOI: 10.1109/tvcg.2019.2934595] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Automation of tasks can have critical consequences when humans lose agency over decision processes. Deep learning models are particularly susceptible since current black-box approaches lack explainable reasoning. We argue that both the visual interface and the model structure of deep learning systems need to take interaction design into account. We propose a framework of collaborative semantic inference (CSI) for the co-design of interactions and models to enable visual collaboration between humans and algorithms. The approach exposes the intermediate reasoning process of models, which allows semantic interactions with the visual metaphors of a problem, meaning that a user can both understand and control parts of the model's reasoning process. We demonstrate the feasibility of CSI with a co-designed case study of a document summarization system.
Collapse
|
42
|
Khayat M, Karimzadeh M, Ebert DS, Ghafoor A. The Validity, Generalizability and Feasibility of Summative Evaluation Methods in Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:353-363. [PMID: 31425085 DOI: 10.1109/tvcg.2019.2934264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Many evaluation methods have been used to assess the usefulness of Visual Analytics (VA) solutions. These methods stem from a variety of origins with different assumptions and goals, which cause confusion about their proofing capabilities. Moreover, the lack of discussion about the evaluation processes may limit our potential to develop new evaluation methods specialized for VA. In this paper, we present an analysis of evaluation methods that have been used to summatively evaluate VA solutions. We provide a survey and taxonomy of the evaluation methods that have appeared in the VAST literature in the past two years. We then analyze these methods in terms of validity and generalizability of their findings, as well as the feasibility of using them. We propose a new metric called summative quality to compare evaluation methods according to their ability to prove usefulness, and make recommendations for selecting evaluation methods based on their summative quality in the VA domain.
Collapse
|
43
|
Das S, Cashman D, Chang R, Endert A. BEAMES: Interactive Multimodel Steering, Selection, and Inspection for Regression Tasks. IEEE COMPUTER GRAPHICS AND APPLICATIONS 2019; 39:20-32. [PMID: 31199255 DOI: 10.1109/mcg.2019.2922592] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Interactive model steering helps people incrementally build machine learning models that are tailored to their domain and task. Existing visual analytic tools allow people to steer a single model (e.g., assigning attribute weights used by a dimension reduction model). However, the choice of model is critical in such situations. What if the chosen model is suboptimal for the task, dataset, or question being asked? What if, instead of parameterizing and steering this model, a different model provides a better fit? This paper presents a technique that allows users to inspect and steer multiple machine learning models. The technique steers and samples models from a broader set of learning algorithms and model types. We incorporate this technique into a visual analytic prototype, BEAMES, that allows users to perform regression tasks via multimodel steering. This paper demonstrates the effectiveness of BEAMES via a use case and discusses broader implications for multimodel steering.
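The multimodel idea can be sketched in a few lines: fit candidate models from different families, score each on held-out data, and present the ranking for the user to inspect and select. The model families and toy data below are assumptions for illustration, not BEAMES itself.

```python
# Toy dataset: a noisy line for fitting, a clean continuation for validation.
train_data = [(x, 2.0 * x + 1.0 + (0.1 if x % 2 else -0.1)) for x in range(10)]
held_out = [(x, 2.0 * x + 1.0) for x in range(10, 15)]

def fit_mean(data):
    # Model family 1: constant predictor.
    m = sum(y for _, y in data) / len(data)
    return lambda x: m

def fit_linear(data):
    # Model family 2: ordinary least squares in closed form.
    n = len(data)
    sx = sum(x for x, _ in data); sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data); sxy = sum(x * y for x, y in data)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return lambda x, a=slope, b=(sy - slope * sx) / n: a * x + b

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Sample one model per family, then rank the pool by held-out error so the
# user can inspect, compare, and select rather than steer a single model.
pool = {"mean": fit_mean(train_data), "linear": fit_linear(train_data)}
ranking = sorted(pool, key=lambda name: mse(pool[name], held_out))
```

In the full system the pool would span many algorithms and hyperparameter samples, and user feedback on instances would re-weight the scoring; the ranking-for-inspection step shown here is the core of multimodel steering.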
Collapse
|
44
|
Abstract
Machine learning systems are becoming increasingly ubiquitous. These systems' adoption has been expanding, accelerating the shift towards a more algorithmic society, meaning that algorithmically informed decisions have greater potential for significant social impact. However, most of these accurate decision support systems remain complex black boxes: their internal logic and inner workings are hidden from the user, and even experts cannot fully understand the rationale behind their predictions. Moreover, new regulations and highly regulated domains have made the auditability and verifiability of decisions mandatory, increasing the demand for the ability to question, understand, and trust machine learning systems, for which interpretability is indispensable. The research community has recognized this interpretability problem and has focused on developing both interpretable models and explanation methods over the past few years. However, the emergence of these methods shows there is no consensus on how to assess explanation quality. What are the most suitable metrics for assessing the quality of an explanation? The aim of this article is to review the current state of research on machine learning interpretability, focusing on its societal impact and on the developed methods and metrics. Furthermore, a complete literature review is presented in order to identify future directions of work in this field.
Collapse
|
45
|
Wang J, Gou L, Zhang W, Yang H, Shen HW. DeepVID: Deep Visual Interpretation and Diagnosis for Image Classifiers via Knowledge Distillation. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2168-2180. [PMID: 30892211 DOI: 10.1109/tvcg.2019.2903943] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Deep Neural Networks (DNNs) have been extensively used in multiple disciplines due to their superior performance. However, in most cases, DNNs are treated as black boxes, and interpreting their internal working mechanism is usually challenging. Given that model trust is often built on an understanding of how a model works, the interpretation of DNNs becomes all the more important, especially in safety-critical applications (e.g., medical diagnosis, autonomous driving). In this paper, we propose DeepVID, a Deep learning approach to Visually Interpret and Diagnose DNN models, especially image classifiers. In detail, we train a small locally faithful model to mimic the behavior of an original cumbersome DNN around a particular data instance of interest, and the local model is sufficiently simple that it can be visually interpreted (e.g., a linear model). Knowledge distillation is used to transfer knowledge from the cumbersome DNN to the small model, and a deep generative model (i.e., a variational autoencoder) is used to generate neighbors around the instance of interest. These neighbors, which come with small feature variances and semantic meaning, can effectively probe the DNN's behavior around the instance of interest and help the small model learn that behavior. Through comprehensive evaluations, as well as case studies conducted together with deep learning experts, we validate the effectiveness of DeepVID.
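DeepVID's core recipe, generate neighbors around one instance, query the black box on them, and fit a simple surrogate to the responses, can be sketched in one dimension. The sketch below substitutes plain Gaussian perturbation for the paper's variational autoencoder and weighted least squares for distillation-based training; the black-box function is a toy assumption.

```python
import math
import random

def black_box(x):
    # Stand-in for a cumbersome DNN: nonlinear globally, smooth locally.
    return math.tanh(3.0 * (x - 1.0))

x0 = 1.0
random.seed(0)
# Generate neighbors of the instance of interest (Gaussian perturbation here;
# DeepVID uses a generative model for semantically meaningful neighbors).
neighbors = [x0 + random.gauss(0.0, 0.05) for _ in range(200)]

def local_linear(f, x0, xs, bandwidth=0.05):
    # Fit y ~ w*(x - x0) + b by weighted least squares, weighting neighbors
    # by proximity to x0, so the simple surrogate captures local behavior.
    ws = [math.exp(-((x - x0) / bandwidth) ** 2) for x in xs]
    ys = [f(x) for x in xs]
    sw = sum(ws)
    mx = sum(w * (x - x0) for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    cov = sum(w * (x - x0 - mx) * (y - my) for w, x, y in zip(ws, xs, ys)) / sw
    var = sum(w * (x - x0 - mx) ** 2 for w, x in zip(ws, xs)) / sw
    slope = cov / var
    return slope, my - slope * mx

slope, intercept = local_linear(black_box, x0, neighbors)
# The surrogate slope approximates the black box's local derivative at x0
# (d/dx tanh(3(x - 1)) = 3 there) and, unlike the DNN, is directly readable.
```

In the image setting the regressors are pixel or feature perturbations rather than a scalar, but the interpretation step is the same: inspect the surrogate's coefficients.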
Collapse
|