1
Dhanoa V, Walchshofer C, Hinterreiter A, Groller E, Streit M. Fuzzy Spreadsheet: Understanding and Exploring Uncertainties in Tabular Calculations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1463-1477. [PMID: 34633930 DOI: 10.1109/tvcg.2021.3119212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Spreadsheet-based tools provide a simple yet effective way of calculating values, which makes them the number-one choice for building and formalizing simple models for budget planning and many other applications. A cell in a spreadsheet holds one specific value and gives a discrete, overprecise view of the underlying model. Therefore, spreadsheets are of limited use when investigating the inherent uncertainties of such models and answering what-if questions. Existing extensions typically require a complex modeling process that cannot easily be embedded in a tabular layout. In Fuzzy Spreadsheet, a cell can hold and display a distribution of values. This integrated uncertainty-handling immediately conveys sensitivity and robustness information. The fuzzification of the cells enables calculations not only with precise values but also with distributions and probabilities. We conservatively added and carefully crafted visuals to maintain the look and feel of a traditional spreadsheet while facilitating what-if analyses. Given a user-specified reference cell, Fuzzy Spreadsheet automatically extracts and visualizes contextually relevant information, such as impact, uncertainty, and degree of neighborhood, for the selected and related cells. To evaluate its usability and the perceived mental effort required, we conducted a user study. The results show that our approach outperforms traditional spreadsheets in terms of answer correctness, response time, and perceived mental effort in almost all tasks tested.
2
Yu Y, Kruyff D, Jiao J, Becker T, Behrisch M. PSEUDo: Interactive Pattern Search in Multivariate Time Series with Locality-Sensitive Hashing and Relevance Feedback. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:33-42. [PMID: 36170404 DOI: 10.1109/tvcg.2022.3209431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
We present PSEUDo, a visual pattern retrieval tool for multivariate time series. It aims to overcome the uneconomic (re-)training problem accompanying deep learning-based methods. Very high-dimensional time series emerge on an unprecedented scale due to increasing sensor usage and data storage. Visual pattern search is one of the most frequent tasks on time series. Automatic pattern retrieval methods often suffer from inefficient training data, a lack of ground truth labels, and a discrepancy between the similarity perceived by the algorithm and required by the user or the task. Our proposal is based on the query-aware locality-sensitive hashing technique to create a representation of multivariate time series windows. It features sub-linear training and inference time with respect to data dimensions. This performance gain allows an instantaneous relevance-feedback-driven adaption to converge to users' similarity notion. We demonstrate PSEUDo's performance in terms of accuracy, speed, steerability, and usability through quantitative benchmarks with representative time series retrieval methods and a case study. We find that PSEUDo detects patterns in high-dimensional time series efficiently, improves the result with relevance feedback through feature selection, and allows an understandable as well as user-friendly retrieval process.
3
Quadri GJ, Rosen P. A Survey of Perception-Based Visualization Studies by Task. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:5026-5048. [PMID: 34283717 DOI: 10.1109/tvcg.2021.3098240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Knowledge of human perception has long been incorporated into visualizations to enhance their quality and effectiveness. The last decade, in particular, has shown an increase in perception-based visualization research studies. With all of this recent progress, the visualization community lacks a comprehensive guide to contextualize their results. In this report, we provide a systematic and comprehensive review of research studies on perception related to visualization. This survey reviews perception-focused visualization studies since 1980 and summarizes their research developments focusing on low-level tasks, further breaking techniques down by visual encoding and visualization type. In particular, we focus on how perception is used to evaluate the effectiveness of visualizations, to help readers understand and apply the principles of perception to their visualization designs through a task-optimized approach. We conclude our report with a summary of the weaknesses and open research questions in the area.
4
Representation and analysis of time-series data via deep embedding and visual exploration. J Vis (Tokyo) 2022. [DOI: 10.1007/s12650-022-00890-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
5
Real-Time Water Level Prediction in Open Channel Water Transfer Projects Based on Time Series Similarity. WATER 2022. [DOI: 10.3390/w14132070] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/10/2022]
Abstract
Changes in the opening of gates in open channel water transfer projects will cause fluctuations in the water level and flow of adjacent open channels and thus bring great challenges for real-time water level prediction. In this paper, a novel slope-similar shape method is proposed for real-time water level prediction when the change of gate opening at the next moment is known. The water level data points of three consecutive moments constitute the query. The slope similarity is used to find the historical water level datasets with similar change trend to the query, and then the best slope similarity dataset is determined according to the similarity index and the gate opening change. The water level difference of the next moment of the best similar data point is the water level difference of the predicted moment, and thus the water level at the next moment can be obtained. A case study is performed with the Middle Route of the South-to-North Water Diversion Project of China. The results show that 87.5% of datasets with a water level variation of less than 0.06 m have an error less than 0.03 m, 71.4% of which have an error less than 0.02 m. In conclusion, the proposed method is feasible, effective, and interpretable, and the study provides valuable insights into the development of scheduling schemes.
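The matching step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the similarity index, so the sum of absolute slope differences used here is an assumption, the gate-opening criterion is omitted entirely, and the function name `predict_next_level` is hypothetical.

```python
def predict_next_level(history, query):
    """Predict the next water level from the most similar historical window.

    history: past water levels (m), equally spaced in time.
    query:   the three most recent water levels (m).
    Assumption: similarity is the sum of absolute differences between the
    two slopes of the query and of each three-point historical window.
    """
    q_slopes = (query[1] - query[0], query[2] - query[1])
    best_index, best_distance = None, float("inf")
    # Slide a three-point window over the history, keeping one point after it.
    for i in range(len(history) - 3):
        w = history[i:i + 3]
        w_slopes = (w[1] - w[0], w[2] - w[1])
        distance = abs(w_slopes[0] - q_slopes[0]) + abs(w_slopes[1] - q_slopes[1])
        if distance < best_distance:
            best_index, best_distance = i, distance
    if best_index is None:
        raise ValueError("history must contain at least four levels")
    # The level change that followed the best match is applied to the query.
    next_difference = history[best_index + 3] - history[best_index + 2]
    return query[2] + next_difference


history = [1.00, 1.02, 1.05, 1.04, 1.00, 0.98, 0.99]
prediction = predict_next_level(history, [2.00, 2.02, 2.05])
```

In this toy data the first historical window has the same slopes as the query, so the prediction is the last query level plus the change that followed that window.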
6
Hinterreiter A, Ruch P, Stitz H, Ennemoser M, Bernard J, Strobelt H, Streit M. ConfusionFlow: A Model-Agnostic Visualization for Temporal Analysis of Classifier Confusion. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1222-1236. [PMID: 32746284 DOI: 10.1109/tvcg.2020.3012063] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifiers' performances, evaluate their learning behavior over time, and compare different models. Typically, this analysis is based on single-number performance measures such as accuracy. A more detailed evaluation of classifiers is possible by inspecting class errors. The confusion matrix is an established way for visualizing these class errors, but it was not designed with temporal or comparative analysis in mind. More generally, established performance analysis systems do not allow a combined temporal and comparative analysis of class-level information. To address this issue, we propose ConfusionFlow, an interactive, comparative visualization tool that combines the benefits of class confusion matrices with the visualization of performance characteristics over time. ConfusionFlow is model-agnostic and can be used to compare performances for different model types, model architectures, and/or training and test datasets. We demonstrate the usefulness of ConfusionFlow in a case study on instance selection strategies in active learning. We further assess the scalability of ConfusionFlow and present a use case in the context of neural network pruning.
7
Ao C, Jiao S, Wang Y, Yu L, Zou Q. Biological Sequence Classification: A Review on Data and General Methods. RESEARCH 2022. [DOI: 10.34133/research.0011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
With the rapid development of biotechnology, the number of biological sequences has grown exponentially. The continuous expansion of biological sequence data promotes the application of machine learning in biological sequences to construct predictive models for mining biological sequence information. There are many branches of biological sequence classification research. In this review, we mainly focus on the function and modification classification of biological sequences based on machine learning. Sequence-based prediction and analysis are the basic tasks to understand the biological functions of DNA, RNA, proteins, and peptides. However, there are hundreds of classification models developed for biological sequences, and the quite varied specific methods seem dizzying at first glance. Here, we aim to establish a long-term support website (http://lab.malab.cn/~acy/BioseqData/home.html), which provides readers with detailed information on the classification method and download links to relevant datasets. We briefly introduce the steps to build an effective model framework for biological sequence data. In addition, a brief introduction to single-cell sequencing data analysis methods and applications in biology is also included. Finally, we discuss the current challenges and future perspectives of biological sequence classification research.
Affiliation(s)
- Chunyan Ao
  - School of Computer Science and Technology, Xidian University, Xi’an, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Shihu Jiao
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Yansu Wang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi’an, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
8
Hinterreiter A, Steinparz C, Schöfl M, Stitz H, Streit M. Projection Path Explorer: Exploring Visual Patterns in Projected Decision-making Paths. ACM T INTERACT INTEL 2021. [DOI: 10.1145/3387165] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Abstract
In problem-solving, a path towards a solution can be viewed as a sequence of decisions. The decisions, made by humans or computers, describe a trajectory through a high-dimensional representation space of the problem. By means of dimensionality reduction, these trajectories can be visualized in lower-dimensional space. Such embedded trajectories have previously been applied to a wide variety of data, but analysis has focused almost exclusively on the self-similarity of single trajectories. In contrast, we describe patterns emerging from drawing many trajectories (for different initial conditions, end states, and solution strategies) in the same embedding space. We argue that general statements about the problem-solving tasks and solving strategies can be made by interpreting these patterns. We explore and characterize such patterns in trajectories resulting from human and machine-made decisions in a variety of application domains: logic puzzles (Rubik’s cube), strategy games (chess), and optimization problems (neural network training). We also discuss the importance of suitably chosen representation spaces and similarity metrics for the embedding.
Affiliation(s)
- Andreas Hinterreiter
  - Johannes Kepler University Linz, Austria and Imperial College London, London, UK
- Holger Stitz
  - Johannes Kepler University Linz, Austria and datavisyn GmbH, Austria
- Marc Streit
  - Johannes Kepler University Linz, Linz, Austria
9
Dong X, Gao Y, Dong J, Chantler MJ. The Importance of Phase to Texture Discrimination and Similarity. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:3755-3768. [PMID: 32191889 DOI: 10.1109/tvcg.2020.2981063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
In this article, we investigate the importance of phase for texture discrimination and similarity estimation tasks. We first use two psychophysical experiments to investigate the relative importance of phase and magnitude spectra for human texture discrimination and similarity estimation. The results show that phase is more important to humans for both tasks. We further examine the ability of 51 computational feature sets to perform these two tasks. In contrast with the psychophysical experiments, it is observed that the magnitude data is more important to these computational feature sets than the phase data. We hypothesise that this inconsistency is due to the difference between the abilities of humans and the computational feature sets to utilise phase data. This motivates us to investigate the application of the 51 feature sets to phase-only images in addition to their use on the original data set. This investigation is extended to exploit Convolutional Neural Network (CNN) features. The results show that our feature fusion scheme improves the average performance of those feature sets for estimating humans' perceptual texture similarity. The superior performance should be attributed to the importance of phase to texture similarity.
10
11
Rosen P, Quadri GJ. LineSmooth: An Analytical Framework for Evaluating the Effectiveness of Smoothing Techniques on Line Charts. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1536-1546. [PMID: 33048725 DOI: 10.1109/tvcg.2020.3030421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
We present a comprehensive framework for evaluating line chart smoothing methods under a variety of visual analytics tasks. Line charts are commonly used to visualize a series of data samples. When the number of samples is large, or the data are noisy, smoothing can be applied to make the signal more apparent. However, there are a wide variety of smoothing techniques available, and the effectiveness of each depends upon both the nature of the data and the visual analytics task at hand. To date, the visualization community lacks a summary work for analyzing and classifying the various smoothing methods available. In this paper, we establish a framework based on 8 measures of line smoothing effectiveness tied to 8 low-level visual analytics tasks. We then analyze 12 methods coming from 4 commonly used classes of line chart smoothing: rank filters, convolutional filters, frequency domain filters, and subsampling. The results show that while no method is ideal for all situations, certain methods, such as Gaussian filters and TOPOLOGY-based subsampling, perform well in general. Other methods, such as low-pass CUTOFF filters and Douglas-Peucker subsampling, perform well for specific visual analytics tasks. Almost as importantly, our framework demonstrates that several methods, including the commonly used UNIFORM subsampling, produce low-quality results and should, therefore, be avoided if possible.
12
The effects of baseline length in Computed Tomography perfusion of liver. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2020.102135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
13
Ma X, Si Y, Wang Z, Wang Y. Length of stay prediction for ICU patients using individualized single classification algorithm. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 186:105224. [PMID: 31765937 DOI: 10.1016/j.cmpb.2019.105224] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/25/2019] [Revised: 10/31/2019] [Accepted: 11/15/2019] [Indexed: 05/24/2023]
Abstract
BACKGROUND AND OBJECTIVE: In intensive care units (ICUs), length of stay (LOS) prediction is critical to help doctors and nurses select appropriate treatment options and predict patients' condition. Most hospitals use universal models to predict patients' condition, which cannot meet the individual needs of special ICU patients. Our goal is to create a personalized model that predicts each patient's length of hospital stay. METHODS: In this study, a new combination of just-in-time learning (JITL) and one-class extreme learning machine (one-class ELM) is proposed to predict the number of days a patient stays in hospital. This combination is abbreviated as one-class JITL-ELM, where JITL is used to search for personalized cases for a new patient and one-class ELM is used to determine whether the patient can be discharged within 10 days. RESULTS: The experimental results show that the one-class JITL-ELM model has an area under the curve (AUC) index of 0.8510, a lift value of 2.1390, a precision of 1, and a G-mean of 0.7842. Its accuracy, specificity, and sensitivity were found to be 0.82, 1, and 0.6150, respectively. Moreover, a novel simple mortality risk level estimation system that can determine the mortality rate of a patient by combining LOS and age is proposed. It has an accuracy rate of 66% and a miss rate of only 6.25%. CONCLUSIONS: Overall, the one-class JITL-ELM can accurately predict hospitalization days and mortality using early physiological parameters. Moreover, a simple mortality risk level estimation system based on a combination of LOS and age is proposed; the system is simple, highly interpretable, and has strong application value.
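The reported G-mean can be cross-checked against the reported sensitivity and specificity. The abstract does not define G-mean, so this small sketch assumes the common definition, the geometric mean of sensitivity and specificity; the helper name `g_mean` is hypothetical.

```python
import math


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity * specificity)


# Values taken from the abstract: sensitivity 0.6150, specificity 1.
reported = round(g_mean(0.6150, 1.0), 4)
print(reported)  # 0.7842
```

Under that assumed definition, the result agrees with the G-mean of 0.7842 stated in the abstract.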
Affiliation(s)
- Xin Ma
  - Beijing University of Chemical Technology, China
- Yabin Si
  - Beijing University of Chemical Technology, China
- Zifan Wang
  - Beijing University of Chemical Technology, China
- Youqing Wang
  - Beijing University of Chemical Technology, China; Shandong University of Science and Technology, China.
14
Waldner M, Diehl A, Gracanin D, Splechtna R, Delrieux C, Matkovic K. A Comparison of Radial and Linear Charts for Visualizing Daily Patterns. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1033-1042. [PMID: 31443015 DOI: 10.1109/tvcg.2019.2934784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Radial charts are generally considered less effective than linear charts. Perhaps the only exception is in visualizing periodical time-dependent data, which is believed to be naturally supported by the radial layout. It has been demonstrated that the drawbacks of radial charts outweigh the benefits of this natural mapping. Visualization of daily patterns, as a special case, has not been systematically evaluated using radial charts. In contrast to yearly or weekly recurrent trends, the analysis of daily patterns on a radial chart may benefit from our trained skill in reading radial clocks that are ubiquitous in our culture. In a crowd-sourced experiment with 92 non-expert users, we evaluated the accuracy, efficiency, and subjective ratings of radial and linear charts for visualizing daily traffic accident patterns. We systematically compared juxtaposed 12-hour variants and single 24-hour variants for both layouts in four low-level tasks and one high-level interpretation task. Our results show that over all tasks, the most elementary 24-hour linear bar chart is most accurate and efficient and is also preferred by the users. This provides strong evidence for the use of linear layouts - even for visualizing periodical daily patterns.
15
Zhao Y, Luo X, Lin X, Wang H, Kui X, Zhou F, Wang J, Chen Y, Chen W. Visual Analytics for Electromagnetic Situation Awareness in Radio Monitoring and Management. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:590-600. [PMID: 31443001 DOI: 10.1109/tvcg.2019.2934655] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 06/10/2023]
Abstract
Traditional radio monitoring and management largely depend on radio spectrum data analysis, which requires considerable domain experience and heavy cognition effort and frequently results in incorrect signal judgment and incomprehensive situation awareness. Faced with increasingly complicated electromagnetic environments, radio supervisors urgently need additional data sources and advanced analytical technologies to enhance their situation awareness ability. This paper introduces a visual analytics approach for electromagnetic situation awareness. Guided by a detailed scenario and requirement analysis, we first propose a signal clustering method to process radio signal data and a situation assessment model to obtain qualitative and quantitative descriptions of the electromagnetic situations. We then design a two-module interface with a set of visualization views and interactions to help radio supervisors perceive and understand the electromagnetic situations by a joint analysis of radio signal data and radio spectrum data. Evaluations on real-world data sets and an interview with actual users demonstrate the effectiveness of our prototype system. Finally, we discuss the limitations of the proposed approach and provide future work directions.
16
Abstract
The indoor climate is closely related to human health, well-being, and comfort. Thus, an understanding of the indoor climate is vital. One way to improve the indoor climates is to place an aesthetically pleasing active plant wall in the environment. By collecting data using sensors placed in and around the plant wall both the indoor climate and the status of the plant wall can be monitored and analyzed. This manuscript presents a user study with domain experts in this field with a focus on the representation of such data. The experts explored this data with a Line graph, a Horizon graph, and a Stacked area graph to better understand the status of the active plant wall and the indoor climate. Qualitative measures were collected with Think-aloud protocol and semi-structured interviews. The study resulted in four categories of analysis tasks: Overview, Detail, Perception, and Complexity. The Line graph was found to be preferred for use in providing an overview, and the Horizon graph for detailed analysis, revealing patterns and showing discernible trends, while the Stacked area graph was generally not preferred. Based on these findings, directions for future research are discussed and formulated. The results and future directions of this research can facilitate the analysis of multivariate temporal data, both for domain users and visualization researchers.