Uncertainty Quantification in Brain Stimulation using UncertainSCI|
J. Tate, S. Rampersad, C. Charlebois, Z. Liu, J. Bergquist, D. White, L. Rupp, D. Brooks, A. Narayan, R. MacLeod. In Brain Stimulation: Basic, Translational, and Clinical Research in Neuromodulation, Vol. 14, No. 6, Elsevier, pp. 1659-1660. 2021.
Predicting the effects of brain stimulation with computer models presents many challenges, including estimating the possible error from the propagation of uncertain input parameters through the model. Quantification and control of these errors through uncertainty quantification (UQ) provide statistics on the likely impact of parameter variation on solution accuracy, including total variance and the sensitivity associated with each parameter. While the need for and importance of UQ in clinical modeling are generally accepted, tools for implementing UQ techniques remain limited or inaccessible for many researchers.
Direct Volume Rendering with Nonparametric Models of Uncertainty|
T. M. Athawale, B. Ma, E. Sakhaee, C. R. Johnson, A. Entezari. In IEEE Transactions on Visualization and Computer Graphics, Vol. 27, No. 2, pp. 1797-1807. 2021.
We present a nonparametric statistical framework for the quantification, analysis, and propagation of data uncertainty in direct volume rendering (DVR). The state-of-the-art statistical DVR framework allows for preserving the transfer function (TF) of the ground truth function when visualizing uncertain data; however, the existing framework is restricted to parametric models of uncertainty. In this paper, we address the limitations of the existing DVR framework by extending the DVR framework for nonparametric distributions. We exploit the quantile interpolation technique to derive probability distributions representing uncertainty in viewing-ray sample intensities in closed form, which allows for accurate and efficient computation. We evaluate our proposed nonparametric statistical models through qualitative and quantitative comparisons with the mean-field and parametric statistical models, such as uniform and Gaussian, as well as Gaussian mixtures. In addition, we present an extension of the state-of-the-art rendering parametric framework to 2D TFs for improved DVR classifications. We show the applicability of our uncertainty quantification framework to ensemble, downsampled, and bivariate versions of scalar field datasets.
Uncertainty Visualization of the Marching Squares and Marching Cubes Topology Cases|
Subtitled arXiv:2108.03066, T. M. Athawale, S. Sane, C. R. Johnson. 2021.
Marching squares (MS) and marching cubes (MC) are widely used algorithms for level-set visualization of scientific data. In this paper, we address the challenge of uncertainty visualization of the topology cases of the MS and MC algorithms for uncertain scalar field data sampled on a uniform grid. The visualization of the MS and MC topology cases for uncertain data is challenging due to their exponential nature and the possibility of multiple topology cases per cell of a grid. We propose the topology case count and entropy-based techniques for quantifying uncertainty in the topology cases of the MS and MC algorithms when noise in data is modeled with probability distributions. We demonstrate the applicability of our techniques for independent and correlated uncertainty assumptions. We visualize the quantified topological uncertainty via color mapping proportional to uncertainty, as well as with interactive probability queries in the MS case and entropy isosurfaces in the MC case. We demonstrate the utility of our uncertainty quantification framework in identifying the isovalues exhibiting relatively high topological uncertainty. We illustrate the effectiveness of our techniques via results on synthetic, simulation, and hixel datasets.
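The entropy-based idea in the abstract above can be illustrated for a single marching-squares cell. The following is a minimal sketch, not the authors' implementation: it assumes independent Gaussian noise at the four cell corners (one of the two assumptions the abstract mentions), treats each of the 16 corner sign configurations as a "case", and uses hypothetical function names and an arbitrary probability threshold `eps` for the case count.

```python
import math
from itertools import product


def prob_above(mean, std, isovalue):
    """P(X > isovalue) for X ~ Normal(mean, std**2)."""
    z = (isovalue - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def case_probabilities(means, stds, isovalue):
    """Probability of each of the 16 corner sign configurations of one cell,
    assuming the four corner values are independent Gaussians."""
    p = [prob_above(m, s, isovalue) for m, s in zip(means, stds)]
    probs = {}
    for bits in product((0, 1), repeat=4):
        q = 1.0
        for b, p_i in zip(bits, p):
            q *= p_i if b else (1.0 - p_i)
        probs[bits] = q
    return probs


def case_entropy(probs):
    """Shannon entropy (in bits) of the case distribution."""
    return -sum(q * math.log2(q) for q in probs.values() if q > 0.0)


def case_count(probs, eps=1e-6):
    """Number of cases whose probability exceeds the (assumed) threshold eps."""
    return sum(1 for q in probs.values() if q > eps)


# A cell whose corner means all sit on the isovalue is maximally uncertain.
uncertain = case_probabilities([0.0] * 4, [1.0] * 4, isovalue=0.0)
# A cell far above the isovalue has essentially one possible case.
certain = case_probabilities([10.0] * 4, [0.1] * 4, isovalue=0.0)
```

Color-mapping `case_entropy` or `case_count` per cell would give a per-cell picture of topological uncertainty in the spirit of the abstract; correlated noise models would need the joint distribution instead of the product used here.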
Data-Driven Estimation of Temporal-Sampling Errors in Unsteady Flows|
H. Bhatia, S. N. Petruzza, R. Anirudh, A. G. Gyulassy, R. M. Kirby, V. Pascucci, P. T. Bremer. 2021.
While computer simulations typically store data at the highest available spatial resolution, it is often infeasible to do so for the temporal dimension. Instead, the common practice is to store data at regular intervals, the frequency of which is strictly limited by the available storage and I/O bandwidth. However, this manner of temporal subsampling can cause significant errors in subsequent analysis steps. More importantly, since the intermediate data is lost, there is no direct way of measuring this error after the fact. One particularly important use case that is affected is the analysis of unsteady flows using pathlines, as it depends on an accurate interpolation across time. Although the potential problem with temporal undersampling is widely acknowledged, there currently does not exist a practical way to estimate the potential impact. This paper presents a simple-to-implement yet powerful technique to estimate the error in pathlines due to temporal subsampling. Given an unsteady flow, we compute pathlines at the given temporal resolution as well as subsamples thereof. We then compute the error induced due to various levels of subsampling and use it to estimate the error between the given resolution and the unknown ground truth. Using two turbulent flows, we demonstrate that our approach, for the first time, provides an accurate, a posteriori error estimate for pathline computations. This estimate will enable scientists to better understand the uncertainties involved in pathline-based analysis techniques and can lead to new uncertainty visualization approaches using the predicted errors.
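The measurement step described above (tracing pathlines at the stored temporal resolution and at coarser subsamples, then comparing them) can be sketched as follows. This is an illustrative toy under strong assumptions, not the paper's method: the velocity field is a hypothetical, spatially uniform rotating wind, time interpolation is linear, and the step that extrapolates to the unknown ground truth is not reproduced.

```python
import math


def velocity(t, omega=1.0):
    """Hypothetical unsteady velocity field, uniform in space (assumption)."""
    return (math.cos(omega * t), math.sin(omega * t))


def stored_snapshots(t_end, dt):
    """Velocity snapshots saved at regular intervals dt, as a simulation would."""
    n = round(t_end / dt)
    return [velocity(i * dt) for i in range(n + 1)]


def trace_pathline(snapshots, dt, stride, start=(0.0, 0.0)):
    """Pathline endpoint using only every `stride`-th snapshot, with linear
    interpolation in time between the snapshots that are kept."""
    kept = snapshots[::stride]
    h = dt * stride
    x, y = start
    for (u0, v0), (u1, v1) in zip(kept, kept[1:]):
        # Exact integral of the linearly interpolated velocity over one interval.
        x += 0.5 * h * (u0 + u1)
        y += 0.5 * h * (v0 + v1)
    return (x, y)


def subsampling_errors(snapshots, dt, strides):
    """Endpoint distance between each subsampled pathline and the one traced
    at the full stored resolution."""
    reference = trace_pathline(snapshots, dt, 1)
    return {s: math.dist(reference, trace_pathline(snapshots, dt, s))
            for s in strides}


snapshots = stored_snapshots(t_end=3.2, dt=0.05)
errors = subsampling_errors(snapshots, 0.05, strides=(1, 2, 4, 8))
```

In this toy setting the error grows with the stride, which is the trend one would then use to reason about the error at the stored resolution itself.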
Reusing Interactive Analysis Workflows|
Subtitled OSF Preprints, K. Gadhave, Z.T. Cutler, A. Lex. 2021.
Interactive visual analysis has many advantages, but has the disadvantage that analysis processes and workflows cannot be easily stored and reused, which is in contrast to scripted analysis workflows using a programming language such as Python. In this paper, we introduce methods to semantically capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. We design these workflows to be robust to updates in the dataset by capturing the semantics of underlying interactions, and, hence, they can be applied to updated datasets. We demonstrate this specification using a prototype that visualizes the data, shows interaction provenance, and allows generating workflows from this provenance. Finally, we introduce a Python library that can consume the workflow and apply it to the datasets, providing a seamless bridge between computational workflows and interactive visualization tools. We demonstrate our techniques using our UI prototype and Jupyter notebooks.
Towards replacing physical testing of granular materials with a Topology-based Model|
Subtitled arXiv preprint arXiv:2109.08777, A. Venkat, A. Gyulassy, G. Kosiba, A. Maiti, H. Reinstein, R. Gee, P.-T. Bremer, V. Pascucci. 2021.
In the study of packed granular materials, the performance of a sample (e.g., the detonation of a high-energy explosive) often correlates to measurements of a fluid flowing through it. The "effective surface area," the surface area accessible to the airflow, is typically measured using a permeametry apparatus that relates the flow conductance to the permeable surface area via the Carman-Kozeny equation. This equation allows calculating the flow rate of a fluid flowing through the granules packed in the sample for a given pressure drop. However, Carman-Kozeny makes inherent assumptions about tunnel shapes and flow paths that may not accurately hold in situations where the particles possess a wide distribution in shapes, sizes, and aspect ratios, as is true with many powdered systems of technological and commercial interest. To address this challenge, we replicate these measurements virtually on micro-CT images of the powdered material, introducing a new Pore Network Model based on the skeleton of the Morse-Smale complex. Pores are identified as basins of the complex, their incidence encodes adjacency, and the conductivity of the capillary between them is computed from the cross-section at their interface. We build and solve a resistive network to compute an approximate laminar fluid flow through the pore structure. We provide two means of estimating flow-permeable surface area: (i) by direct computation of conductivity, and (ii) by identifying dead-ends in the flow coupled with isosurface extraction and the application of the Carman-Kozeny equation, with the aim of establishing consistency over a range of particle shapes, sizes, porosity levels, and void distribution patterns.
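The resistive-network step mentioned above can be sketched on a toy pore graph. This is a minimal illustration, not the authors' pipeline: pores are nodes, capillaries are edges with made-up conductances (the paper derives them from interface cross-sections), pressures are fixed at an inlet and an outlet, and interior pressures are found with a simple Gauss-Seidel iteration chosen here for brevity. All function names are hypothetical.

```python
def solve_pressures(n_nodes, edges, fixed, iters=10_000, tol=1e-12):
    """Solve for node pressures so that net flow at every free node is zero.

    edges: list of (i, j, conductance); fixed: dict node -> prescribed pressure.
    """
    neighbors = {i: [] for i in range(n_nodes)}
    for i, j, g in edges:
        neighbors[i].append((j, g))
        neighbors[j].append((i, g))

    p = [fixed.get(i, 0.0) for i in range(n_nodes)]
    for _ in range(iters):
        change = 0.0
        for i in range(n_nodes):
            if i in fixed or not neighbors[i]:
                continue
            # Flow balance: pressure is the conductance-weighted neighbor mean.
            new = (sum(g * p[j] for j, g in neighbors[i])
                   / sum(g for _, g in neighbors[i]))
            change = max(change, abs(new - p[i]))
            p[i] = new
        if change < tol:
            break
    return p


def net_flow_out(node, edges, p):
    """Total flow leaving `node` through its incident edges."""
    q = 0.0
    for i, j, g in edges:
        if i == node:
            q += g * (p[i] - p[j])
        elif j == node:
            q += g * (p[j] - p[i])
    return q


# Toy network: inlet 0, outlet 3, two parallel paths through pores 1 and 2.
edges = [(0, 1, 1.0), (1, 3, 1.0), (0, 2, 2.0), (2, 3, 2.0)]
pressures = solve_pressures(4, edges, fixed={0: 1.0, 3: 0.0})
flow = net_flow_out(0, edges, pressures)
```

With a unit pressure drop, `flow` is the effective conductance of the network; for large pore networks a sparse direct or conjugate-gradient solver would replace the iteration used here.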
Visualizing Interactions Between Solar Photovoltaic Farms and the Atmospheric Boundary Layer|
T. M. Athawale, B. J. Stanislawski, S. Sane, C. R. Johnson. In Twelfth ACM International Conference on Future Energy Systems, pp. 377-381. 2021.
The efficiency of solar panels depends on the operating temperature. As the panel temperature rises, efficiency drops. Thus, the solar energy community aims to understand the factors that influence the operating temperature, which include wind speed, wind direction, turbulence, ambient temperature, mounting configuration, and solar cell material. We use high-resolution numerical simulations to model the flow and thermal behavior of idealized solar farms. Because these simulations model such complex behavior, advanced visualization techniques are needed to investigate and understand the results. Here, we present advanced 3D visualizations of numerical simulation results to illustrate the flow and heat transport in an idealized solar farm. The findings can be used to understand how flow behavior influences module temperatures, and vice versa.
Predicting intent behind selections in scatterplot visualizations|
K. Gadhave, J. Görtler, Z. Cutler, C. Nobre, O. Deussen, M. Meyer, J.M. Phillips, A. Lex. In Information Visualization, Vol. 20, No. 4, pp. 207-228. 2021.
Predicting and capturing an analyst’s intent behind a selection in a data visualization is valuable in two scenarios: First, a successful prediction of a pattern an analyst intended to select can be used to auto-complete a partial selection which, in turn, can improve the correctness of the selection. Second, knowing the intent behind a selection can be used to improve recall and reproducibility. In this paper, we introduce methods to infer analyst’s intents behind selections in data visualizations, such as scatterplots. We describe intents based on patterns in the data, and identify algorithms that can capture these patterns. Upon an interactive selection, we compare the selected items with the results of a large set of computed patterns, and use various ranking approaches to identify the best pattern for an analyst’s selection. We store annotations and the metadata to reconstruct a selection, such as the type of algorithm and its parameterization, in a provenance graph. We present a prototype system that implements these methods for tabular data and scatterplots. Analysts can select a prediction to auto-complete partial selections and to seamlessly log their intents. We discuss implications of our approach for reproducibility and reuse of analysis workflows. We evaluate our approach in a crowd-sourced study, where we show that auto-completing selection improves accuracy, and that we can accurately capture pattern-based intent.
Leveraging Topological Events in Tracking Graphs for Understanding Particle Diffusion|
T. McDonald, R. Shrestha, X. Yi, H. Bhatia, D. Chen, D. Goswami, V. Pascucci, T. Turbyville, P‐T Bremer. In Computer Graphics Forum, Vol. 40, No. 3, pp. 251-262. 2021.
Single particle tracking (SPT) of fluorescent molecules provides significant insights into the diffusion and relative motion of tagged proteins and other structures of interest in biology. However, despite the latest advances in high-resolution microscopy, individual particles are typically not distinguished from clusters of particles. This lack of resolution obscures potential evidence for how merging and splitting of particles affect their diffusion and any implications on the biological environment. The particle tracks are typically decomposed into individual segments at observed merge and split events, and analysis is performed without knowing the true count of particles in the resulting segments. Here, we address the challenges in analyzing particle tracks in the context of cancer biology. In particular, we study the tracks of KRAS protein, which is implicated in nearly 20% of all human cancers, and whose clustering and aggregation have been linked to the signaling pathway leading to uncontrolled cell growth. We present a new analysis approach for particle tracks by representing them as tracking graphs and using topological events (merging and splitting) to disambiguate the tracks. Using this analysis, we infer a lower bound on the count of particles as they cluster and create conditional distributions of diffusion speeds before and after merge and split events. Using thousands of time-steps of simulated and in-vitro SPT data, we demonstrate the efficacy of our method, as it offers biologists a new, detailed look into the relationship between KRAS clustering and diffusion speeds.
Investigating In Situ Reduction via Lagrangian Representations for Cosmology and Seismology Applications|
S. Sane, C. R. Johnson, H. Childs. In Computational Science – ICCS 2021, Springer International Publishing, pp. 436-450. 2021.
Although many types of computational simulations produce time-varying vector fields, subsequent analysis is often limited to single time slices due to excessive costs. Fortunately, a new approach using a Lagrangian representation can enable time-varying vector field analysis while mitigating these costs. With this approach, a Lagrangian representation is calculated while the simulation code is running, and the result is explored after the simulation. Importantly, the effectiveness of this approach varies based on the nature of the vector field, requiring in-depth investigation for each application area. With this study, we evaluate the effectiveness for previously unexplored cosmology and seismology applications. We do this by considering encumbrance (on the simulation) and accuracy (of the reconstructed result). To inform encumbrance, we integrated in situ infrastructure with two simulation codes, and evaluated on representative HPC environments, performing Lagrangian in situ reduction using GPUs as well as CPUs. To inform accuracy, our study conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. In all, we demonstrate effectiveness for both cosmology and seismology—time-varying vector fields from these domains can be reduced to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 80% of our experiments.
Scalable In Situ Computation of Lagrangian Representations via Local Flow Maps|
S. Sane, A. Yenpure, R. Bujack, M. Larsen, K. Moreland, C. Garth, C. R. Johnson, H. Childs. In Eurographics Symposium on Parallel Graphics and Visualization, The Eurographics Association, 2021.
In situ computation of Lagrangian flow maps to enable post hoc time-varying vector field analysis has recently become an active area of research. However, the current literature is largely limited to theoretical settings and lacks a solution to address scalability of the technique in distributed memory. To improve scalability, we propose and evaluate the benefits and limitations of a simple, yet novel, performance optimization. Our proposed optimization is a communication-free model resulting in local Lagrangian flow maps, requiring no message passing or synchronization between processes, intrinsically improving scalability, and thereby reducing overall execution time and alleviating the encumbrance placed on simulation codes from communication overheads. To evaluate our approach, we computed Lagrangian flow maps for four time-varying simulation vector fields and investigated how execution time and reconstruction accuracy are impacted by the number of GPUs per compute node, the total number of compute nodes, particles per rank, and storage intervals. Our study consisted of experiments computing Lagrangian flow maps with up to 67M particle trajectories over 500 cycles and used as many as 2048 GPUs across 512 compute nodes. In all, our study contributes an evaluation of a communication-free model as well as a scalability study of computing distributed Lagrangian flow maps at scale using in situ infrastructure on a modern supercomputer.
Distributed merge forest: a new fast and scalable approach for topological analysis at scale|
X. Huang, P. Klacansky, S. Petruzza, A. Gyulassy, P.T. Bremer, V. Pascucci. In Proceedings of the ACM International Conference on Supercomputing, pp. 367-377. 2021.
Topological analysis is used in several domains to identify and characterize important features in scientific data, and is now one of the established classes of techniques of proven practical use in scientific computing. The growth in parallelism and problem size tackled by modern simulations poses a particular challenge for these approaches. Fundamentally, the global encoding of topological features necessitates interprocess communication that limits their scaling. In this paper, we extend a new topological paradigm to the case of distributed computing, where the construction of a global merge tree is replaced by a distributed data structure, the merge forest, trading slower individual queries on the structure for faster end-to-end performance and scaling. Empirically, the queries that are most negatively affected also tend to have limited practical use. Our experimental results demonstrate the scalability of both the merge forest construction and the parallel queries needed in scientific workflows, and contrast this scalability with the two established alternatives that construct variations of a global tree.
NViSII: A Scriptable Tool for Photorealistic Image Generation|
Subtitled arXiv preprint arXiv:2105.13962, N. Morrical, J. Tremblay, Y. Lin, S. Tyree, S. Birchfield, V. Pascucci, I. Wald. 2021.
We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth maps, normal maps, material properties, and optical flow vectors, can also be generated. In this work, we discuss design goals, architecture, and performance. We demonstrate the use of data generated by path tracing for training an object detector and pose estimator, showing improved performance in sim-to-real transfer in situations that are difficult for traditional raster-based renderers. We offer this tool as an easy-to-use, performant, high-quality renderer for advancing research in synthetic data generation and deep learning.
Interactive Analysis for Large Volume Data from Fluorescence Microscopy at Cellular Precision|
Y. Wan, H.A. Holman, C. Hansen. In Computers & Graphics, Vol. 98, Pergamon, pp. 138-149. 2021.
The main objective for understanding fluorescence microscopy data is to investigate and evaluate the fluorescent signal intensity distributions as well as their spatial relationships across multiple channels. The quantitative analysis of 3D fluorescence microscopy data needs interactive tools for researchers to select and focus on relevant biological structures. We developed an interactive tool based on volume visualization techniques and GPU computing for streamlining rapid data analysis. Our main contribution is the implementation of common data quantification functions on streamed volumes, providing interactive analyses on large data without lengthy preprocessing. Data segmentation and quantification are coupled with brushing and executed at an interactive speed. A large volume is partitioned into data bricks, and only user-selected structures are analyzed to constrain the computational load. We designed a framework to assemble a sequence of GPU programs to handle brick borders and stitch analysis results. Our tool was developed in collaboration with domain experts and has been used to identify cell types. We demonstrate a workflow to analyze cells in vestibular epithelia of transgenic mice.
Spatio-Temporal Visualization of Interdependent Battery Bus Transit and Power Distribution Systems|
A. Bagherinezhad, M. Young, Bei Wang, M. Parvania. In IEEE PES Innovative Smart Grid Technologies Conference (ISGT), IEEE, 2021.
The high penetration of transportation electrification and its associated charging requirements magnify the interdependency of the transportation and power distribution systems. The emergent interdependency requires that system operators fully understand the status of both systems. To this end, a visualization tool is presented to illustrate the interdependency of battery bus transit and power distribution systems and the associated components. The tool aims at monitoring components from both systems, such as the locations of electric buses, the state of charge of batteries, the price of electricity, voltage, current, and active/reactive power flow. The results showcase the success of the visualization tool in monitoring the bus transit and power distribution components to determine a reliable cost-effective scheme for spatio-temporal charging of electric buses.
TopoAct: Visually Exploring the Shape of Activations in Deep Learning|
A. Rathore, N. Chalapathi, S. Palande, Bei Wang. In Computer Graphics Forum, Vol. 40, No. 1, pp. 382-397. 2021.
Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we aim to obtain a global view of what neurons detect by studying their activations. In particular, we develop visualizations that show the shape of the activation space, the organizational principle behind neuron activations, and the relationships of these activations within a layer. Applying tools from topological data analysis, we present TopoAct, a visual exploration system to study topological summaries of activation vectors. We present exploration scenarios using TopoAct that provide valuable insights into learned representations of neural networks. We expect TopoAct to give a topological perspective that enriches the current toolbox of neural network analysis, and to provide a basis for network architecture diagnosis and data anomaly detection.
Mapper Interactive: A Scalable, Extendable, and Interactive Toolbox for the Visual Exploration of High-Dimensional Data|
Y. Zhou, N. Chalapathi, A. Rathore, Y. Zhao, Bei Wang. In IEEE Pacific Visualization Symposium, 2021.
The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting practical data analysis. In particular, its command-line API can compute mapper graphs for 1 million points of 256 dimensions in about 3 minutes (4 times faster than the vanilla implementation). Its visual interface allows on-the-fly computation and manipulation of the mapper graph based on user-specified parameters and supports the addition of new analysis modules with a few lines of code. Mapper Interactive makes the mapper algorithm accessible to nonspecialists and accelerates topological analytics workflows.
Loon: Using Exemplars to Visualize Large Scale Microscopy Data|
D. Lange, E. Polanco, R. Judson-Torres, T. Zangle, A. Lex. In OSF Preprints, 2021.
Which drug is most promising for a cancer patient? This is a question a new microscopy-based approach for measuring the mass of individual cancer cells treated with different drugs promises to answer in only a few hours. However, the analysis pipeline for extracting data from these images is still far from complete automation: human intervention is necessary for quality control of preprocessing steps such as segmentation, to adjust filters and remove noise, and for the analysis of the result. To address this workflow, we developed Loon, a visualization tool for analyzing drug screening data based on quantitative phase microscopy imaging. Loon visualizes both derived data, such as growth rates, and imaging data. Since the images are collected automatically at a large scale, manual inspection of images and segmentations is infeasible. However, reviewing representative samples of cells is essential, both for quality control and for data analysis. We introduce a new approach of choosing and visualizing representative exemplar cells that retain a close connection to the low-level data. By tightly integrating the derived data visualization capabilities with the novel exemplar visualization and providing selection and filtering capabilities, Loon is well suited for making decisions about which drugs are suitable for a specific patient.
Adaptive Spatially Aware I/O for Multiresolution Particle Data Layouts|
W. Usher, X. Huang, S. Petruzza, S. Kumar, S. R. Slattery, S. T. Reeve, F. Wang, C. R. Johnson, V. Pascucci. In IPDPS, 2021.
Evaluation of GPU Volume Rendering in PyTorch Using Data-Parallel Primitives|
N. Marshak, P. Grosset, A. Knoll, J. P. Ahrens, C. R. Johnson. In Eurographics Symposium on Parallel Graphics and Visualization (EGPGV), 2021.
Data-parallel programming (DPP) has attracted considerable interest from the visualization community, fostering major software initiatives such as VTK-m. However, there has been relatively little recent investigation of data-parallel APIs in higher-level languages such as Python, which could help developers sidestep the need for low-level application programming in C++ and CUDA. Moreover, machine learning frameworks exposing data-parallel primitives, such as PyTorch and TensorFlow, have exploded in popularity, making them attractive platforms for parallel visualization and data analysis. In this work, we benchmark data-parallel primitives in PyTorch, and investigate its application to GPU volume rendering using two distinct DPP formulations: a parallel scan and reduce over the entire volume, and repeated application of data-parallel operators to an array of rays. We find that most relevant DPP primitives exhibit performance similar to a native CUDA library. However, our volume rendering implementation reveals that PyTorch is limited in expressiveness when compared to other DPP APIs. Furthermore, while render times are sufficient for an early "proof of concept", memory usage acutely limits scalability.