Scalable CPU Ray Tracing for In Situ Visualization Using OSPRay|
W. Usher, J. Amstutz, J. Günther, A. Knoll, G. P. Johnson, C. Brownlee, A. Hota, B. Cherniak, T. Rowley, J. Jeffers, V. Pascucci. In In Situ Visualization for Computational Science, Springer International Publishing, pp. 353--374. 2022.
In situ visualization increasingly involves rendering large numbers of images for post hoc exploration. As both the number of images to be rendered and the data being rendered are large, the scalability of the rendering component is of key concern. Furthermore, the renderer must be able to support a wide range of data distributions, simulation configurations, and HPC systems to provide the flexibility required for a portable, general purpose in situ rendering package. In this chapter, we discuss recent developments in OSPRay’s support for MPI-parallel applications to provide a flexible and scalable rendering API, with a focus on how these developments can be applied to enable scalable, high-quality in situ visualization.
A Review of Three-Dimensional Medical Image Visualization|
L. Zhou, M. Fan, C. Hansen, C. R. Johnson, D. Weiskopf. In Health Data Science, Vol. 2022, 2022.
Importance. Medical images are essential for modern medicine and an important research subject in visualization. However, medical experts are often not aware of the many advanced three-dimensional (3D) medical image visualization techniques that could increase their capabilities in data analysis and assist the decision-making process for specific medical problems. Our paper provides a review of 3D visualization techniques for medical images, intending to bridge the gap between medical experts and visualization researchers. Highlights. Fundamental visualization techniques are revisited for various medical imaging modalities, from computational tomography to diffusion tensor imaging, featuring techniques that enhance spatial perception, which is critical for medical practices. The state-of-the-art of medical visualization is reviewed based on a procedure-oriented classification of medical problems for studies of individuals and populations. This paper summarizes free software tools for different modalities of medical images designed for various purposes, including visualization, analysis, and segmentation, and it provides respective Internet links. Conclusions. Visualization techniques are a useful tool for medical experts to tackle specific medical problems in their daily work. Our review provides a quick reference to such techniques given the medical problem and modalities of associated medical images. We summarize fundamental techniques and readily available visualization tools to help medical experts to better understand and utilize medical imaging data. This paper could contribute to the joint effort of the medical and visualization communities to advance precision medicine.
Exploratory Lagrangian-Based Particle Tracing Using Deep Learning|
M. Han, S. Sane, C. R. Johnson. In Journal of Flow Visualization and Image Processing, Begell, 2022.
Time-varying vector fields produced by computational fluid dynamics simulations are often prohibitively large and pose challenges for accurate interactive analysis and exploration. To address these challenges, reduced Lagrangian representations have been increasingly researched as a means to improve scientific time-varying vector field exploration capabilities. This paper presents a novel deep neural network-based particle tracing method to explore time-varying vector fields represented by Lagrangian flow maps. In our workflow, in situ processing is first utilized to extract Lagrangian flow maps, and deep neural networks then use the extracted data to learn flow field behavior. Using a trained model to predict new particle trajectories offers a fixed small memory footprint and fast inference. To demonstrate and evaluate the proposed method, we perform an in-depth study of performance using a well-known analytical data set, the Double Gyre. Our study considers two flow map extraction strategies, the impact of the number of training samples and integration durations on efficacy, evaluates multiple sampling options for training and testing, and informs hyperparameter settings. Overall, we find our method requires a fixed memory footprint of 10.5 MB to encode a Lagrangian representation of a time-varying vector field while maintaining accuracy. For post hoc analysis, loading the trained model costs only two seconds, significantly reducing the burden of I/O when reading data for visualization. Moreover, our parallel implementation can infer one hundred locations for each of two thousand new pathlines in 1.3 seconds using one NVIDIA Titan RTX GPU.
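The Double Gyre mentioned in this abstract is a closed-form, time-dependent 2D flow, so a Lagrangian flow map (start position to end position) can be sampled directly by numerical integration. The following is a minimal sketch, not the authors' implementation: the parameter values `A`, `EPS`, and `OMEGA` are the commonly used defaults and are an assumption here, and the helper names `double_gyre_velocity`, `rk4_step`, and `trace_pathline` are hypothetical.

```python
import math

# Commonly used Double Gyre parameters (an assumption; the paper's settings may differ).
A = 0.1
EPS = 0.25
OMEGA = 2.0 * math.pi / 10.0


def double_gyre_velocity(x, y, t):
    """Return the velocity (u, v) of the Double Gyre on the domain [0, 2] x [0, 1]."""
    s = math.sin(OMEGA * t)
    a = EPS * s
    b = 1.0 - 2.0 * EPS * s
    f = a * x * x + b * x
    dfdx = 2.0 * a * x + b
    u = -math.pi * A * math.sin(math.pi * f) * math.cos(math.pi * y)
    v = math.pi * A * math.cos(math.pi * f) * math.sin(math.pi * y) * dfdx
    return u, v


def rk4_step(x, y, t, dt):
    """Advance one particle by one fourth-order Runge-Kutta step."""
    k1 = double_gyre_velocity(x, y, t)
    k2 = double_gyre_velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], t + 0.5 * dt)
    k3 = double_gyre_velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], t + 0.5 * dt)
    k4 = double_gyre_velocity(x + dt * k3[0], y + dt * k3[1], t + dt)
    x_new = x + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
    y_new = y + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return x_new, y_new


def trace_pathline(x0, y0, t0, duration, n_steps):
    """Integrate a pathline; the first and last points form one flow map sample."""
    dt = duration / n_steps
    points = [(x0, y0)]
    x, y, t = x0, y0, t0
    for _ in range(n_steps):
        x, y = rk4_step(x, y, t, dt)
        t += dt
        points.append((x, y))
    return points


pathline = trace_pathline(0.5, 0.25, 0.0, 10.0, 1000)
start, end = pathline[0], pathline[-1]
```

Pairs such as `(start, end)` are the kind of training sample a network could learn from; how the paper samples seeds and integration durations is described in the publication itself.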
Demonstrating the viability of Lagrangian in situ reduction on supercomputers|
S. Sane, C. R. Johnson, H. Childs. In Journal of Computational Science, Vol. 61, Elsevier, 2022.
Performing exploratory analysis and visualization of large-scale time-varying computational science applications is challenging due to inaccuracies that arise from under-resolved data. In recent years, Lagrangian representations of the vector field computed using in situ processing are being increasingly researched and have emerged as a potential solution to enable exploration. However, prior works have offered limited estimates of the encumbrance on the simulation code as they consider “theoretical” in situ environments. Further, the effectiveness of this approach varies based on the nature of the vector field, benefitting from an in-depth investigation for each application area. With this study, an extended version of Sane et al. (2021), we contribute an evaluation of Lagrangian analysis viability and efficacy for simulation codes executing at scale on a supercomputer. We investigated previously unexplored cosmology and seismology applications as well as conducted a performance benchmarking study by using a hydrodynamics mini-application targeting exascale computing. To inform encumbrance, we integrated in situ infrastructure with simulation codes, and evaluated Lagrangian in situ reduction in representative homogeneous and heterogeneous HPC environments. To inform post hoc accuracy, we conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. Additionally, our study contributes cost estimates for distributed-memory post hoc reconstruction. In all, we demonstrate viability for each application — data reduction to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 90% of our experiments.
Uncertainty Visualization of 2D Morse Complex Ensembles Using Statistical Summary Maps|
T. M. Athawale, D. Maljovec, L. Yan, C. R. Johnson, V. Pascucci, B. Wang. In IEEE Transactions on Visualization and Computer Graphics, Vol. 28, No. 4, pp. 1955-1966. April, 2022.
Morse complexes are gradient-based topological descriptors with close connections to Morse theory. They are widely applicable in scientific visualization as they serve as important abstractions for gaining insights into the topology of scalar fields. Data uncertainty inherent to scalar fields due to randomness in their acquisition and processing, however, limits our understanding of Morse complexes as structural abstractions. We, therefore, explore uncertainty visualization of an ensemble of 2D Morse complexes that arises from scalar fields coupled with data uncertainty. We propose several statistical summary maps as new entities for quantifying structural variations and visualizing positional uncertainties of Morse complexes in ensembles. Specifically, we introduce three types of statistical summary maps – the probabilistic map, the significance map, and the survival map – to characterize the uncertain behaviors of gradient flows. We demonstrate the utility of our proposed approach using wind, flow, and ocean eddy simulation datasets.
AMM: Adaptive Multilinear Meshes|
H. Bhatia, D. Hoang, N. Morrical, V. Pascucci, P.T. Bremer, P. Lindstrom. arXiv:2007.15219, 2021.
Adaptive representations are increasingly indispensable for reducing the in-memory and on-disk footprints of large-scale data. Usual solutions are designed broadly along two themes: reducing data precision, e.g., through compression, or adapting data resolution, e.g., using spatial hierarchies. Recent research suggests that combining the two approaches, i.e., adapting both resolution and precision simultaneously, can offer significant gains over using them individually. However, there currently exist no practical solutions to creating and evaluating such representations at scale. In this work, we present a new resolution-precision-adaptive representation to support hybrid data reduction schemes and offer an interface to existing tools and algorithms. Through novelties in spatial hierarchy, our representation, Adaptive Multilinear Meshes (AMM), provides considerable reduction in the mesh size. AMM creates a piecewise multilinear representation of uniformly sampled scalar data and can selectively relax or enforce constraints on conformity, continuity, and coverage, delivering a flexible adaptive representation. AMM also supports representing the function using mixed-precision values to further the achievable gains in data reduction. We describe a practical approach to creating AMM incrementally using arbitrary orderings of data and demonstrate AMM on six types of resolution and precision datastreams. By interfacing with state-of-the-art rendering tools through VTK, we demonstrate the practical and computational advantages of our representation for visualization techniques. With an open-source release of our tool to create AMM, we make such evaluation of data reduction accessible to the community, which we hope will foster new opportunities and future data reduction schemes.
Translational computer science at the scientific computing and imaging institute|
C. R. Johnson. In Journal of Computational Science, Vol. 52, pp. 101217. 2021.
The Scientific Computing and Imaging (SCI) Institute at the University of Utah evolved from the SCI research group, started in 1994 by Professors Chris Johnson and Rob MacLeod. Over time, research centers funded by the National Institutes of Health, Department of Energy, and State of Utah significantly spurred growth, and SCI became a permanent interdisciplinary research institute in 2000. The SCI Institute is now home to more than 150 faculty, students, and staff. The history of the SCI Institute is underpinned by a culture of multidisciplinary, collaborative research, which led to its emergence as an internationally recognized leader in the development and use of visualization, scientific computing, and image analysis research to solve important problems in a broad range of domains in biomedicine, science, and engineering. A particular hallmark of SCI Institute research is the creation of open source software systems, including the SCIRun scientific problem-solving environment, Seg3D, ImageVis3D, Uintah, ViSUS, Nektar++, VisTrails, FluoRender, and FEBio. At this point, the SCI Institute has made more than 50 software packages broadly available to the scientific community under open-source licensing and supports them through web pages, documentation, and user groups. While the vast majority of academic research software is written and maintained by graduate students, the SCI Institute employs several professional software developers to help create, maintain, and document robust, tested, well-engineered open source software. The story of how and why we worked, and often struggled, to make professional software engineers an integral part of an academic research institute is crucial to the larger story of the SCI Institute’s success in translational computer science (TCS).
Uncertainty Quantification in Brain Stimulation using UncertainSCI|
J. Tate, S. Rampersad, C. Charlebois, Z. Liu, J. Bergquist, D. White, L. Rupp, D. Brooks, A. Narayan, R. MacLeod. In Brain Stimulation: Basic, Translational, and Clinical Research in Neuromodulation, Vol. 14, No. 6, Elsevier, pp. 1659-1660. 2021.
Predicting the effects of brain stimulation with computer models presents many challenges, including estimating the possible error from the propagation of uncertain input parameters through the model. Quantification and control of these errors through uncertainty quantification (UQ) provide statistics on the likely impact of parameter variation on solution accuracy, including total variance and sensitivity associated with each parameter. While the need for and importance of UQ in clinical modeling are generally accepted, tools for implementing UQ techniques remain limited or inaccessible for many researchers.
Direct Volume Rendering with Nonparametric Models of Uncertainty|
T. M. Athawale, B. Ma, E. Sakhaee, C. R. Johnson, A. Entezari. In IEEE Transactions on Visualization and Computer Graphics, Vol. 27, No. 2, pp. 1797-1807. 2021.
We present a nonparametric statistical framework for the quantification, analysis, and propagation of data uncertainty in direct volume rendering (DVR). The state-of-the-art statistical DVR framework allows for preserving the transfer function (TF) of the ground truth function when visualizing uncertain data; however, the existing framework is restricted to parametric models of uncertainty. In this paper, we address the limitations of the existing DVR framework by extending the DVR framework for nonparametric distributions. We exploit the quantile interpolation technique to derive probability distributions representing uncertainty in viewing-ray sample intensities in closed form, which allows for accurate and efficient computation. We evaluate our proposed nonparametric statistical models through qualitative and quantitative comparisons with the mean-field and parametric statistical models, such as uniform and Gaussian, as well as Gaussian mixtures. In addition, we present an extension of the state-of-the-art rendering parametric framework to 2D TFs for improved DVR classifications. We show the applicability of our uncertainty quantification framework to ensemble, downsampled, and bivariate versions of scalar field datasets.
Uncertainty Visualization of the Marching Squares and Marching Cubes Topology Cases|
T. M. Athawale, S. Sane, C. R. Johnson. arXiv:2108.03066, 2021.
Marching squares (MS) and marching cubes (MC) are widely used algorithms for level-set visualization of scientific data. In this paper, we address the challenge of uncertainty visualization of the topology cases of the MS and MC algorithms for uncertain scalar field data sampled on a uniform grid. The visualization of the MS and MC topology cases for uncertain data is challenging due to their exponential nature and the possibility of multiple topology cases per cell of a grid. We propose the topology case count and entropy-based techniques for quantifying uncertainty in the topology cases of the MS and MC algorithms when noise in data is modeled with probability distributions. We demonstrate the applicability of our techniques for independent and correlated uncertainty assumptions. We visualize the quantified topological uncertainty via color mapping proportional to uncertainty, as well as with interactive probability queries in the MS case and entropy isosurfaces in the MC case. We demonstrate the utility of our uncertainty quantification framework in identifying the isovalues exhibiting relatively high topological uncertainty. We illustrate the effectiveness of our techniques via results on synthetic, simulation, and hixel datasets.
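The idea of quantifying topology case uncertainty with a case count and an entropy can be illustrated for a single marching squares cell. The sketch below is not the authors' implementation. It assumes the independent-noise setting and assumes that each of the four cell vertices already comes with a probability `p_above` of lying above the isovalue; the function names are hypothetical.

```python
import math
from itertools import product


def case_probabilities(p_above):
    """Probability of each of the 16 marching squares sign configurations.

    p_above holds, for each of the four cell vertices, the probability that the
    vertex value exceeds the isovalue. Vertices are assumed to be independent.
    """
    probs = {}
    for signs in product((0, 1), repeat=4):
        p = 1.0
        for bit, pv in zip(signs, p_above):
            p *= pv if bit else (1.0 - pv)
        probs[signs] = p
    return probs


def topology_case_count(p_above):
    """Number of sign configurations that can occur with nonzero probability."""
    return sum(1 for p in case_probabilities(p_above).values() if p > 0.0)


def topology_entropy(p_above):
    """Shannon entropy (in bits) of the distribution over sign configurations."""
    return -sum(p * math.log2(p) for p in case_probabilities(p_above).values() if p > 0.0)


certain_cell = [1.0, 0.0, 1.0, 1.0]    # every vertex classification is known
ambiguous_cell = [0.5, 0.5, 0.5, 0.5]  # every vertex is a coin flip
```

A cell with fully known vertex classifications has one possible case and zero entropy, while a cell whose four vertices are all equally likely to be above or below the isovalue reaches the maximum of 4 bits. Mapping such per-cell values to color would be one way to highlight regions of high topological uncertainty.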
Data-Driven Estimation of Temporal-Sampling Errors in Unsteady Flows|
H. Bhatia, S. N. Petruzza, R. Anirudh, A. G. Gyulassy, R. M. Kirby, V. Pascucci, P. T. Bremer. 2021.
While computer simulations typically store data at the highest available spatial resolution, it is often infeasible to do so for the temporal dimension. Instead, the common practice is to store data at regular intervals, the frequency of which is strictly limited by the available storage and I/O bandwidth. However, this manner of temporal subsampling can cause significant errors in subsequent analysis steps. More importantly, since the intermediate data is lost, there is no direct way of measuring this error after the fact. One particularly important use case that is affected is the analysis of unsteady flows using pathlines, as it depends on an accurate interpolation across time. Although the potential problem with temporal undersampling is widely acknowledged, there currently does not exist a practical way to estimate the potential impact. This paper presents a simple-to-implement yet powerful technique to estimate the error in pathlines due to temporal subsampling. Given an unsteady flow, we compute pathlines at the given temporal resolution as well as subsamples thereof. We then compute the error induced due to various levels of subsampling and use it to estimate the error between the given resolution and the unknown ground truth. Using two turbulent flows, we demonstrate that our approach, for the first time, provides an accurate, a posteriori error estimate for pathline computations. This estimate will enable scientists to better understand the uncertainties involved in pathline-based analysis techniques and can lead to new uncertainty visualization approaches using the predicted errors.
Reusing Interactive Analysis Workflows|
K. Gadhave, Z.T. Cutler, A. Lex. OSF Preprints, 2021.
Interactive visual analysis has many advantages, but has the disadvantage that analysis processes and workflows cannot be easily stored and reused, which is in contrast to scripted analysis workflows using a programming language such as Python. In this paper, we introduce methods to semantically capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. We design these workflows to be robust to updates in the dataset by capturing the semantics of underlying interactions, and, hence, they can be applied to updated datasets. We demonstrate this specification using a prototype that visualizes the data, shows interaction provenance, and allows generating workflows from this provenance. Finally, we introduce a Python library that can consume the workflow and apply it to the datasets, providing a seamless bridge between computational workflows and interactive visualization tools. We demonstrate our techniques using our UI prototype and Jupyter notebooks.
Towards replacing physical testing of granular materials with a Topology-based Model|
A. Venkat, A. Gyulassy, G. Kosiba, A. Maiti, H. Reinstein, R. Gee, P.-T. Bremer, V. Pascucci. arXiv preprint arXiv:2109.08777, 2021.
In the study of packed granular materials, the performance of a sample (e.g., the detonation of a high-energy explosive) often correlates to measurements of a fluid flowing through it. The "effective surface area," the surface area accessible to the airflow, is typically measured using a permeametry apparatus that relates the flow conductance to the permeable surface area via the Carman-Kozeny equation. This equation allows calculating the flow rate of a fluid flowing through the granules packed in the sample for a given pressure drop. However, Carman-Kozeny makes inherent assumptions about tunnel shapes and flow paths that may not accurately hold in situations where the particles possess a wide distribution in shapes, sizes, and aspect ratios, as is true with many powdered systems of technological and commercial interest. To address this challenge, we replicate these measurements virtually on micro-CT images of the powdered material, introducing a new Pore Network Model based on the skeleton of the Morse-Smale complex. Pores are identified as basins of the complex, their incidence encodes adjacency, and the conductivity of the capillary between them is computed from the cross-section at their interface. We build and solve a resistive network to compute an approximate laminar fluid flow through the pore structure. We provide two means of estimating flow-permeable surface area: (i) by direct computation of conductivity, and (ii) by identifying dead-ends in the flow coupled with isosurface extraction and the application of the Carman-Kozeny equation, with the aim of establishing consistency over a range of particle shapes, sizes, porosity levels, and void distribution patterns.
Visualizing Interactions Between Solar Photovoltaic Farms and the Atmospheric Boundary Layer|
T. M. Athawale, B. J. Stanislawski, S. Sane, C. R. Johnson. In Twelfth ACM International Conference on Future Energy Systems, pp. 377--381. 2021.
The efficiency of solar panels depends on the operating temperature. As the panel temperature rises, efficiency drops. Thus, the solar energy community aims to understand the factors that influence the operating temperature, which include wind speed, wind direction, turbulence, ambient temperature, mounting configuration, and solar cell material. We use high-resolution numerical simulations to model the flow and thermal behavior of idealized solar farms. Because these simulations model such complex behavior, advanced visualization techniques are needed to investigate and understand the results. Here, we present advanced 3D visualizations of numerical simulation results to illustrate the flow and heat transport in an idealized solar farm. The findings can be used to understand how flow behavior influences module temperatures, and vice versa.
Predicting intent behind selections in scatterplot visualizations|
K. Gadhave, J. Görtler, Z. Cutler, C. Nobre, O. Deussen, M. Meyer, J.M. Phillips, A. Lex. In Information Visualization, Vol. 20, No. 4, pp. 207-228. 2021.
Predicting and capturing an analyst’s intent behind a selection in a data visualization is valuable in two scenarios: First, a successful prediction of a pattern an analyst intended to select can be used to auto-complete a partial selection which, in turn, can improve the correctness of the selection. Second, knowing the intent behind a selection can be used to improve recall and reproducibility. In this paper, we introduce methods to infer analyst’s intents behind selections in data visualizations, such as scatterplots. We describe intents based on patterns in the data, and identify algorithms that can capture these patterns. Upon an interactive selection, we compare the selected items with the results of a large set of computed patterns, and use various ranking approaches to identify the best pattern for an analyst’s selection. We store annotations and the metadata to reconstruct a selection, such as the type of algorithm and its parameterization, in a provenance graph. We present a prototype system that implements these methods for tabular data and scatterplots. Analysts can select a prediction to auto-complete partial selections and to seamlessly log their intents. We discuss implications of our approach for reproducibility and reuse of analysis workflows. We evaluate our approach in a crowd-sourced study, where we show that auto-completing selection improves accuracy, and that we can accurately capture pattern-based intent.
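Comparing a selection against a set of precomputed patterns and ranking the matches can be sketched in a few lines. The example below is illustrative only: it assumes Jaccard similarity as the ranking measure (the abstract mentions "various ranking approaches" without naming them), and the pattern names and item ids are made up for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of item ids."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rank_patterns(selection, patterns):
    """Rank candidate patterns by how well they match the analyst's selection.

    patterns maps a pattern name to the set of item ids that pattern covers.
    Returns (name, score) pairs, best match first.
    """
    scored = [(name, jaccard(selection, items)) for name, items in patterns.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical patterns computed on a scatterplot's items.
patterns = {
    "cluster": {1, 2, 3, 4},
    "outliers": {9, 10},
    "range": {2, 3},
}
ranking = rank_patterns({1, 2, 3}, patterns)
best_name, best_score = ranking[0]
```

Here the partial selection `{1, 2, 3}` is best explained by the `cluster` pattern, so auto-completing the selection would add item 4.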
Leveraging Topological Events in Tracking Graphs for Understanding Particle Diffusion|
T. McDonald, R. Shrestha, X. Yi, H. Bhatia, D. Chen, D. Goswami, V. Pascucci, T. Turbyville, P‐T Bremer. In Computer Graphics Forum, Vol. 40, No. 3, pp. 251-262. 2021.
Single particle tracking (SPT) of fluorescent molecules provides significant insights into the diffusion and relative motion of tagged proteins and other structures of interest in biology. However, despite the latest advances in high-resolution microscopy, individual particles are typically not distinguished from clusters of particles. This lack of resolution obscures potential evidence for how merging and splitting of particles affect their diffusion and any implications on the biological environment. The particle tracks are typically decomposed into individual segments at observed merge and split events, and analysis is performed without knowing the true count of particles in the resulting segments. Here, we address the challenges in analyzing particle tracks in the context of cancer biology. In particular, we study the tracks of KRAS protein, which is implicated in nearly 20% of all human cancers, and whose clustering and aggregation have been linked to the signaling pathway leading to uncontrolled cell growth. We present a new analysis approach for particle tracks by representing them as tracking graphs and using topological events (merging and splitting) to disambiguate the tracks. Using this analysis, we infer a lower bound on the count of particles as they cluster and create conditional distributions of diffusion speeds before and after merge and split events. Using thousands of time-steps of simulated and in-vitro SPT data, we demonstrate the efficacy of our method, as it offers biologists a new, detailed look into the relationship between KRAS clustering and diffusion speeds.
Investigating In Situ Reduction via Lagrangian Representations for Cosmology and Seismology Applications|
S. Sane, C. R. Johnson, H. Childs. In Computational Science -- ICCS 2021, Springer International Publishing, pp. 436--450. 2021.
Although many types of computational simulations produce time-varying vector fields, subsequent analysis is often limited to single time slices due to excessive costs. Fortunately, a new approach using a Lagrangian representation can enable time-varying vector field analysis while mitigating these costs. With this approach, a Lagrangian representation is calculated while the simulation code is running, and the result is explored after the simulation. Importantly, the effectiveness of this approach varies based on the nature of the vector field, requiring in-depth investigation for each application area. With this study, we evaluate the effectiveness for previously unexplored cosmology and seismology applications. We do this by considering encumbrance (on the simulation) and accuracy (of the reconstructed result). To inform encumbrance, we integrated in situ infrastructure with two simulation codes, and evaluated on representative HPC environments, performing Lagrangian in situ reduction using GPUs as well as CPUs. To inform accuracy, our study conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. In all, we demonstrate effectiveness for both cosmology and seismology—time-varying vector fields from these domains can be reduced to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 80% of our experiments.
Scalable In Situ Computation of Lagrangian Representations via Local Flow Maps|
S. Sane, A. Yenpure, R. Bujack, M. Larsen, K. Moreland, C. Garth, C. R. Johnson, H. Childs. In Eurographics Symposium on Parallel Graphics and Visualization, The Eurographics Association, 2021.
In situ computation of Lagrangian flow maps to enable post hoc time-varying vector field analysis has recently become an active area of research. However, the current literature is largely limited to theoretical settings and lacks a solution to address scalability of the technique in distributed memory. To improve scalability, we propose and evaluate the benefits and limitations of a simple, yet novel, performance optimization. Our proposed optimization is a communication-free model resulting in local Lagrangian flow maps, requiring no message passing or synchronization between processes, intrinsically improving scalability, and thereby reducing overall execution time and alleviating the encumbrance placed on simulation codes from communication overheads. To evaluate our approach, we computed Lagrangian flow maps for four time-varying simulation vector fields and investigated how execution time and reconstruction accuracy are impacted by the number of GPUs per compute node, the total number of compute nodes, particles per rank, and storage intervals. Our study consisted of experiments computing Lagrangian flow maps with up to 67M particle trajectories over 500 cycles and used as many as 2048 GPUs across 512 compute nodes. In all, our study contributes an evaluation of a communication-free model as well as a scalability study of computing distributed Lagrangian flow maps at scale using in situ infrastructure on a modern supercomputer.
Distributed merge forest: a new fast and scalable approach for topological analysis at scale|
X. Huang, P. Klacansky, S. Petruzza, A. Gyulassy, P.T. Bremer, V. Pascucci. In Proceedings of the ACM International Conference on Supercomputing, pp. 367-377. 2021.
Topological analysis is used in several domains to identify and characterize important features in scientific data, and is now one of the established classes of techniques of proven practical use in scientific computing. The growth in parallelism and problem size tackled by modern simulations poses a particular challenge for these approaches. Fundamentally, the global encoding of topological features necessitates interprocess communication that limits their scaling. In this paper, we extend a new topological paradigm to the case of distributed computing, where the construction of a global merge tree is replaced by a distributed data structure, the merge forest, trading slower individual queries on the structure for faster end-to-end performance and scaling. Empirically, the queries that are most negatively affected also tend to have limited practical use. Our experimental results demonstrate the scalability of both the merge forest construction and the parallel queries needed in scientific workflows, and contrast this scalability with the two established alternatives that construct variations of a global tree.
NViSII: A Scriptable Tool for Photorealistic Image Generation|
N. Morrical, J. Tremblay, Y. Lin, S. Tyree, S. Birchfield, V. Pascucci, I. Wald. arXiv preprint arXiv:2105.13962, 2021.
We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth maps, normal maps, material properties, and optical flow vectors, can also be generated. In this work, we discuss design goals, architecture, and performance. We demonstrate the use of data generated by path tracing for training an object detector and pose estimator, showing improved performance in sim-to-real transfer in situations that are difficult for traditional raster-based renderers. We offer this tool as an easy-to-use, performant, high-quality renderer for advancing research in synthetic data generation and deep learning.