Designed especially for neurobiologists, FluoRender is an interactive tool for multi-channel fluorescence microscopy data visualization and analysis.

BrainStimulator is a set of networks that are used in SCIRun to perform simulations of brain stimulation such as transcranial direct current stimulation (tDCS) and transcranial magnetic stimulation (TMS).

Developing software tools for science has always been a central vision of the SCI Institute.

Visualization

Visualization, sometimes referred to as visual data analysis, uses the graphical representation of data as a means of gaining understanding and insight into the data. Visualization research at SCI has focused on applications spanning computational fluid dynamics, medical imaging and analysis, biomedical data analysis, healthcare data analysis, weather data analysis, poetry, network and graph analysis, financial data analysis, etc.

Research ranges from developing novel algorithms and techniques to building tools and systems that assist in the comprehension of massive amounts of (scientific) data. We also research the process of creating successful visualizations.

We strongly believe in the role of interactivity in visual data analysis. Therefore, much of our research is concerned with creating visualizations that are intuitive to interact with and also render at interactive rates.

Visualization at SCI includes the academic subfields of Scientific Visualization, Information Visualization and Visual Analytics.

Charles Hansen

Volume Rendering
Ray Tracing
Graphics

Valerio Pascucci

Topological Methods
Data Streaming
Big Data

Chris Johnson

Scalar, Vector, and Tensor Field Visualization
Uncertainty Visualization

Mike Kirby

Uncertainty Visualization

Ross Whitaker

Topological Methods
Uncertainty Visualization

Alex Lex

Information Visualization

Bei Wang

Information Visualization
Scientific Visualization
Topological Data Analysis

Centers and Labs:

Funded Research Projects:

SCALE MoDL: Advancing Theoretical Minimax Deep Learning: Optimization, Resilience, and Interpretability

Bei Wang
The past decade has witnessed the great success of deep learning in broad societal and commercial applications. However, conventional deep learning relies on fitting data with neural networks, which is known to produce models that lack resilience. The next-generation deep learning paradigm needs to deliver resilient models that promote robustness to malicious attacks, fairness among users, and privacy preservation. In this project, the investigators will collaboratively develop a comprehensive minimax learning theory that advances the fundamental understanding of minimax deep learning from the perspectives of optimization, resilience, and interpretability.

Enabling Reproducibility of Interactive Visual Data Analysis

Alex Lex
Reproducibility and justifiability are widely recognized as critical aspects of data-driven decision making in fields as varied as scientific research, business, healthcare, and intelligence analysis. This project is concerned with enabling reproducibility and justifiability of decisions in the data analysis process, specifically as it relates to visual data analysis. Visualization is an important tool for discovery, yet decisions made by humans based on visualizations of data are difficult to capture and to justify. This project will develop methods to justify, communicate, and audit decisions made based on visual analysis. This, in turn, will lead to better outcomes, achieved with less effort and cost. The increasing use of visual analysis tools for decision making will make data analysis accessible to a broad variety of people, as visual analysis tools are generally easier to use than scripting languages and do not require extensive computational and statistical training. This research and its related activities increase accessibility and enhance the data analysis infrastructure for research and education.

To achieve these goals, this research will develop a framework for making visual analysis sessions not only reproducible but also reusable. The approach is based on tracking semantically meaningful provenance data during an interactive visual analysis session. Once a discovery is made, analysts can use this history to curate a succinct analysis story, adding justifications and explanations to make their analysis reproducible by others. Using a semi-automatic process, analysts will be able to make their actions data-aware, so that their analysis processes become robust to changes, such as updates in the data. A second contribution of the proposed work is the integration of visual analysis into computational analysis processes. While visualization is commonly used to present computational analysis results, the results of a visual analysis session are rarely used to feed into further computational processes. The techniques developed in this project will allow analysts to feed analysis results (selections, aggregations, filters, etc.) back into a computational environment. This will make it possible to use interactive visualization at any point in the data analysis process while maintaining reproducibility and enabling reuse. The expected results include new methods to capture user intent, to create data stories from analysis processes, and to integrate computational and visual data analysis, leveraging the strengths of both human abilities and computational power. The results will be disseminated in publications and as open source software, and will be accessible via the project website (http://vdl.sci.utah.edu/projects/2018-nsf-reproducibility/).
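As a rough illustration of the provenance idea described above, the sketch below records each interaction as a named, replayable action. It is not the project's software: the `ProvenanceLog` class, its method names, and the `filter_above` helper are hypothetical. The point it tries to show is that storing a rule, rather than the indices of the selected rows, lets a recorded analysis be replayed after the data is updated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]
Step = Callable[[List[Row]], List[Row]]


@dataclass
class Action:
    """One semantically meaningful step of an analysis session (hypothetical)."""
    label: str   # human-readable justification for the step
    apply: Step  # transformation applied to the current view of the data


@dataclass
class ProvenanceLog:
    """Hypothetical provenance log: an ordered list of replayable actions."""
    actions: List[Action] = field(default_factory=list)

    def record(self, label: str, apply: Step) -> None:
        self.actions.append(Action(label, apply))

    def replay(self, rows: List[Row]) -> List[Row]:
        # Re-run every recorded step, in order, on (possibly updated) data.
        for action in self.actions:
            rows = action.apply(rows)
        return rows

    def story(self) -> List[str]:
        # In this sketch the "analysis story" is just the ordered justifications.
        return [f"{i + 1}. {a.label}" for i, a in enumerate(self.actions)]


def filter_above(column: str, threshold: float) -> Step:
    """Data-aware selection: keeps the rule, not the indices of selected rows."""
    return lambda rows: [r for r in rows if r[column] > threshold]


log = ProvenanceLog()
log.record("Keep samples with score above 0.5", filter_above("score", 0.5))

original = [{"score": 0.2}, {"score": 0.7}]
updated = original + [{"score": 0.9}]  # the dataset receives a new row

first_result = log.replay(original)
second_result = log.replay(updated)
```

Replaying the same log on `updated` picks up the new row automatically, which is the behavior an index-based selection would miss.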

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Reproducible Visual Analysis of Multivariate Networks with Multinet

Miriah Meyer, Bryan Jones, Alexander Lex
Multivariate networks -- datasets that link together entities that are associated with multiple different variables -- are a critical data representation for a range of high-impact problems, from understanding how our bodies work to uncovering how social media influences society. These data representations are a rich and complex reflection of the multifaceted relationships that exist in the world. Reasoning about a problem using a multivariate network allows an analyst to ask questions beyond those about explicit connectivity alone: Do groups of social-media influencers have similar backgrounds or experiences? Do species that co-evolve live in similar climates? What patterns of cell-types support different types of brain functions? Questions like these require understanding patterns and trends about entities with respect to both their attributes and their connectivity, leading to inferences about relationships beyond the initial network structure. As data continues to become an increasingly important driver of scientific discovery, datasets of networks have also become increasingly complex. These networks capture information about relationships between entities as well as attributes of the entities and the connections. Tools used in practice today provide very limited support for reasoning about networks and are also limited in how users can interact with them. This lack of support leaves analysts and scientists to piece together workflows using separate tools and significant amounts of programming, especially in the data preparation step. This project aims to fill this critical gap in the existing cyber-infrastructure ecosystem for reasoning about multivariate networks by developing MultiNet, a robust, flexible, secure, and sustainable open-source visual analysis system.

MultiNet aims to change the landscape of visual analysis capabilities for reasoning about and analyzing multivariate networks. The web-based tool, along with an underlying plug-in-based framework, will support three core capabilities: (1) interactive, task-driven visualization of both the connectivity and attributes of networks, (2) reshaping the underlying network structure to bring the network into a shape that is well suited to address analysis questions, and (3) leveraging provenance data to support reproducibility, communication, and integration in computational workflows. These capabilities will allow scientists to ask new classes of questions about network datasets, and lead to insights about a wide range of pressing topics. To meet this goal, we will ground the design of MultiNet in four deeply collaborative case studies with domain scientists in biology, neuroscience, sociology, and geology.

Visualizing Robust Features in Vector and Tensor Fields

Bei Wang
Vector and tensor fields provide a powerful language to describe physical phenomena in many scientific applications. In atmospheric sciences, vectors are used to represent air movements with speed and directions and to capture typical and atypical atmospheric conditions. In materials science, stress and strain tensors are used to specify the behaviors of material bodies experiencing deformations and to facilitate the study of material strength. The main objective of this project is to define and quantify robust features in vector and tensor fields and to derive scientifically meaningful visualization for knowledge discovery. Robust features are objects, structures, or regions of interest that are stable under small perturbations of the data that arise from measurement noise, numerical instability or simulation uncertainty. Robust features are defined and evaluated via close collaborations with domain scientists to help them discriminate spurious from essential structures in the data. In materials science, the extraction of robust features in stress tensor fields will help the materials scientists better characterize and predict 3D cracking for manufacturing stronger materials. In neuroscience, quantifying the robustness of degenerate elements in brain imaging will offer new metrics and visualization in characterizing tissue microstructure for disease diagnostics. In bioengineering, robust vortex extraction and tracking of 3D conduction velocity fields in the heart will help bioengineers develop new metrics that detect and characterize ischemic stress associated with a heart attack. In atmospheric sciences, extracting and visualizing robust features in wind data will help the atmospheric scientists establish situation awareness of hazardous weather conditions such as wildfires and to provide wildfire weather forecasting and resource planning for firefighting personnel. 
This project will also provide a unique environment for multidisciplinary activities and training opportunities for students in integrating visualization with scientific applications.

This project will establish a new approach to feature-based visualization with three interconnected aims. First, it will derive novel mathematical formulations of robust features for vector and tensor fields and their ensembles. Second, it will develop new robustness-driven algorithms in feature extraction, tracking, simplification, visual representation, and uncertainty visualization. Third, it will apply and evaluate the proposed framework via close collaborations with scientists in four high-impact application areas: materials science, neuroscience, bioengineering, and atmospheric sciences. Using simulated micro-mechanical fields in an uncracked polycrystal, the project will integrate robust features with visualization to improve the interpretability of micro-mechanical fields and the prediction of fatigue-failure surfaces. Using diffusion tensor imaging (DTI) from the Human Connectome Project, the project will investigate quantifiable characteristics of crossing fibers as part of a long-term goal for deep brain stimulator placement. Using 3D conduction velocity generated in volumes of swine and canine tissues, the project will generate feature-based signatures from vortex stability and evolution and use them, in the long term, for disease diagnostics and medical intervention. Using ensemble datasets generated from the High-Resolution Rapid Refresh Model (HRRR), the project will use robust features in the visualization and statistical analysis of atmospheric models to identify atypical atmospheric conditions for wildfire weather assessment. The research results will be instantiated by a collection of research papers and open-source software tools targeting the communities of collaborating scientists and the large research community. These software tools will be made available via GitHub under MIT or BSD licenses.

EAGER: Understanding and Mitigating Misinformation in Visualizations on Social Media

Alexander Lex
In a time of crisis, such as during a hurricane or a global pandemic, social media is an important source of information for the general population. In these scenarios, data visualizations are often used to convey information that is critical for decision making by individuals. For example, a visualization of the path of a hurricane can inform the affected population about the need to prepare or evacuate, while a visualization about the prevalence of a disease in a certain area can inform personal choices, such as limiting interactions with others during a relevant time period. Visualizations, however, can be flawed, which can lead to misinterpretation of the data, and, in a crisis, lead to decisions with negative consequences. This project seeks to identify aspects of visualizations that make them widely shared, identify flaws a visualization might have, and warn social media users about them. Ultimately, this project can lead to better responses to a crisis by the general population, and contribute to improving visualization literacy. Finally, this project will also enable the training of two graduate students, provide opportunities for undergraduate research, and curate material that can be leveraged by educators teaching about visualization design.

These goals will be achieved by applying existing and novel methods, such as topic modeling and calculating measures of social attention, to three large datasets of social media posts related to recent crises. Using a qualitative coding approach, a taxonomy of design problems will be developed. This taxonomy will be used to label a large dataset. Finally, a prototype intervention will be developed in the form of a plug-in that warns of problematic visualizations and also enables users to classify problems with visualizations they encounter. The dataset and the annotations compiled in the course of this project will be shared publicly. The software created will be released under a permissive, non-viral open source license.

FluoRender: Visualization-Based and Interactive Analysis for Multi-Channel Microscopy Data

Chuck Hansen
FluoRender is a software package for visualizing and analyzing 3D and 4D (3D over time) fluorescence microscopy data. This project will serve the needs of biologists utilizing confocal microscopy for understanding cell development in many organisms and addresses the big-data problem from the massive increase of imaging data from modern high-resolution fluorescence microscopes.

Specific Aim 1: Visualization of an extended number of volume channels: FluoRender will be enhanced with the multichannel visualization capability by simultaneously supporting several tens to hundreds of channels, which can be acquired from multispectral imaging devices or by registering data of multiple scans. FluoRender will take advantage of the latest volume rendering techniques to visualize significantly improved signal intensity detail compared to pseudo-surfaces.

Specific Aim 2: Interactive comparison and organization of volume channels: A package of measures will be implemented in FluoRender for directly comparing volume channels. Leveraging the OpenCL programming interface, shape comparisons will be performed interactively on graphics hardware, allowing compound measures for complex morphology as well as immediate visual feedback via multichannel visualization. Interactive comparison will further enable the development of functions for semiautomatic channel organization and multichannel colocalization analysis.

Specific Aim 3: 4D tracking of structures with irregular and changing shapes: Tracking irregularly shaped and shape-changing structures will substantially expand FluoRender's application for developmental and morphological studies of intracellular organelles, cells, and tissues. This will include a comprehensive tracking system that integrates different modules and allows them to work in an iterative and integrated environment, allowing user-guided, progressive refinement of the segmentation and tracking results.

Specific Aim 4: Fully hardware-accelerated and customizable computing modules: FluoRender will be restructured using compute modules based on the OpenCL standard, which provides not only hardware-accelerated execution speed, but also convenience for customization and reuse. Computing modules will be integrated with visualization features, enabling interactive and visualization-centered analysis. Users will also be able to reorganize and build modules to customize specific workflows for greater adaptability.
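To make the channel-comparison idea in Specific Aim 2 more concrete, here is a minimal sketch of one widely used colocalization measure, the Pearson correlation of voxel intensities between two channels. It is not FluoRender's implementation: it runs on the CPU in plain Python rather than through OpenCL, the function name `pearson_colocalization` is hypothetical, and each channel is assumed to be flattened into a list of intensities.

```python
from math import sqrt
from typing import Sequence


def pearson_colocalization(channel_a: Sequence[float],
                           channel_b: Sequence[float]) -> float:
    """Pearson correlation of voxel intensities from two flattened channels."""
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and the same length")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant channel has no defined correlation")
    return cov / sqrt(var_a * var_b)


red = [0.0, 1.0, 2.0, 3.0]
green = [0.0, 2.0, 4.0, 6.0]     # same structure, twice as bright
inverted = [3.0, 2.0, 1.0, 0.0]  # bright where the red channel is dark

same = pearson_colocalization(red, green)
opposite = pearson_colocalization(red, inverted)
```

Values near 1 indicate that the two channels are bright in the same voxels, and values near -1 indicate that they are mutually exclusive. The measure ignores overall brightness, which is why `green` still correlates perfectly with `red`.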

CPS: Synergy: A Layered Framework of Sensors, Models, Land-Use Information and Citizens for Understanding Air Quality in Urban Environments

Miriah Meyer, Ross Whitaker, Kerry Kelly, Pierre-Emmanuel Gaillardon
Poor air quality has been linked not just to adverse health effects such as increased incidence of cardiac arrhythmia, lung cancer, heart disease, and mortality, but also to the vitality of a region’s economy. These issues are particularly important in cities such as Salt Lake City (SLC), where topography, climate, and urban expansion combine to create some of the worst air quality episodes in the country. Cities like SLC currently rely on small numbers of expensive sensors placed across a large geographic area to measure air quality, making local, neighborhood-level measurements impossible to determine. Meanwhile, new commodity technologies are leading to fine-grained, community-based strategies for measuring and communicating air quality. Leveraging both of these approaches, this project will develop and deploy a dense, distributed, and dynamic air quality cyber-physical framework -- focusing on fine particulate matter and using SLC as an urban testbed -- to produce neighborhood-level estimates of air quality. The framework includes a network of low-cost sensors, hosted and maintained through a citizen science effort and maker-kit approach.

This research will result in novel developments in three areas: (i) sensor development that focuses on dramatically reducing cost and a movement toward cheap, wearable, passive sensors; (ii) computational modeling that combines heterogeneous sensor measurements with information about weather, topography, and land use patterns; and (iii) visualization interface design that communicates air quality estimates over space and time, coupled with related uncertainty measurements. Each of these areas requires a multidisciplinary approach that integrates existing and novel insights about sensor networks, computational modeling, and sense-making of data, as well as leveraging an engaged and connected community of residents through citizen science.

SBIR Phase II Immediate Delivery of Massive Aerial Imagery to Farmers and Crop Consultants

Valerio Pascucci, Amy Gooch
This Small Business Innovation Research (SBIR) Phase II project will accelerate the adoption of data-intensive precision agriculture, increasing yields while decreasing farm inputs such as fertilizers and pesticides. This project removes the software bottleneck (time and labor) in processing large aerial surveys taken by Unmanned Aerial Systems, enabling a cost-effective and timely process to deliver actionable information to farmers. Using frequent high-quality aerial scans, farmers may optimize the use of fertilizers and more finely control the amount of pesticides and herbicides necessary to increase crop yield. Furthermore, farmers mitigate costs and losses by being able to spot problem areas, minimize the spread of plant diseases, and identify issues such as standing water, irrigation malfunctions, and persistent automated machinery errors in planting or cultivation. This project provides special benefit for rural customers with inadequate internet infrastructure by eliminating the need to upload massive imagery to the cloud for processing. The technology is part of a broad initiative in agriculture addressing the need for large increases in food production by 2050 in response to the projected growth of the world’s population to over 9 billion people.

This project will continue development of algorithms for on-the-fly orthorectification, stitching, and normalization of aerial image mosaics and their deployment in an easy-to-use software prototype. Phase I already demonstrated industry-leading speeds for such image processing. The technology behind this research project is designed from the ground up to process massive data with less memory and increased speed relative to other approaches, enabled by a proprietary streaming image representation that allows multichannel gigapixel and terapixel images to be treated as ordinary images. Phase II supports new extensions to the software that simplify and accelerate delivering a stitched and analyzed map, such as prioritizing computation in regions of the image that a customer is exploring. This would effectively eliminate the delay between acquiring an image on an unmanned aerial vehicle and being able to use it. Crop consultants have identified this as a transformative capability, as it enables ground-truthing information derived from aerial imagery in the same field visit, saving time and labor. The performance gains in compute-limited environments supported by this project are a key link between new capabilities to gather information and a farmer’s ability to utilize it to increase productivity while reducing costs.
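The idea of prioritizing computation in the region a customer is exploring can be sketched with a simple priority queue over image tiles. The project's streaming image representation is proprietary and is not reproduced here. The tiling scheme, the `tiles_by_priority` function, and the squared-distance priority are all assumptions made for illustration.

```python
import heapq
from typing import Iterator, List, Tuple

Tile = Tuple[int, int]  # (column, row) index of a tile in the mosaic


def tiles_by_priority(cols: int, rows: int, focus: Tile) -> Iterator[Tile]:
    """Yield every tile of a cols x rows mosaic, nearest to `focus` first."""
    heap: List[Tuple[int, Tile]] = []
    for c in range(cols):
        for r in range(rows):
            # Squared distance to the tile the user is currently viewing.
            dist = (c - focus[0]) ** 2 + (r - focus[1]) ** 2
            heapq.heappush(heap, (dist, (c, r)))
    while heap:
        _, tile = heapq.heappop(heap)
        yield tile


# The user is looking at the bottom-right corner of a 3 x 3 mosaic.
order = list(tiles_by_priority(cols=3, rows=3, focus=(2, 2)))
```

A processing loop that consumes `order` would stitch and analyze the viewed tile first and the far corner last, so the user sees results for the area of interest before the whole mosaic is finished.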

Topology-Preserving Data Sketching for Scientific Visualization

Bei Wang
We are experiencing an information overload from streams of data that arise from scientific instruments and simulations. For example, materials scientists use molecular dynamics (MD) simulations to study how fluids (such as gas, oil, and water) interact with heterogeneous porous solids (such as ceramics, cement, and rock) to improve transport phenomena within porous materials, which play critical roles in our energy sector. Such simulations generate large, time-varying, and complex forms of data under different physical and chemical conditions. Keeping track of interesting phenomena and applying appropriate actions (such as storage, analysis, and visualization) while the simulation is running is necessary but challenging. To address this challenge, the goal is no longer to capture and store observations or simulations in detail, but rather to process data efficiently and approximately in order to create a summary - a sketch - which allows queries over large volumes of data to be answered quickly.

The objective of this research is to conduct a systematic study of topology-preserving data sketching techniques to improve visual exploration and understanding of large scientific data. The project will employ topological sketches, that is, compressed representations of the full data that preserve their important structural properties, to support analysis and visualization as the data are generated. Our proposed solution transforms data sketching ideas from statistics, geometry, and linear algebra to develop new topological sketches of complex data. Such sketches will exploit the high spatial resolution and temporal fidelity of in situ data in an intelligent and scalable way. They will reduce data in situ while preserving its structural properties, and subsequently support interactive data exploration. In addition, topological triggers will be integrated into an adaptive workflow to support anomaly detection, computational steering, and decision optimization. The multidisciplinary nature of the proposed work will be broadly applicable in many scientific areas, including applications in computational fluid dynamics and materials science.
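One way to picture a compressed representation that preserves important structural properties is to summarize a 1D signal by its most persistent minima. The sketch below is not the project's method. It is a textbook computation of 0-dimensional sublevel-set persistence using union-find, chosen as a stand-in, and the function name `persistence_pairs` and the threshold of 1.5 are illustrative assumptions.

```python
from typing import List, Tuple


def persistence_pairs(values: List[float]) -> List[Tuple[float, float]]:
    """(birth, death) pairs of sublevel-set components of a 1D signal.

    The component born at the global minimum never dies and is not reported.
    """
    n = len(values)
    parent = list(range(n))
    birth = [0.0] * n  # birth value, read at each component's root
    added = [False] * n
    pairs: List[Tuple[float, float]] = []

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Sweep the samples from lowest to highest value.
    for i in sorted(range(n), key=lambda k: values[k]):
        added[i] = True
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and added[j]:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the younger component (larger birth) dies here.
                old, young = (a, b) if birth[a] <= birth[b] else (b, a)
                if birth[young] < values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    return pairs


signal = [0.0, 3.0, 1.0, 3.1, 2.0, 5.0, 0.5]
pairs = persistence_pairs(signal)

# Keep only features that persist for at least 1.5 units: the "sketch".
sketch = [(b, d) for (b, d) in pairs if d - b >= 1.5]
```

The shallow dip between the neighboring peaks at 3.1 and 5.0 is discarded as noise, while the two deeper minima survive. A streaming variant would need to update such a summary incrementally, which this batch sketch does not attempt.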

Novel 3D Experiments and Simulations Combined with Genetic Optimization for Accelerated Design of Metallic Foams

Valerio Pascucci
Open-cell metallic foams are an exciting class of structural materials that comprise a network of interconnected metallic ligaments, resulting in an interesting foam architecture. These low-density materials have garnered much attention over the past two decades based on their recognized potential for use in multi-functional applications. For example, in addition to serving as light-weight, load-bearing structures, open-cell metallic foams have the potential to serve concurrently as electrodes for energy-storage devices, as hosts for newly generated bone and blood vessels in biomedical implants, or as impact absorbers and noise insulators for advanced high-speed ground transportation. Despite their potential, the widespread deployment of open-cell metallic foams for a broader range of multi-functional applications remains hampered by inefficient, trial-and-error manufacturing approaches. This Designing Materials to Revolutionize and Engineer our Future (DMREF) Grant Opportunities for Academic Liaison with Industry (GOALI) award supports a joint academic-industry research effort to enable more efficient and intelligent design of open-cell metallic foams, and to achieve precise control over their performance for targeted applications. The results will provide dramatic improvements for the industry by increasing both the manufacturing efficiency and the tailorability of the foams, which will help to expand deployment of the foams throughout the energy, defense, biomedical, aerospace, and automotive industries. The research team will host outreach activities to expose students in K-12, undergraduate, and graduate school to this multi-disciplinary STEM research.

This DMREF GOALI award supports research to enable an accelerated and performance-based design paradigm for open-cell metallic foams through the integration of emergent methods in 3D materials characterization with multi-scale modeling and Bayesian optimization. The new design paradigm will be made possible through the discovery of process-structure-property relationships in the foams. The specific objectives include: experimentally modifying manufacturing parameters to produce variants of open-cell metallic foams; performing 3D synchrotron-based crystal-orientation measurements and in-situ X-ray computed tomography experiments to gain unprecedented insight into the hierarchical structure and multi-scale deformation mechanisms of the foam; using high-fidelity, multi-scale (grain-to-continuum) finite-element modeling to investigate micromechanical behavior and predict performance of the as-manufactured foams; conducting virtual tests on synthetic-foam variants to further populate a metallic-foam design space; and using Bayesian optimization on the simulation-based results to enable selection of optimal hierarchical structures (i.e. topology and crystallography) for targeted performance metrics. The research will be a first to decouple the effects of ligament topology and underlying crystal structure on micromechanical behavior of open-cell metallic foams (including microbuckling, local accumulation of slip, and distribution of crack-nucleation sites), which is postulated to influence its performance.

A Scalable Framework for Visual Exploration and Hypotheses Extraction of Phenomics Data

Bei Wang
Understanding how gene by environment interactions result in specific phenotypes is a core goal of modern biology and has real-world impacts on such things as crop management. Developing and managing successful crop practices is a goal that is fundamentally tied to our national food security. By applying novel computational visual analytical methods, this project seeks to identify and unravel the complex web of interactions linking genotypes, environments and phenotypes. These methods will first need to be designed and developed into usable software applications that can handle large volumes of crop phenomics data. High-throughput sensing technologies collect large volumes of field data for many plant traits, such as flowering time, related to crop development and production. The maize cultivars used here come from multiple genotypes that have been grown under a variety of environmental conditions, in order to give the widest range of conditions for understanding the interactions. The resulting data sets are growing quickly, both in size and complexity, but the analytical tools needed to extract knowledge and catalyze scientific discoveries have significantly lagged behind. The methodologies to be developed in this project represent a systematic attempt at bridging this rapidly widening divide. The project is inherently interdisciplinary, involving close research partnerships among computer scientists, plant scientists, and mathematicians. 
The research outcomes will be tightly integrated with education using a multipronged approach that includes, among others, postdoctoral and student training (graduates and undergraduates), curriculum development for a new campus-wide interdisciplinary undergraduate degree in Data Analytics, conference tutorials for training phenomics data practitioners, and contribution to the recruitment and retention of underrepresented minorities (particularly women) in STEM fields through the Pacific Northwest Louis Stokes Alliance for Minority Participation.

This project will lead to the design and development of a new, scalable, visual analytics platform suitable for hypothesis extraction and refinement from complex phenomics data sets. Focus on hypothesis extraction is critical in the context of phenomics data sets because much of the high-throughput sensing data being generated in crop fields are generated in the absence of specifically formulated hypotheses. Extracting plausible hypotheses from the data represents an important but tedious task. To this end, this project will apply and develop new capabilities using emerging advanced algorithmic principles, particularly from the branch of mathematics called algebraic topology that studies shapes and structure of complex data. The research objectives are three-fold. First, the project will employ and extend emerging algorithmic techniques from algebraic topology to decode the structure of large, complex phenomics data. Second, an interactive visual analytic platform will be developed to facilitate knowledge discovery using the extracted topological structures. Lastly, the quality and validity of a new visual analytic platform designed by this team will be tested using real-world maize data sets as well as simulated inputs as testbeds. The developed framework will encode functions for scientists to delineate hypotheses of three kinds: i) genetic characterization of single complex traits; ii) genetic characterization of multiple traits that share potentially pleiotropic effects; and iii) decoding and detailed characterization of genotype-by-environmental interactions, in particular, through a collaborative pilot study of maize flowering and growth traits. The expected significance of the proposed work is that biologists will be able to extract different types of testable hypotheses from plant phenomics data sets by employing a new class of visual analytic tools, and thus obtain a deeper understanding of the interactions among genotypes, environments and phenotypes. 
The project is potentially transformative in two ways: i) it will introduce advanced mathematical and computational principles into mainstream phenomic data analysis; and ii) it will usher in a new era where biologists spearhead data-driven hypothesis extraction and discovery with the aid of interactive, informative, and intuitive tools. The project will have a direct impact on the state of software in phenomics for fundamental data-driven discovery. To facilitate broader community adoption, the project will integrate the tools into the CyVerse Institute and a community phenomics software outlet. It will also lead to the development of automated scientific workflows. Project website: http://tdaphenomics.eecs.wsu.edu/.

COVID - RAPID: Building a Visual Consensus Model of the SARS-CoV-2 Life Cycle

Janet Iwasa, Miriah Meyer
The COVID-19 epidemic has motivated hundreds (if not thousands) of biological researchers around the globe to redirect their research efforts towards the understanding of SARS-CoV-2. This is leading to an explosion of data, and it will be essential to find ways to rapidly digest and integrate new information into a context that facilitates consensus building in the research community. How do researchers and the broader community stay abreast of this flood of information? And how can we quickly move towards building a consensus model of the SARS-CoV-2 life cycle that builds on this explosive body of scientific data and expertise? This work proposes to take a novel and intuitive approach to facilitate scientific discourse and dissemination through the development of: (1) detailed molecular 3-D depictions that put a diverse dataset into the context of the SARS-CoV-2 life cycle; and (2) annotation tools that researchers can use to explore and capture scientific discussions, speeding up consensus building and promoting a mechanistic understanding of how this virus works. If successful, the work will reduce the time of consensus building from years to months. In addition, a graduate student and postdoc will receive training at the intersection of biological and computer sciences.

Specifically, researchers will work with an international group of SARS-CoV-2 experts to develop detailed and accurate visualizations of all stages of the viral life cycle including cellular entry, RNA replication and transcription, and viral assembly and egress with known energy states, rates, and spatial accuracy. These 3-D visualizations, which will be made freely available online, will be used to stimulate discussions within the scientific community, and will be iteratively updated based on community feedback and new data. To facilitate consensus building, annotation tools will be developed to interactively describe the data used to generate the visualizations and will also mediate and capture scientific discourse surrounding the various molecular mechanisms involved in viral infection. This project will rapidly produce a rich and publicly accessible collection of knowledge about SARS-CoV-2 biology for the global community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

OpenSpace: An Engine for Dynamic Visualization of Earth and Space Science for Informal Education and Beyond

Chuck Hansen
The American Museum of Natural History (AMNH), in collaboration with informal science institutions (ISI), NASA mission teams and Subject Matter Experts (SME), and academic partners, seeks support for a five-year project to enable STEM education and improve U.S. scientific literacy by engaging a broad spectrum of the American public and STEM learners in cutting-edge NASA science and engineering content.

This project will develop an open source software, called OpenSpace, for visualizing NASA astrophysics, heliophysics, planetary science, and Earth science mission engineering activities and science results for the general public, students, teachers, and citizen scientists everywhere. The project will develop and widely disseminate OpenSpace; create innovative and networked programs with ISI partners; produce educational resources for middle and high school teachers and students; and establish robust partnerships with NASA SMD missions, ISIs, and visualization research centers.

The project is based on the success of pilot efforts to visualize the New Horizons mission and heliophysics and space weather simulation data generated by NASA Goddard’s Community Coordinated Modeling Center. It builds on AMNH’s expertise in science visualization and its record of success in partnering with NASA to develop innovative programming, exhibitions, and Space Shows that engage, inspire, and educate students, teachers, and learners of all ages.

Drawing together a highly qualified and exceptionally talented team of scientists, educators, software engineers, and visualization specialists, the project’s aim is to build a pipeline for transmitting visualized science content from across NASA SMD divisions to ISIs, secondary school classrooms, and the public.

To do so, the project proposes the following objectives:

Develop OpenSpace into a robust and flexible interactive visualization software that supports the presentation of dynamic data sets and that is easily updated for the presentation of current science.
Form a network of ISIs to inform the development of OpenSpace and develop associated programming to engage and educate diverse audiences.
Disseminate OpenSpace via the web to individual users, including teachers as a key audience, with resources for leveraging it as an educational tool.

Project outcomes include:

The establishment of a pipeline connecting NASA SMD content and SMEs with ISIs, secondary school classrooms, and the public.
The development of a new and powerful educational tool for the visualization of a wide range of NASA SMD mission activities and data products.
Enhanced understanding and engagement in STEM among youth, informal and formal educators, and the general public.

Project objectives, activities, and outcomes are closely aligned with, and aim to fulfill, the SMD science education objectives of enabling STEM education, improving U.S. scientific literacy, and advancing national education goals of increasing and sustaining youth and public engagement in STEM and leveraging efforts through partnerships.

Because OpenSpace will be open source, it will be freely accessible to users. It is designed to be compatible with multi-video channel cluster operations for high-resolution wall displays and planetarium domes, as well as for single-channel polar rendering fisheye projections and flat screens, in 2D and 3D. A WebGL version will make it possible for anyone with Internet access to explore OpenSpace. Another core design principle of this project is the ability to network across the Internet to synchronize displays in different locations, creating opportunities for shared experiences of high profile NASA content, including live events. This open source project will have a life far beyond the award period, as it will provide science and education communities access to the source code to modify, enhance, and extend its functionality to best serve audiences in the future.

Extracting the Full Information Content of Astrophysical Data Cubes

Bei Wang
An IFU (Integral Field Unit Spectrometer) allows one to take a high-resolution spectrum at multiple physical locations within an external target. The signal from an astronomical target is distributed into a large number of spaxels (spatial pixels), each with noise from the sky and detectors, and a greatly varying signal-to-noise ratio across the bundle. The IFU bundle technique gives rise to 3-dimensional astrophysical data cubes (two spatial directions and one frequency direction) that require advanced analysis techniques to extract their salient features. In many cases the complex kinematic structure of features of interest further complicates the problem. Furthermore, it is intrinsically difficult to visualize such data, and common analysis techniques often involve slicing the data cube along a particular axis, either at a fixed frequency or a fixed spatial location.
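The two kinds of slices mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the axis order `cube[frequency][y][x]`, the synthetic values, and the helper names `make_cube`, `channel_map`, and `spectrum` are assumptions made for the example, not part of any MaNGA or ALMA pipeline.

```python
def make_cube(n_freq, n_y, n_x):
    """Build a small synthetic cube indexed as cube[frequency][y][x] (assumed layout)."""
    return [[[float(f * 100 + y * 10 + x) for x in range(n_x)]
             for y in range(n_y)]
            for f in range(n_freq)]


def channel_map(cube, f):
    """Slice at a fixed frequency: returns a 2D image (list of rows)."""
    return [row[:] for row in cube[f]]


def spectrum(cube, y, x):
    """Slice at a fixed spatial location: returns a 1D spectrum over frequency."""
    return [plane[y][x] for plane in cube]


cube = make_cube(n_freq=3, n_y=2, n_x=4)
image_at_channel_1 = channel_map(cube, 1)   # 2 rows by 4 columns
spaxel_spectrum = spectrum(cube, 1, 2)      # one value per frequency channel
```

Either slice discards one or two of the three axes, which is why features with complex kinematic structure can be hard to see this way.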

A common type of data from the IFU bundle technique is the Mapping Nearby Galaxies at APO (MaNGA) survey, which is part of the Sloan Digital Sky Survey IV (SDSS-IV). PI Phillips and PI Rosen have been working to analyze similar data cubes taken at radio frequencies with the ALMA telescope in Chile (see http://alma-tda.cspaul.com). They have been using sophisticated mathematical techniques known as topological data analysis, in particular the contour tree, in order to extract features and remove noise for visualizing data cubes very similar to the ones that arise from IFUs.

Objective
We would like to apply advanced data analysis and visualization techniques, in particular, those from topological data analysis, to data observed at UV, optical and infrared wavelengths, in order to extract features that are currently inaccessible. In particular, we would like to start by studying the SDSS-IV MaNGA dataset, to which Carnegie Institution for Science and the University of Utah (where the MaNGA reduction and analysis pipelines are run via the Center for High Performance Computing) have full access as Institutional members (the SDSS Data Scientist, Prof. Joel Brownstein of the University of Utah, is a PI on this project).

Furthermore, we will explore the applicability of such techniques to other similar datasets that have been acquired using other IFU facilities.

Topological Analysis for Energetic Materials Characterization

Valerio Pascucci
This statement of work supports ongoing efforts towards improved analysis of characterization and surveillance data of energetic materials. The goals are to: 1) use topological segmentations to analyze microstructural changes under aging; 2) explore extending the analysis tools to characterize fine-prill materials; 3) develop techniques to quantify permeable surface area of a lower-density system; and 4) extract age-trendable features from 2D surface profile data.

Tasks

1. Analyze microstructural changes under aging, at various aging points (in time-temperature space):

Determine matching scales and simplification levels to create best matching segmentations for each dataset
Develop techniques to affinely align pre- & post-aged data sets for maximal correspondence
Use per-grain matching to analyze material changes over time

2. Explore extending the analysis tools to the characterization of fine-prill materials:

In previous years the Utah technology could successfully analyze X-ray CT data for coarser-prill HE materials. Explore the effectiveness of such technology in performing similar analysis on X-ray CT data for fine-prill systems.

3. Develop techniques to quantify permeable surface area of lower-density systems:

The topological segmentation theory could be used to quantify the permeable surface area of lower-density (e.g., porous-powder) systems, and to compute the gas-flow rate through such a specimen under a given pressure-gradient. CONTINGENCY: Availability of high-quality micro-CT data.

4. Extract age-trendable features from surface profilometry data:

Analyze 2D height-map data from pellet surfaces (measured using a surface profilometer) and devise quantitative features that can be used to track age-related changes in material morphology and performance.

Advanced Visualization of Silent Error Propagation in HPC Applications

Valerio Pascucci
High Performance Computing (HPC) systems contain increasingly large numbers of components. This trend, combined with practical limitations on component reliability, makes HPC systems vulnerable to a wide range of faults. These faults degrade system efficiency and even threaten the correctness of application results. The problem is expected to grow even more significant for Exascale systems. Designing resilient software to run efficiently on such hardware is challenging, and uncertainty about how failures affect programs only complicates the problem.

Disruptions to the micro-architectural state of hardware components (e.g., caches, reorder buffers, or pipeline registers) may cause these components to crash or compute erroneous results. These errors then propagate through layers of the software stack, including the runtime system, support libraries, and application logic. Local memory access to erroneous results can easily propagate the effects of errors across cores, and remote memory access on modern networks propagates errors across nodes. The reordered memory accesses in use by memory systems introduce further difficulties by obscuring the consistency (ordering) of memory accesses when errors occur. Identifying the propagation of errors through space and time and quantifying it in terms developers can understand is a major problem for error recovery schemes. This is especially true for scientific applications that rely on complex physical or numerical invariants and for resilience techniques that need to identify consistent states.

The ultimate goal of this research is to provide a visualization of the propagation of errors through application and system software in order to identify for application developers the vulnerability of their data structures and code regions to different types of errors, and the way these errors propagate through application state and logic.

VisStore: Seamless Acquisition, Storage, and Distribution of Massive Imagery

Ease of Use and Deployment for a Fast, Scalable Data Movement Infrastructure

Publications in Visualization:

Accelerated Probabilistic Marching Cubes by Deep Learning for Time-Varying Scalar Ensembles
M. Han, T.M. Athawale, D. Pugmire, C.R. Johnson. In 2022 IEEE Visualization and Visual Analytics (VIS), IEEE, pp. 155-159. 2022.
DOI: 10.1109/VIS54862.2022.00040

Visualizing the uncertainty of ensemble simulations is challenging due to the large size and multivariate and temporal features of ensemble data sets. One popular approach to studying the uncertainty of ensembles is analyzing the positional uncertainty of the level sets. Probabilistic marching cubes is a technique that performs Monte Carlo sampling of multivariate Gaussian noise distributions for positional uncertainty visualization of level sets. However, the technique suffers from high computational time, making interactive visualization and analysis impossible to achieve. This paper introduces a deep-learning-based approach to learning the level-set uncertainty for two-dimensional ensemble data with a multivariate Gaussian noise assumption. We train the model using the first few time steps from time-varying ensemble data in our workflow. We demonstrate that our trained model accurately infers uncertainty in level sets for new time steps and is up to 170X faster than that of the original probabilistic model with serial computation and 10X faster than that of the original parallel computation.
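To make the Monte Carlo step mentioned in this abstract concrete, here is a minimal sketch for a single two-dimensional grid cell. It is not the authors' implementation and does not include their deep-learning model. It assumes the four corner values of the cell follow a multivariate Gaussian with a known mean and covariance, and estimates the probability that the level set crosses the cell. The function names and the example numbers are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L * L^T == cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def crossing_probability(mean, cov, isovalue, n_samples=2000, seed=0):
    """Monte Carlo estimate of the probability that the isocontour crosses the cell.

    Each sample draws correlated corner values from N(mean, cov). The cell is
    crossed when some corners lie above the isovalue and others do not.
    """
    rng = random.Random(seed)
    L = cholesky(cov)
    n = len(mean)
    crossed = 0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        corners = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        above = [v > isovalue for v in corners]
        if any(above) and not all(above):
            crossed += 1
    return crossed / n_samples


# Hypothetical cell: two corners near 0, two near 1, with correlated noise.
example_mean = [0.0, 0.0, 1.0, 1.0]
example_cov = [[0.01 if i == j else 0.005 for j in range(4)] for i in range(4)]
p_cross = crossing_probability(example_mean, example_cov, isovalue=0.5)
```

Repeating this estimate for every cell and every time step is what makes the sampling approach expensive, which is the cost the paper's learned model aims to avoid.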

Adaptive elasticity policies for staging-based in situ visualization
Z. Wang, M. Dorier, P. Subedi, P.E. Davis, M. Parashar. In Future Generation Computer Systems, 2022.
ISSN: 0167-739X
DOI: https://doi.org/10.1016/j.future.2022.12.010

In situ processing aims to alleviate the growing gap between computation and I/O capabilities by performing data processing close to the data source. In situ processing is widely used to process data generated by multiple data sources, including observation data from edge devices or scientific observational facilities and the simulation data generated by scientific computation on a high-performance computing (HPC) platform. For a scientific workflow that is run on an HPC platform and composed of a simulation program and an in situ data analytics or visualization (abbreviated as ana/vis) task, there is an implicit assumption that the computing resources assigned to the workflow keep static during the workflow execution. However, with the converging trend between the HPC and cloud computing platform, running the in situ ana/vis task in an elastic way is promising to decrease its overhead and improve its resource utilization rate. Resource elasticity represents the ability to change resource configurations such as the number of computing nodes/processes during workflow execution. An elastic job may dynamically adjust resource configurations; it may use a few resources at the beginning and more resources toward the end of the job when interesting data appear. However, it is hard to predict a priori how many computing nodes/processes need to be added/removed during the workflow execution to adapt to changing workflow needs. How to efficiently guide elasticity operations, such as growing or shrinking the number of processes used for in situ analysis during workflow execution, is an open-ended research question. In this article, we present adaptive elasticity policies that adopt workflow runtime information collected during workflow execution to predict how to trigger the addition/removal of processes in order to minimize in situ processing overhead. 
Taking in situ visualization tasks as an example, we integrate the presented elasticity policies into a staging-based elastic workflow and evaluate its efficiency in multiple elasticity scenarios. Compared with the situation without elasticity or with a static elasticity policy that uses a fixed number of processes for each rescaling operation, the adaptive elasticity policy can save overhead in finding a proper resource configuration and improve resource utilization efficiency. For example, one experiment illustrates that the adaptive elasticity policy saves 41% of core-hours compared with the situation without the resource elasticity.

A Visual Comparison of Silent Error Propagation
Z. Li, H. Menon, K. Mohror, S. Liu, L. Guo, P.T. Bremer, V. Pascucci. In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2022.
DOI: 10.1109/TVCG.2022.3230636

High-performance computing (HPC) systems play a critical role in facilitating scientific discoveries. Their scale and complexity (e.g., the number of computational units and software stack) continue to grow as new systems are expected to process increasingly more data and reduce computing time. However, with more processing elements, the probability that these systems will experience a random bit-flip error that corrupts a program's output also increases, which is often recognized as silent data corruption. Analyzing the resiliency of HPC applications in extreme-scale computing to silent data corruption is crucial but difficult. An HPC application often contains a large number of computation units that need to be tested, and error propagation caused by error corruption is complex and difficult to interpret. To accommodate this challenge, we propose an interactive visualization system that helps HPC researchers understand the resiliency of HPC applications and compare their error propagation. Our system models an application's error propagation to study a program's resiliency by constructing and visualizing its fault tolerance boundary. Coordinating with multiple interactive designs, our system enables domain experts to efficiently explore the complicated spatial and temporal correlation between error propagations. At the end, the system integrated a nonmonotonic error propagation analysis with an adjustable graph propagation visualization to help domain experts examine the details of error propagation and answer such questions as why an error is mitigated or amplified by program execution.

Interactive Visualization for Data Science Scripts
R. Faust, C. Scheidegger, K. Isaacs, W.Z. Bernstein, M. Sharp, C. North. In 2022 IEEE Visualization in Data Science (VDS), IEEE, pp. 37-45. 2022.

As the field of data science continues to grow, so does the need for adequate tools to understand and debug data science scripts. Current debugging practices fall short when applied to a data science setting, due to the exploratory and iterative nature of analysis scripts. Additionally, computational notebooks, the preferred scripting environment of many data scientists, present additional challenges to understanding and debugging workflows, including the non-linear execution of code snippets. This paper presents Anteater, a trace-based visual debugging method for data science scripts. Anteater automatically traces and visualizes execution data with minimal analyst input. The visualizations illustrate execution and value behaviors that aid in understanding the results of analysis scripts. To maximize the number of workflows supported, we present prototype implementations in both Python and Jupyter. Last, to demonstrate Anteater’s support for analysis understanding tasks, we provide two usage scenarios on real world analysis scripts.

Ferret: Reviewing Tabular Datasets for Manipulation
Subtitled “OSF Preprint,” D. Lange, S. Sahai, J.M. Phillips, A. Lex. 2022.

How do we ensure the veracity of science? The act of manipulating or fabricating scientific data has led to many high-profile fraud cases and retractions. Detecting manipulated data, however, is a challenging and time-consuming endeavor. Automated detection methods are limited due to the diversity of data types and manipulation techniques. Furthermore, patterns automatically flagged as suspicious can have reasonable explanations. Instead, we propose a nuanced approach where experts analyze tabular datasets, e.g., as part of the peer-review process, using a guided, interactive visualization approach. In this paper, we present an analysis of how manipulated datasets are created and the artifacts these techniques generate. Based on these findings, we propose a suite of visualization methods to surface potential irregularities. We have implemented these methods in Ferret, a visualization tool for data forensics work. Ferret makes potential data issues salient and provides guidance on spotting signs of tampering and differentiating them from truthful data.

The Materials Commons Data Repository
G. Tarcea, B. Puchala, T. Berman, G. Scorzelli, V. Pascucci, M. Taufer, J. Allison. In 2022 IEEE 18th International Conference on e-Science (e-Science), pp. 405-406. 2022.
DOI: 10.1109/eScience55777.2022.00060

Repositories are increasingly used for publishing and sharing scientific data. The Materials Commons is a data repository that follows the FAIR (Findable, Accessible, Inter-operable, Reusable) principles. We demonstrate the challenges with FAIR and how Materials Commons solves them. We also discuss the National Science Data Fabric (NSDF) [1], a project that is democratizing data access, and show how Materials Commons with the NSDF software stack accelerates data access and scientific research.

High-Quality Progressive Alignment of Large 3D Microscopy Data
A. Venkat, D. Hoang, A. Gyulassy, P.T. Bremer, F. Federer, V. Pascucci. In 2022 IEEE 12th Symposium on Large Data Analysis and Visualization (LDAV), pp. 1-10. 2022.
DOI: 10.1109/LDAV57265.2022.9966406

Large-scale three-dimensional (3D) microscopy acquisitions frequently create terabytes of image data at high resolution and magnification. Imaging large specimens at high magnifications requires acquiring 3D overlapping image stacks as tiles arranged on a two-dimensional (2D) grid that must subsequently be aligned and fused into a single 3D volume. Due to their sheer size, aligning many overlapping gigabyte-sized 3D tiles in parallel and at full resolution is memory intensive and often I/O bound. Current techniques trade accuracy for scalability, perform alignment on subsampled images, and require additional postprocess algorithms to refine the alignment quality, usually with high computational requirements. One common solution to the memory problem is to subdivide the overlap region into smaller chunks (sub-blocks) and align the sub-block pairs in parallel, choosing the pair with the most reliable alignment to determine the global transformation. Yet aligning all sub-block pairs at full resolution remains computationally expensive. The key to quickly developing a fast, high-quality, low-memory solution is to identify a single or a small set of sub-blocks that give good alignment at full resolution without touching all the overlapping data. In this paper, we present a new iterative approach that leverages coarse resolution alignments to progressively refine and align only the promising candidates at finer resolutions, thereby aligning only a small user-defined number of sub-blocks at full resolution to determine the lowest error transformation between pairwise overlapping tiles. Our progressive approach is 2.6x faster than the state of the art, requires less than 450MB of peak RAM (per parallel thread), and offers a higher quality alignment without the need for additional postprocessing refinement steps to correct for alignment errors.

UncertainSCI: Uncertainty quantification for computational models in biomedicine and bioengineering
A. Narayan, Z. Liu, J. A. Bergquist, C. Charlebois, S. Rampersad, L. Rupp, D. Brooks, D. White, J. Tate, R. S. MacLeod. In Computers in Biology and Medicine, 2022.
DOI: https://doi.org/10.1016/j.compbiomed.2022.106407

Background:

Computational biomedical simulations frequently contain parameters that model physical features, material coefficients, and physiological effects, whose values are typically assumed known a priori. Understanding the effect of variability in those assumed values is currently a topic of great interest. A general-purpose software tool that quantifies how variation in these parameters affects model outputs is not broadly available in biomedicine. For this reason, we developed the ‘UncertainSCI’ uncertainty quantification software suite to facilitate analysis of uncertainty due to parametric variability.

Methods:

We developed and distributed a new open-source Python-based software tool, UncertainSCI, which employs advanced parameter sampling techniques to build polynomial chaos (PC) emulators that can be used to predict model outputs for general parameter values. Uncertainty of model outputs is studied by modeling parameters as random variables, and model output statistics and sensitivities are then easily computed from the emulator. Our approaches utilize modern, near-optimal techniques for sampling and PC construction based on weighted Fekete points constructed by subsampling from a suitably randomized candidate set.

Results:

Concentrating on two test cases—modeling bioelectric potentials in the heart and electric stimulation in the brain—we illustrate the use of UncertainSCI to estimate variability, statistics, and sensitivities associated with multiple parameters in these models.

Conclusion:

UncertainSCI is a powerful yet lightweight tool enabling sophisticated probing of parametric variability and uncertainty in biomedical simulations. Its non-intrusive pipeline allows users to leverage existing software libraries and suites to accurately ascertain parametric uncertainty in a variety of applications.

NSDF-Catalog: Lightweight Indexing Service for Democratizing Data Delivering
J. Luettgau, C.R. Kirkpatrick, G. Scorzelli, V. Pascucci, G. Tarcea, M. Taufer. 2022.

Across domains, massive amounts of scientific data are generated. Because of the large volume of information, data discoverability is often hard if not impossible, especially for scientists who have not generated the data or are from other domains. As part of the NSF-funded National Science Data Fabric (NSDF) initiative, we develop a testbed to demonstrate that these boundaries to data discoverability can be overcome. In support of this effort, we identify the need for indexing large amounts of scientific data across scientific domains. We propose NSDF-Catalog, a lightweight indexing service with minimal metadata that complements existing domain-specific and rich-metadata collections. NSDF-Catalog is designed to facilitate multiple related objectives within a flexible microservice to: (i) coordinate data movements and replication of data from origin repositories within the NSDF federation; (ii) build an inventory of existing scientific data to inform the design of next-generation cyberinfrastructure; and (iii) provide a suite of tools for discovery of datasets for cross-disciplinary research. Our service indexes scientific data at a fine granularity at the file or object level to inform data distribution strategies and to improve the experience for users from the consumer perspective, with the goal of allowing end-to-end dataflow optimizations.

Comparing different nonlinear dimensionality reduction techniques for data-driven unsteady fluid flow modeling
H. Csala, S.T.M. Dawson, A. Arzani. In Physics of Fluids, AIP Publishing, 2022.
DOI: https://doi.org/10.1063/5.0127284

Computational fluid dynamics (CFD) is known for producing high-dimensional spatiotemporal data. Recent advances in machine learning (ML) have introduced a myriad of techniques for extracting physical information from CFD. Identifying an optimal set of coordinates for representing the data in a low-dimensional embedding is a crucial first step toward data-driven reduced-order modeling and other ML tasks. This is usually done via principal component analysis (PCA), which gives an optimal linear approximation. However, fluid flows are often complex and have nonlinear structures, which cannot be discovered or efficiently represented by PCA. Several unsupervised ML algorithms have been developed in other branches of science for nonlinear dimensionality reduction (NDR), but have not been extensively used for fluid flows. Here, four manifold learning and two deep learning (autoencoder)-based NDR methods are investigated and compared to PCA. These are tested on two canonical fluid flow problems (laminar and turbulent) and two biomedical flows in brain aneurysms. The data reconstruction capabilities of these methods are compared, and the challenges are discussed. The temporal vs spatial arrangement of data and its influence on NDR mode extraction is investigated. Finally, the modes are qualitatively compared. The results suggest that using NDR methods would be beneficial for building more efficient reduced-order models of fluid flows. All NDR techniques resulted in smaller reconstruction errors for spatial reduction. Temporal reduction was a harder task; nevertheless, it resulted in physically interpretable modes. Our work is one of the first comprehensive comparisons of various NDR methods in unsteady flows.

Reduced Connectivity for Local Bilinear Jacobi Sets
Subtitled “arXiv:2208.07148,” D. Klötzl, T. Krake, Y. Zhou, J. Stober, K. Schulte, I. Hotz, B. Wang, D. Weiskopf. 2022.

We present a new topological connection method for the local bilinear computation of Jacobi sets that improves the visual representation while preserving the topological structure and geometric configuration. To this end, the topological structure of the local bilinear method is utilized, which is given by the nerve complex of the traditional piecewise linear method. Since the nerve complex consists of higher-dimensional simplices, the local bilinear method (visually represented by the 1-skeleton of the nerve complex) leads to clutter via crossings of line segments. Therefore, we propose a homotopy-equivalent representation that uses different collapses and edge contractions to remove such artifacts. Our new connectivity method is easy to implement, comes with only little overhead, and results in a less cluttered representation.

Local Bilinear Computation of Jacobi Sets
D. Klötzl, T. Krake, Y. Zhou, I. Hotz, B. Wang, D. Weiskopf. In The Visual Computer, 2022.

We propose a novel method for the computation of Jacobi sets in 2D domains. The Jacobi set is a topological descriptor based on Morse theory that captures gradient alignments among multiple scalar fields, which is useful for multi-field visualization. Previous Jacobi set computations use piecewise linear approximations on triangulations that result in discretization artifacts like zig-zag patterns. In this paper, we utilize a local bilinear method to obtain a more precise approximation of Jacobi sets by preserving the topology and improving the geometry. Consequently, zig-zag patterns on edges are avoided, resulting in a smoother Jacobi set representation. Our experiments show a better convergence with increasing resolution compared to the piecewise linear method. We utilize this advantage with an efficient local subdivision scheme. Finally, our approach is evaluated qualitatively and quantitatively in comparison with previous methods for different mesh resolutions and across a number of synthetic and real-world examples.

Quick Clusters: A GPU-Parallel Partitioning for Efficient Path Tracing of Unstructured Volumetric Grids
N. Morrical, A. Sahistan, U. Güdükbay, I. Wald, V. Pascucci. 2022.
DOI: 10.13140/RG.2.2.34351.20648

We propose a simple, yet effective method for clustering finite elements in order to improve preprocessing times and rendering performance of unstructured volumetric grids. Rather than building bounding volume hierarchies (BVHs) over individual elements, we sort elements along a Hilbert curve and aggregate neighboring elements together, significantly improving BVH memory consumption. Then to further reduce memory consumption, we cluster the mesh on the fly into sub-meshes with smaller indices using a series of efficient parallel mesh re-indexing operations. These clusters are then passed to a highly optimized ray tracing API for both point containment queries and ray-cluster intersection testing. Each cluster is assigned a maximum extinction value for adaptive sampling, which we rasterize into non-overlapping view-aligned bins allocated along the ray. These maximum extinction bins are then used to guide the placement of samples along the ray during visualization, significantly reducing the number of samples required and greatly improving overall visualization interactivity. Using our approach, we improve rendering performance over a competitive baseline on the NASA Mars Lander dataset by 6× (1 FPS up to 6 FPS including volumetric shadows) while simultaneously reducing memory consumption by 3× (33 GB down to 11 GB) and avoiding any offline preprocessing steps, enabling high quality interactive visualization on consumer graphics cards. By utilizing the full 48 GB of an RTX 8000, we improve performance of Lander by 17× (1 FPS up to 17 FPS), enabling new possibilities for large data exploration.
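
The idea of sorting elements along a Hilbert curve and grouping neighbors can be sketched in a few lines. This is a serial, 2D toy version under stated assumptions: the paper works on 3D unstructured grids in parallel on the GPU, whereas here element centroids are assumed to be already quantized to integer cells of an n × n grid (n a power of two), and clusters are simply fixed-size runs of the sorted order. The names `hilbert_index` and `cluster_elements` are hypothetical.

```python
def hilbert_index(n, x, y):
    """Map integer cell (x, y) in an n x n grid (n a power of two)
    to its position along a 2D Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def cluster_elements(centroids, n, cluster_size):
    """Sort element ids by Hilbert index of their (quantized) centroid and
    chop the sorted order into consecutive clusters of at most cluster_size."""
    order = sorted(range(len(centroids)),
                   key=lambda i: hilbert_index(n, centroids[i][0], centroids[i][1]))
    return [order[i:i + cluster_size] for i in range(0, len(order), cluster_size)]
```

Because consecutive positions on the Hilbert curve are spatially adjacent, each run of sorted elements tends to form a compact cluster, which is what makes one bounding volume per cluster reasonable.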

A Novel Tree Visualization to Guide Interactive Exploration of Multi-dimensional Topological Hierarchies
Subtitled “arXiv preprint arXiv:2208.06952,” Y. Livnat, D. Maljovec, A. Gyulassy, B. Mouginot, V. Pascucci. 2022.

Understanding the response of an output variable to multi-dimensional inputs lies at the heart of many data exploration endeavours. Topology-based methods, in particular Morse theory and persistent homology, provide a useful framework for studying this relationship, as phenomena of interest often appear naturally as fundamental features. The Morse-Smale complex captures a wide range of features by partitioning the domain of a scalar function into piecewise monotonic regions, while persistent homology provides a means to study these features at different scales of simplification. Previous works demonstrated how to compute such a representation and its usefulness to gain insight into multi-dimensional data. However, exploration of the multi-scale nature of the data was limited to selecting a single simplification threshold from a plot of region count. In this paper, we present a novel tree visualization that provides a concise overview of the entire hierarchy of topological features. The structure of the tree provides initial insights in terms of the distribution, size, and stability of all partitions. We use regression analysis to fit linear models in each partition, and develop local and relative measures to further assess uniqueness and the importance of each partition, especially with respect to parents/children in the feature hierarchy. The expressiveness of the tree visualization becomes apparent when we encode such measures using colors, and the layout allows an unprecedented level of control over feature selection during exploration. For instance, selecting features from multiple scales of the hierarchy enables a more nuanced exploration. Finally, we …
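
The step of fitting a linear model in each partition can be illustrated with ordinary least squares. The sketch below is deliberately simplified and makes several assumptions: the partition label of every sample is already known (in the paper it would come from the Morse-Smale complex), the input is one-dimensional, and the coefficient of determination R² stands in for a fit-quality measure. The function name and return layout are hypothetical, and the paper's own local and relative measures are not reproduced.

```python
def fit_partitions(xs, ys, labels):
    """Fit y = slope * x + intercept separately in each partition.

    Returns {label: (slope, intercept, r_squared)}.
    """
    groups = {}
    for x, y, lab in zip(xs, ys, labels):
        groups.setdefault(lab, []).append((x, y))

    models = {}
    for lab, pts in groups.items():
        n = len(pts)
        mean_x = sum(p[0] for p in pts) / n
        mean_y = sum(p[1] for p in pts) / n
        sxx = sum((p[0] - mean_x) ** 2 for p in pts)
        sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts)
        # Degenerate partition (all x equal): fall back to a constant model.
        slope = sxy / sxx if sxx > 0 else 0.0
        intercept = mean_y - slope * mean_x
        ss_res = sum((p[1] - (slope * p[0] + intercept)) ** 2 for p in pts)
        ss_tot = sum((p[1] - mean_y) ** 2 for p in pts)
        r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
        models[lab] = (slope, intercept, r2)
    return models
```

Per-partition coefficients and fit quality of this kind are the sort of quantities that could then be encoded as colors on the nodes of a tree.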

Localization supervision of chest x-ray classifiers using label-specific eye-tracking annotation
Subtitled “arXiv:2207.09771,” R. Lanfredi, J.D. Schroeder, T. Tasdizen. 2022.

Convolutional neural networks (CNNs) have been successfully applied to chest x-ray (CXR) images. Moreover, annotated bounding boxes have been shown to improve the interpretability of a CNN in terms of localizing abnormalities. However, only a few relatively small CXR datasets containing bounding boxes are available, and collecting them is very costly. Opportunely, eye-tracking (ET) data can be collected in a non-intrusive way during the clinical workflow of a radiologist. We use ET data recorded from radiologists while dictating CXR reports to train CNNs. We extract snippets from the ET data by associating them with the dictation of keywords and use them to supervise the localization of abnormalities. We show that this method improves a model's interpretability without impacting its image-level classification.

“Understanding Robustness Lottery”: A Comparative Visual Analysis of Neural Network Pruning Approaches
Subtitled “arXiv preprint arXiv:2206.07918,” Z. Li, S. Liu, X. Yu, K. Bhavya, J. Cao, J. Diffenderfer, P.T. Bremer, V. Pascucci. 2022.

Deep learning approaches have provided state-of-the-art performance in many applications by relying on extremely large and heavily overparameterized neural networks. However, such networks have been shown to be very brittle, to generalize poorly to new use cases, and to be difficult, if not impossible, to deploy on resource-limited platforms. Model pruning, i.e., reducing the size of the network, is a widely adopted strategy that can lead to a more robust and generalizable network, usually orders of magnitude smaller with the same or even improved performance. While there exist many heuristics for model pruning, our understanding of the pruning process remains limited. Empirical studies show that some heuristics improve performance while others can make models more brittle or have other side effects. This work aims to shed light on how different pruning methods alter the network's internal feature representation, and the corresponding impact on model performance. To provide a meaningful comparison and characterization of model feature space, we use three geometric metrics that are decomposed from the commonly adopted classification loss. With these metrics, we design a visualization system to highlight the impact of pruning on model prediction as well as the latent feature embedding. The proposed tool provides an environment for exploring and studying differences among pruning methods and between pruned and original models. By leveraging our visualization, ML researchers can not only identify samples that are fragile to model pruning and data corruption but also obtain insights and explanations on how some pruned …
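
To make "model pruning" concrete, here is a minimal sketch of one common heuristic, magnitude pruning, which zeroes out the weights with the smallest absolute values. It is chosen only as a familiar example; the abstract does not say which pruning methods the paper compares, and the function name and the flat list-of-weights representation are assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of weights with the smallest-magnitude fraction set to 0.

    sparsity is the fraction of weights to remove, between 0 and 1.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to zero out
    # Indices ordered from smallest to largest magnitude.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in by_magnitude[:k]:
        pruned[i] = 0.0
    return pruned
```

In practice the same mask would be applied per layer or globally across a network, usually followed by fine-tuning; those steps are omitted here.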

Scalable CPU Ray Tracing for In Situ Visualization Using OSPRay
W. Usher, J. Amstutz, J. Günther, A. Knoll, G. P. Johnson, C. Brownlee, A. Hota, B. Cherniak, T. Rowley, J. Jeffers, V. Pascucci. In In Situ Visualization for Computational Science, Springer International Publishing, pp. 353--374. 2022.
ISBN: 978-3-030-81627-8

In situ visualization increasingly involves rendering large numbers of images for post hoc exploration. As both the number of images to be rendered and the data being rendered are large, the scalability of the rendering component is of key concern. Furthermore, the renderer must be able to support a wide range of data distributions, simulation configurations, and HPC systems to provide the flexibility required for a portable, general purpose in situ rendering package. In this chapter, we discuss recent developments in OSPRay’s support for MPI-parallel applications to provide a flexible and scalable rendering API, with a focus on how these developments can be applied to enable scalable, high-quality in situ visualization.

A Review of Three-Dimensional Medical Image Visualization
L. Zhou, M. Fan, C. Hansen, C. R. Johnson, D. Weiskopf. In Health Data Science, Vol. 2022, 2022.
DOI: https://doi.org/10.34133/2022/9840519

Importance. Medical images are essential for modern medicine and an important research subject in visualization. However, medical experts are often not aware of the many advanced three-dimensional (3D) medical image visualization techniques that could increase their capabilities in data analysis and assist the decision-making process for specific medical problems. Our paper provides a review of 3D visualization techniques for medical images, intending to bridge the gap between medical experts and visualization researchers. Highlights. Fundamental visualization techniques are revisited for various medical imaging modalities, from computational tomography to diffusion tensor imaging, featuring techniques that enhance spatial perception, which is critical for medical practices. The state-of-the-art of medical visualization is reviewed based on a procedure-oriented classification of medical problems for studies of individuals and populations. This paper summarizes free software tools for different modalities of medical images designed for various purposes, including visualization, analysis, and segmentation, and it provides respective Internet links. Conclusions. Visualization techniques are a useful tool for medical experts to tackle specific medical problems in their daily work. Our review provides a quick reference to such techniques given the medical problem and modalities of associated medical images. We summarize fundamental techniques and readily available visualization tools to help medical experts to better understand and utilize medical imaging data. This paper could contribute to the joint effort of the medical and visualization communities to advance precision medicine.

Exploratory Lagrangian-Based Particle Tracing Using Deep Learning
M. Han, S. Sane, C. R. Johnson. In Journal of Flow Visualization and Image Processing, Begell, 2022.
DOI: 10.1615/JFlowVisImageProc.2022041197

Time-varying vector fields produced by computational fluid dynamics simulations are often prohibitively large and pose challenges for accurate interactive analysis and exploration. To address these challenges, reduced Lagrangian representations have been increasingly researched as a means to improve scientific time-varying vector field exploration capabilities. This paper presents a novel deep neural network-based particle tracing method to explore time-varying vector fields represented by Lagrangian flow maps. In our workflow, in situ processing is first utilized to extract Lagrangian flow maps, and deep neural networks then use the extracted data to learn flow field behavior. Using a trained model to predict new particle trajectories offers a fixed small memory footprint and fast inference. To demonstrate and evaluate the proposed method, we perform an in-depth study of performance using a well-known analytical data set, the Double Gyre. Our study considers two flow map extraction strategies, the impact of the number of training samples and integration durations on efficacy, evaluates multiple sampling options for training and testing, and informs hyperparameter settings. Overall, we find our method requires a fixed memory footprint of 10.5 MB to encode a Lagrangian representation of a time-varying vector field while maintaining accuracy. For post hoc analysis, loading the trained model costs only two seconds, significantly reducing the burden of I/O when reading data for visualization. Moreover, our parallel implementation can infer one hundred locations for each of two thousand new pathlines in 1.3 seconds using one NVIDIA Titan RTX GPU.
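
For context, the Double Gyre is a closed-form, time-dependent 2D flow on the domain [0, 2] × [0, 1], so reference pathlines can be generated by direct numerical integration. The sketch below does exactly that with a classical fourth-order Runge-Kutta scheme. It is not the neural-network tracer described above, only the kind of conventional integration such a model would be trained on or compared against. The parameter values (A = 0.1, ε = 0.25, ω = 2π/10) are commonly used defaults and are assumed here; the abstract does not state which values were used.

```python
import math

# Commonly used Double Gyre parameters (assumed, not taken from the paper).
A, EPS, OMEGA = 0.1, 0.25, 2.0 * math.pi / 10.0


def velocity(x, y, t):
    """Double Gyre velocity (u, v) at position (x, y) and time t."""
    s = EPS * math.sin(OMEGA * t)
    a, b = s, 1.0 - 2.0 * s
    f = a * x * x + b * x
    dfdx = 2.0 * a * x + b
    u = -math.pi * A * math.sin(math.pi * f) * math.cos(math.pi * y)
    v = math.pi * A * math.cos(math.pi * f) * math.sin(math.pi * y) * dfdx
    return u, v


def trace_pathline(x, y, t0, dt, steps):
    """Integrate one particle with RK4; returns steps + 1 positions."""
    path = [(x, y)]
    t = t0
    for _ in range(steps):
        k1 = velocity(x, y, t)
        k2 = velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], t + 0.5 * dt)
        k3 = velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], t + 0.5 * dt)
        k4 = velocity(x + dt * k3[0], y + dt * k3[1], t + dt)
        x += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        y += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        t += dt
        path.append((x, y))
    return path
```

Calling `trace_pathline(0.5, 0.25, 0.0, 0.1, 100)` yields 101 positions, the same count as the "one hundred locations" per pathline mentioned above plus the seed point.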

Demonstrating the viability of Lagrangian in situ reduction on supercomputers
S. Sane, C. R. Johnson, H. Childs. In Journal of Computational Science, Vol. 61, Elsevier, 2022.

Performing exploratory analysis and visualization of large-scale time-varying computational science applications is challenging due to inaccuracies that arise from under-resolved data. In recent years, Lagrangian representations of the vector field computed using in situ processing are being increasingly researched and have emerged as a potential solution to enable exploration. However, prior works have offered limited estimates of the encumbrance on the simulation code as they consider “theoretical” in situ environments. Further, the effectiveness of this approach varies based on the nature of the vector field, benefitting from an in-depth investigation for each application area. With this study, an extended version of Sane et al. (2021), we contribute an evaluation of Lagrangian analysis viability and efficacy for simulation codes executing at scale on a supercomputer. We investigated previously unexplored cosmology and seismology applications as well as conducted a performance benchmarking study by using a hydrodynamics mini-application targeting exascale computing. To inform encumbrance, we integrated in situ infrastructure with simulation codes, and evaluated Lagrangian in situ reduction in representative homogeneous and heterogeneous HPC environments. To inform post hoc accuracy, we conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. Additionally, our study contributes cost estimates for distributed-memory post hoc reconstruction. In all, we demonstrate viability for each application — data reduction to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 90% of our experiments.

