2015

J. Bennett, F. Vivodtzev, V. Pascucci (Eds.).
**“Topological and Statistical Methods for Complex Data,”** Subtitled **“Tackling Large-Scale, High-Dimensional, and Multivariate Data Spaces,”** Mathematics and Visualization, 2015.

ISBN: 978-3-662-44899-1

This book contains papers presented at the Workshop on the Analysis of Large-scale, High-Dimensional, and Multi-Variate Data Using Topology and Statistics, held in Le Barp, France, June 2013. It features the work of some of the most prominent and recognized leaders in the field who examine challenges as well as detail solutions to the analysis of extreme scale data.

The book presents new methods that leverage the mutual strengths of both topological and statistical techniques to support the management, analysis, and visualization of complex data. It covers both theory and application and provides readers with an overview of important key concepts and the latest research trends.

Coverage in the book includes multi-variate and/or high-dimensional analysis techniques, feature-based statistical methods, combinatorial algorithms, scalable statistics algorithms, scalar and vector field topology, and multi-scale representations. In addition, the book details algorithms that are broadly applicable and can be used by application scientists to glean insight from a wide range of complex data sets.

H. Bhatia, Bei Wang, G. Norgard, V. Pascucci, P. T. Bremer.
**“Local, Smooth, and Consistent Jacobi Set Simplification,”** In *Computational Geometry: Theory and Applications (CGTA)*, Vol. 48, No. 4, pp. 311-332. 2015.

The relation between two Morse functions defined on a smooth, compact, and orientable 2-manifold can be studied in terms of their Jacobi set. The Jacobi set contains points in the domain where the gradients of the two functions are aligned. Both the Jacobi set itself and the segmentation of the domain it induces have been shown to be useful in various applications. In practice, unfortunately, functions often contain noise and discretization artifacts, causing their Jacobi set to become unmanageably large and complex. Although there exist techniques to simplify Jacobi sets, they are unsuitable for most applications as they lack fine-grained control over the process and heavily restrict the type of simplifications possible.
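To make the definition above concrete, here is a minimal sketch of the alignment test, assuming the simplest possible setting: two functions on the plane rather than on a general 2-manifold. The gradients are aligned exactly when their 2D cross product vanishes. The helper names and the two sample functions are illustrative assumptions and are not taken from the paper.

```python
def gradient(func, x, y, h=1e-5):
    """Central-difference gradient of a scalar function of two variables."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy


def alignment(f, g, x, y):
    """2D cross product of grad f and grad g.

    It is zero when the gradients are parallel (or when one of them vanishes).
    """
    fx, fy = gradient(f, x, y)
    gx, gy = gradient(g, x, y)
    return fx * gy - fy * gx


def on_jacobi_set(f, g, x, y, tol=1e-6):
    """True if (x, y) lies on the Jacobi set of f and g, up to a tolerance."""
    return abs(alignment(f, g, x, y)) < tol


# Hypothetical sample functions: f(x, y) = x and g(x, y) = x^2 + y^2.
# grad f = (1, 0) and grad g = (2x, 2y), so the cross product is 2y
# and the Jacobi set is the line y = 0.
def f(x, y):
    return x


def g(x, y):
    return x * x + y * y


if __name__ == "__main__":
    print(on_jacobi_set(f, g, 0.5, 0.0))
    print(on_jacobi_set(f, g, 0.5, 0.3))
```

On sampled or noisy data the zero set of this quantity tends to fragment into many small components, which is the situation the simplification framework is meant to address.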

This paper introduces the theoretical foundations of a new simplification framework for Jacobi sets. We present a new interpretation of Jacobi set simplification based on the perspective of domain segmentation. Generalizing the cancellation of critical points from scalar functions to Jacobi sets, we focus on simplifications that can be realized by smooth approximations of the corresponding functions, and show how these cancellations imply simultaneous simplification of contiguous subsets of the Jacobi set. Using these extended cancellations as atomic operations, we introduce an algorithm to successively cancel subsets of the Jacobi set with minimal modifications to some user-defined metric. We show that for simply connected domains, our algorithm reduces a given Jacobi set to its

P. T. Bremer, D. Maljovec, A. Saha, Bei Wang, J. Gaffney, B. K. Spears, V. Pascucci.
**“ND2AV: N-Dimensional Data Analysis and Visualization -- Analysis for the National Ignition Campaign,”** In *Computing and Visualization in Science*, 2015.

One of the biggest challenges in high-energy physics is to analyze a complex mix of experimental and simulation data to gain new insights into the underlying physics. Currently, this analysis relies primarily on the intuition of trained experts often using nothing more sophisticated than default scatter plots. Many advanced analysis techniques are not easily accessible to scientists and not flexible enough to explore the potentially interesting hypotheses in an intuitive manner. Furthermore, results from individual techniques are often difficult to integrate, leading to a confusing patchwork of analysis snippets too cumbersome for data exploration. This paper presents a case study on how a combination of techniques from statistics, machine learning, topology, and visualization can have a significant impact in the field of inertial confinement fusion. We present the ND2AV: N-Dimensional Data Analysis and Visualization framework, a user-friendly tool aimed at exploiting the intuition and current work flow of the target users. The system integrates traditional analysis approaches such as dimension reduction and clustering with state-of-the-art techniques such as neighborhood graphs and topological analysis, and custom capabilities such as defining combined metrics on the fly. All components are linked into an interactive environment that enables an intuitive exploration of a wide variety of hypotheses while relating the results to concepts familiar to the users, such as scatter plots. ND2AV uses a modular design providing easy extensibility and customization for different applications. ND2AV is being actively used in the National Ignition Campaign and has already led to a number of unexpected discoveries.

CIBC.
Note: *Data Sets: NCRR Center for Integrative Biomedical Computing (CIBC) data set archive. Download from: http://www.sci.utah.edu/cibc/software.html*, 2015.

CIBC.
Note: *Cleaver: A MultiMaterial Tetrahedral Meshing Library and Application. Scientific Computing and Imaging Institute (SCI), Download from: http://www.sci.utah.edu/cibc/software.html*, 2015.

S. Durrleman, T.P. Fletcher, G. Gerig, M. Niethammer, X. Pennec (Eds.).
**“Spatio-temporal Image Analysis for Longitudinal and Time-Series Image Data,”** In *Proceedings of the Third International Workshop, STIA 2014*, Image Processing, Computer Vision, Pattern Recognition, and Graphics, Vol. 8682, *Springer LNCS*, 2015.

ISBN: 978-3-319-14905-9

This book constitutes the thoroughly refereed post-conference proceedings of the Third International Workshop on Spatio-temporal Image Analysis for Longitudinal and Time-Series Image Data, STIA 2014, held in conjunction with MICCAI 2014 in Boston, MA, USA, in September 2014.

The 7 papers presented in this volume were carefully reviewed and selected from 15 submissions. They are organized in topical sections named: longitudinal registration and shape modeling, longitudinal modeling, reconstruction from longitudinal data, and 4D image processing.

SCI Institute.
Note: *FluoRender: An interactive rendering tool for confocal microscopy data visualization. Scientific Computing and Imaging Institute (SCI) Download from: http://www.fluorender.org*, 2015.

Note: *FusionView: Problem Solving Environment for MHD Visualization. Scientific Computing and Imaging Institute (SCI), Download from: http://www.scirun.org*, 2015.

A. V. P. Grosset, M. Prasad, C. Christensen, A. Knoll, C. Hansen.
**“TOD-Tree: Task-Overlapped Direct send Tree Image Compositing for Hybrid MPI Parallelism,”** In *Eurographics Symposium on Parallel Graphics and Visualization (2015)*, Edited by C. Dachsbacher, P. Navrátil, 2015.

Modern supercomputers have very powerful multi-core CPUs. The programming model on these supercomputers is switching from pure MPI to MPI for inter-node communication, and shared memory and threads for intra-node communication. Consequently, the bottleneck in most systems is no longer computation but communication between nodes. In this paper, we present a new compositing algorithm for hybrid MPI parallelism that focuses on communication avoidance and overlapping communication with computation at the expense of evenly balancing the workload. The algorithm has three stages: a direct send stage where nodes are arranged in groups and exchange regions of an image, followed by a tree compositing stage and a gather stage. We compare our algorithm with radix-k and binary-swap from the IceT library in a hybrid OpenMP/MPI setting, show strong scaling results, and explain how we generally achieve better performance than these two algorithms.

A. Gyulassy, A. Knoll, K. C. Lau, Bei Wang, P. T. Bremer, M. E. Papka, L. A. Curtiss, V. Pascucci.
**“Morse-Smale Analysis of Ion Diffusion for DFT Battery Materials Simulations,”** *Topology-Based Methods in Visualization (TopoInVis)*, 2015.

*Ab initio* molecular dynamics (AIMD) simulations are increasingly useful in modeling, optimizing and synthesizing materials in energy sciences. In solving Schrödinger's equation, they generate the electronic structure of the simulated atoms as a scalar field. However, methods for analyzing these volume data are not yet common in molecular visualization. The Morse-Smale complex is a proven, versatile tool for topological analysis of scalar fields. In this paper, we apply the discrete Morse-Smale complex to analysis of first-principles battery materials simulations. We consider a carbon nanosphere structure used in battery materials research, and employ Morse-Smale decomposition to determine the possible lithium ion diffusion paths within that structure. Our approach is novel in that it uses the wavefunction itself as opposed to distance fields, and that we analyze the 1-skeleton of the Morse-Smale complex to reconstruct our diffusion paths. Furthermore, it is the first application where specific motifs in the graph structure of the complete 1-skeleton define features, namely carbon rings with specific valence. We compare our analysis of DFT data with that of a distance field approximation, and discuss implications on larger classical molecular dynamics simulations.

J. K. Holmen, A. Humphrey, M. Berzins.
**“Exploring Use of the Reserved Core,”** In *High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches*, Vol. 2, Edited by J. Reinders and J. Jeffers, 2015.

A. Humphrey, T. Harman, M. Berzins, P. Smith.
**“A Scalable Algorithm for Radiative Heat Transfer Using Reverse Monte Carlo Ray Tracing,”** In *The International Supercomputing Conference*, Springer LNCS, 2015.

Radiative heat transfer is an important mechanism in a class of challenging engineering and research problems. A direct all-to-all treatment of these problems is prohibitively expensive on large core counts due to pervasive all-to-all MPI communication. The massive heat transfer problem arising from the next generation of clean coal boilers being modeled by the Uintah framework has radiation as a dominant heat transfer mode. Reverse Monte Carlo ray tracing (RMCRT) can be used to solve for the radiative-flux divergence while accounting for the effects of participating media. The ray tracing approach used here replicates the geometry of the boiler on a multi-core node and then uses an all-to-all communication phase to distribute the results globally. The cost of this all-to-all is reduced by using an adaptive mesh approach in which a fine mesh is only used locally, and a coarse mesh is used elsewhere. A model for communication and computation complexity is used to predict performance of this new method. We show this model is consistent with observed results and demonstrate excellent strong scaling to 262K cores on the DOE Titan system on problem sizes that were previously computationally intractable.

CIBC.
Note: *ImageVis3D: An interactive visualization software system for large-scale volume data. Scientific Computing and Imaging Institute (SCI), Download from: http://www.imagevis3d.org*, 2015.

M. Kim, C.D. Hansen.
**“Surface Flow Visualization using the Closest Point Embedding,”** In *2015 IEEE Pacific Visualization Symposium*, April, 2015.

In this paper, we introduce a novel flow visualization technique for arbitrary surfaces. This new technique utilizes the closest point embedding to represent the surface, which allows for accurate particle advection on the surface and supports the unsteady flow line integral convolution (UFLIC) technique on the surface. This global approach is faster than previous parameterization techniques and prevents the visual artifacts associated with image-based approaches.

**Keywords:** vector field, flow visualization

M. Kim, C.D. Hansen.
**“GPU Surface Extraction with the Closest Point Embedding,”** In *Proceedings of IS&T/SPIE Visualization and Data Analysis, 2015*, February, 2015.

Isosurface extraction is a fundamental technique used for both surface reconstruction and mesh generation. One method to extract well-formed isosurfaces is a particle system; unfortunately, particle systems can be slow. In this paper, we introduce an enhanced parallel particle system that uses the closest point embedding as the surface representation to speed up the particle system for isosurface extraction. The closest point embedding is used in the Closest Point Method (CPM), a technique that uses a standard three-dimensional numerical PDE solver on two-dimensional embedded surfaces. To fully take advantage of the closest point embedding, it is coupled with a Barnes-Hut tree code on the GPU. This new technique produces well-formed, conformal unstructured triangular and tetrahedral meshes from labeled multi-material volume datasets. Further, this new parallel implementation of the particle system is faster than any known methods for conformal multi-material mesh extraction. The resulting speed-ups gained in this implementation can reduce the time from labeled data to mesh from hours to minutes and benefit users, such as bioengineers, who employ triangular and tetrahedral meshes.
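As a rough illustration of the closest point embedding mentioned above, the sketch below assumes the simplest case, a sphere centered at the origin, where the closest surface point has a closed form. It also shows the closest point extension, which makes a surface function available at any point of the surrounding 3D space; this is what allows a standard volumetric solver to operate on surface data. The function names are hypothetical, and this is not the paper's GPU implementation.

```python
import math


def closest_point_on_sphere(p, radius=1.0):
    """Closest point on a sphere centered at the origin to the 3D point p."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("closest point is not unique at the sphere's center")
    scale = radius / r
    return (x * scale, y * scale, z * scale)


def closest_point_extension(surface_fn, p, radius=1.0):
    """Evaluate a surface function at an arbitrary point in space.

    The value at p is defined as the value at the closest surface point,
    so the extended function is constant along the surface normal.
    """
    return surface_fn(closest_point_on_sphere(p, radius))


def height(q):
    """Hypothetical surface function: the z-coordinate of a surface point."""
    return q[2]


if __name__ == "__main__":
    print(closest_point_on_sphere((3.0, 4.0, 0.0)))
    print(closest_point_extension(height, (0.0, 0.0, 5.0)))
```

For general surfaces there is no closed form, and the closest points are typically precomputed and stored on a grid around the surface.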

**Keywords:** scalar field methods, GPGPU, curvature based, scientific visualization

S. Liu, D. Maljovec, Bei Wang, P. T. Bremer, V. Pascucci.
**“Visualizing High-Dimensional Data: Advances in the Past Decade,”** In *State of The Art Report*, *Eurographics Conference on Visualization (EuroVis)*, 2015.

Massive simulations and arrays of sensing devices, in combination with increasing computing resources, have generated large, complex, high-dimensional datasets used to study phenomena across numerous fields of study. Visualization plays an important role in exploring such datasets. We provide a comprehensive survey of advances in high-dimensional data visualization over the past 15 years. We aim at providing actionable guidance for data practitioners to navigate through a modular view of the recent advances, allowing the creation of new visualizations along the enriched information visualization pipeline and identifying future opportunities for visualization research.

S. Liu, Bei Wang, J. J. Thiagarajan, P. T. Bremer, V. Pascucci.
**“Visual Exploration of High-Dimensional Data through Subspace Analysis and Dynamic Projections,”** *Eurographics Conference on Visualization (EuroVis)*, 2015.

We introduce a novel interactive framework for visualizing and exploring high-dimensional datasets based on subspace analysis and dynamic projections. We assume the high-dimensional dataset can be represented by a mixture of low-dimensional linear subspaces with mixed dimensions, and provide a method to reliably estimate the intrinsic dimension and linear basis of each subspace extracted from the subspace clustering. Subsequently, we use these bases to define unique 2D linear projections as viewpoints from which to visualize the data. To understand the relationships among the different projections and to discover hidden patterns, we connect these projections through dynamic projections that create smooth animated transitions between pairs of projections. We introduce the view transition graph, which provides flexible navigation among these projections to facilitate an intuitive exploration. Finally, we provide detailed comparisons with related systems, and use real-world examples to demonstrate the novelty and usability of our proposed framework.

CIBC.
Note: *map3d: Interactive scientific visualization tool for bioengineering data. Scientific Computing and Imaging Institute (SCI), Download from: http://www.sci.utah.edu/cibc/software.html*, 2015.

K.S. McDowell, S. Zahid, F. Vadakkumpadan, J.J. Blauer, R.S. MacLeod, N.A. Trayanova.
**“Virtual Electrophysiological Study of Atrial Fibrillation in Fibrotic Remodeling,”** In *PLoS ONE*, Vol. 10, No. 2, pp. e0117110. February, 2015.

DOI: 10.1371/journal.pone.0117110

Research has indicated that atrial fibrillation (AF) ablation failure is related to the presence of atrial fibrosis. However, it remains unclear whether this information can be successfully used in predicting the optimal ablation targets for AF termination. We aimed to provide a proof-of-concept that a patient-specific virtual electrophysiological study that combines i) atrial structure and fibrosis distribution from clinical MRI and ii) modeling of atrial electrophysiology could be used to predict: (1) how fibrosis distribution determines the locations from which paced beats degrade into AF; (2) the dynamic behavior of persistent AF rotors; and (3) the optimal ablation targets in each patient. Four MRI-based patient-specific models of fibrotic left atria were generated, ranging in fibrosis amount. Virtual electrophysiological studies were performed in these models, and where AF was inducible, the dynamics of AF were used to determine the ablation locations that render AF non-inducible. In 2 of the 4 patient-specific models, AF was induced; in these models the distance between a given pacing location and the closest fibrotic region determined whether AF was inducible from that particular location, with only the mid-range distances resulting in arrhythmia. Phase singularities of persistent rotors were found to move within restricted regions of tissue, which were independent of the pacing location from which AF was induced. Electrophysiological sensitivity analysis demonstrated that these regions changed little with variations in electrophysiological parameters. Patient-specific distribution of fibrosis was thus found to be a critical component of AF initiation and maintenance. When the restricted regions encompassing the meander of the persistent phase singularities were modeled as ablation lesions, AF could no longer be induced.
The study demonstrates that a patient-specific modeling approach to identify non-invasively AF ablation targets prior to the clinical procedure is feasible.

S. McKenna, M. Meyer, C. Gregg, S. Gerber.
**“s-CorrPlot: An Interactive Scatterplot for Exploring Correlation,”** In *Journal of Computational and Graphical Statistics*, 2015.

DOI: 10.1080/10618600.2015.1021926

The degree of correlation between variables is used in many data analysis applications as a key measure of interdependence. The most common techniques for exploratory analysis of pairwise correlation in multivariate datasets, like scatterplot matrices and clustered heatmaps, however, do not scale well to large datasets, either computationally or visually. We present a new visualization that is capable of encoding pairwise correlation between hundreds of thousands of variables, called the s-CorrPlot. The s-CorrPlot encodes correlation spatially between variables as points on a scatterplot using the geometric structure underlying Pearson's correlation. Furthermore, we extend the s-CorrPlot with interactive techniques that enable animation of the scatterplot to new projections of the correlation space, as illustrated in the companion video in Supplemental Materials. We provide the s-CorrPlot as an open-source R-package and validate its effectiveness through a variety of methods including a case study with a biology collaborator.
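The geometric structure referred to above can be illustrated with a standard identity: once each variable is centered and scaled to unit length, Pearson's correlation between two variables is the dot product of the resulting vectors, so every variable becomes a point on a unit sphere. The sketch below demonstrates only that identity in plain Python. It is not the s-CorrPlot R package, and the function names are assumptions made for illustration.

```python
import math


def standardize(values):
    """Center a variable and scale it to unit Euclidean length."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant variable")
    return [c / norm for c in centered]


def pearson(a, b):
    """Pearson's correlation as the dot product of standardized vectors."""
    ua = standardize(a)
    ub = standardize(b)
    return sum(p * q for p, q in zip(ua, ub))


def squared_distance(a, b):
    """Squared Euclidean distance between the standardized vectors."""
    ua = standardize(a)
    ub = standardize(b)
    return sum((p - q) ** 2 for p, q in zip(ua, ub))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.0, 1.0, 4.0, 3.0]
    r = pearson(x, y)
    # For unit vectors, squared distance equals 2 * (1 - r).
    print(r, squared_distance(x, y), 2 * (1 - r))
```

Because the squared distance between two standardized variables equals 2(1 - r), highly correlated variables sit close together on the sphere, which is what makes a spatial encoding of correlation possible.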