2015

J. Bennett, F. Vivodtzev, V. Pascucci (Eds.).
**“Topological and Statistical Methods for Complex Data,”** Subtitled **“Tackling Large-Scale, High-Dimensional, and Multivariate Data Spaces,”** Mathematics and Visualization, 2015.

ISBN: 978-3-662-44899-1

This book contains papers presented at the Workshop on the Analysis of Large-scale, High-Dimensional, and Multi-Variate Data Using Topology and Statistics, held in Le Barp, France, June 2013. It features the work of some of the most prominent and recognized leaders in the field who examine challenges as well as detail solutions to the analysis of extreme scale data.

The book presents new methods that leverage the mutual strengths of both topological and statistical techniques to support the management, analysis, and visualization of complex data. It covers both theory and application and provides readers with an overview of important key concepts and the latest research trends.

Coverage in the book includes multi-variate and/or high-dimensional analysis techniques, feature-based statistical methods, combinatorial algorithms, scalable statistics algorithms, scalar and vector field topology, and multi-scale representations. In addition, the book details algorithms that are broadly applicable and can be used by application scientists to glean insight from a wide range of complex data sets.

J. Bennett, R. Clay, G. Baker, M. Gamell, D. Hollman, S. Knight, H. Kolla, G. Sjaardema, N. Slattengren, K. Teranishi, J. Wilke, M. Bettencourt, S. Bova, K. Franko, P. Lin, R. Grant, S. Hammond, S. Olivier.
**“ASC ATDM Level 2 Milestone #5325,”** Subtitled **“Asynchronous Many-Task Runtime System Analysis and Assessment for Next Generation Platforms,”** 2015.

This report provides in-depth information and analysis to help create a technical road map for developing next-generation programming models and runtime systems that support Advanced Simulation and Computing (ASC) workload requirements. The focus herein is on asynchronous many-task (AMT) models and runtime systems, which are of great interest in the context of "exascale" computing, as they hold the promise to address key issues associated with future extreme-scale computer architectures. This report includes a thorough qualitative and quantitative examination of three best-of-class AMT runtime systems (Charm++, Legion, and Uintah), all of which are in use as part of the ASC Predictive Science Academic Alliance Program II (PSAAP-II) Centers. The studies focus on each of the runtimes' *programmability, performance, and mutability*. Through the experiments and analysis presented, several overarching findings emerge. From a performance perspective, AMT runtimes show tremendous potential for addressing extreme-scale challenges. Empirical studies show an AMT runtime can mitigate performance heterogeneity inherent to the machine itself and that Message Passing Interface (MPI) and AMT runtimes perform comparably under balanced conditions. From a programmability and mutability perspective, however, none of the runtimes in this study are currently ready for use in developing production-ready Sandia ASC applications. The report concludes by recommending a co-design path forward, wherein application, programming model, and runtime system developers work together to define requirements and solutions. Such a requirements-driven co-design approach benefits the high-performance computing (HPC) community as a whole, with widespread community engagement mitigating risk for both application developers and runtime system developers.

M. Berzins, J. Beckvermit, T. Harman, A. Bezdjian, A. Humphrey, Q. Meng, J. Schmidt, C. Wight.
**“Extending the Uintah Framework through the Petascale Modeling of Detonation in Arrays of High Explosive Devices,”** *SCI Institute*, 2015.

The Uintah framework for solving a broad class of fluid-structure interaction problems uses a layered taskgraph approach that decouples the problem specification as a set of tasks from the adaptive runtime system that executes these tasks. Uintah has been developed by using a problem-driven approach that dates back to its inception. Using this approach, it is possible to improve the performance of the problem-independent software components to enable the solution of broad classes of problems as well as the driving problem itself. This process is illustrated by a motivating problem: the computational modeling of the hazards posed by thousands of explosive devices during a Deflagration to Detonation Transition (DDT) that occurred on Highway 6 in Utah. In order to solve this complex fluid-structure interaction problem at the required scale, algorithmic and data structure improvements were needed in a code that already appeared to work well at scale. These transformations enabled scalable runs for our target problem and provided the capability to model the transition to detonation. The performance improvements achieved are shown, and the solution to the target problem provides insight as to why the detonation happened, as well as to a possible remediation strategy.

H. Bhatia, Bei Wang, G. Norgard, V. Pascucci, P. T. Bremer.
**“Local, Smooth, and Consistent Jacobi Set Simplification,”** In *Computational Geometry: Theory and Applications (CGTA)*, Vol. 48, No. 4, pp. 311-332. 2015.

The relation between two Morse functions defined on a smooth, compact, and orientable 2-manifold can be studied in terms of their Jacobi set. The Jacobi set contains points in the domain where the gradients of the two functions are aligned. Both the Jacobi set itself and the segmentation of the domain it induces have been shown to be useful in various applications. In practice, unfortunately, functions often contain noise and discretization artifacts, causing their Jacobi set to become unmanageably large and complex. Although there exist techniques to simplify Jacobi sets, they are unsuitable for most applications as they lack fine-grained control over the process and heavily restrict the type of simplifications possible.

This paper introduces the theoretical foundations of a new simplification framework for Jacobi sets. We present a new interpretation of Jacobi set simplification based on the perspective of domain segmentation. Generalizing the cancellation of critical points from scalar functions to Jacobi sets, we focus on simplifications that can be realized by smooth approximations of the corresponding functions, and show how these cancellations imply simultaneous simplification of contiguous subsets of the Jacobi set. Using these extended cancellations as atomic operations, we introduce an algorithm to successively cancel subsets of the Jacobi set with minimal modifications to some user-defined metric. We show that for simply connected domains, our algorithm reduces a given Jacobi set to its

P. T. Bremer, D. Maljovec, A. Saha, Bei Wang, J. Gaffney, B. K. Spears, V. Pascucci.
**“ND2AV: N-Dimensional Data Analysis and Visualization -- Analysis for the National Ignition Campaign,”** In *Computing and Visualization in Science*, 2015.

One of the biggest challenges in high-energy physics is to analyze a complex mix of experimental and simulation data to gain new insights into the underlying physics. Currently, this analysis relies primarily on the intuition of trained experts, often using nothing more sophisticated than default scatter plots. Many advanced analysis techniques are not easily accessible to scientists and not flexible enough to explore the potentially interesting hypotheses in an intuitive manner. Furthermore, results from individual techniques are often difficult to integrate, leading to a confusing patchwork of analysis snippets too cumbersome for data exploration. This paper presents a case study on how a combination of techniques from statistics, machine learning, topology, and visualization can have a significant impact in the field of inertial confinement fusion. We present the ND2AV: N-Dimensional Data Analysis and Visualization framework, a user-friendly tool aimed at exploiting the intuition and current workflow of the target users. The system integrates traditional analysis approaches such as dimension reduction and clustering with state-of-the-art techniques such as neighborhood graphs and topological analysis, and custom capabilities such as defining combined metrics on the fly. All components are linked into an interactive environment that enables an intuitive exploration of a wide variety of hypotheses while relating the results to concepts familiar to the users, such as scatter plots. ND2AV uses a modular design providing easy extensibility and customization for different applications. ND2AV is being actively used in the National Ignition Campaign and has already led to a number of unexpected discoveries.

H. Carr, Z. Geng, J. Tierny, A. Chattopadhyay, A. Knoll.
**“Fiber Surfaces: Generalizing Isosurfaces to Bivariate Data,”** In *Computer Graphics Forum*, Vol. 34, No. 3, pp. 241-250. 2015.

Scientific visualization has many effective methods for examining and exploring scalar and vector fields, but rather fewer for multi-variate fields. We report the first general purpose approach for the interactive extraction of geometric separating surfaces in bivariate fields. This method is based on fiber surfaces: surfaces constructed from sets of fibers, the multivariate analogues of isolines. We show simple methods for fiber surface definition and extraction. In particular, we show a simple and efficient fiber surface extraction algorithm based on Marching Cubes. We also show how to construct fiber surfaces interactively with geometric primitives in the range of the function. We then extend this to build user interfaces that generate parameterized families of fiber surfaces with respect to arbitrary polylines and polygons. In the special case of isovalue-gradient plots, fiber surfaces capture features geometrically for quantitative analysis that have previously only been analysed visually and qualitatively using multi-dimensional transfer functions in volume rendering. We also demonstrate fiber surface extraction on a variety of bivariate data

CIBC.
Note: *Data Sets: NCRR Center for Integrative Biomedical Computing (CIBC) data set archive. Download from: http://www.sci.utah.edu/cibc/software.html*, 2015.

CIBC.
Note: *Cleaver: A MultiMaterial Tetrahedral Meshing Library and Application. Scientific Computing and Imaging Institute (SCI), Download from: http://www.sci.utah.edu/cibc/software.html*, 2015.

C.C. Conlin, J.L. Zhang, F. Rousset, C. Vachet, Y. Zhao, K.A. Morton, K. Carlston, G. Gerig, V.S. Lee.
**“Performance of an Efficient Image-registration Algorithm in Processing MR Renography Data,”** In *J Magnetic Resonance Imaging*, July, 2015.

DOI: 10.1002/jmri.25000

**PURPOSE:**

To evaluate the performance of an edge-based registration technique in correcting for respiratory motion artifacts in magnetic resonance renographic (MRR) data and to examine the efficiency of a semiautomatic software package in processing renographic data from a cohort of clinical patients.

**MATERIALS AND METHODS:**

The developed software incorporates an image-registration algorithm based on the generalized Hough transform of edge maps. It was used to estimate glomerular filtration rate (GFR), renal plasma flow (RPF), and mean transit time (MTT) from 36 patients who underwent free-breathing MRR at 3T using saturation-recovery turbo-FLASH. The processing time required for each patient was recorded. Renal parameter estimates and model-fitting residues from the software were compared to those from a previously reported technique. Interreader variability in the software was quantified by the standard deviation of parameter estimates among three readers. GFR estimates from our software were also compared to a reference standard from nuclear medicine.

**RESULTS:**

The time taken to process one patient's data with the software averaged 12 ± 4 minutes. The applied image registration effectively reduced motion artifacts in dynamic images by providing renal tracer-retention curves with significantly smaller fitting residues (P < 0.01) than unregistered data or data registered by the previously reported technique. Interreader variability was less than 10% for all parameters. GFR estimates from the proposed method showed greater concordance with reference values (P < 0.05).

**CONCLUSION:**

These results suggest that the proposed software can process MRR data efficiently and accurately. Its incorporated registration technique based on the generalized Hough transform effectively reduces respiratory motion artifacts in free-breathing renographic acquisitions. J. Magn. Reson. Imaging 2015.

S. Durrleman, T.P. Fletcher, G. Gerig, M. Niethammer, X. Pennec (Eds.).
**“Spatio-temporal Image Analysis for Longitudinal and Time-Series Image Data,”** In *Proceedings of the Third International Workshop, STIA 2014*, Image Processing, Computer Vision, Pattern Recognition, and Graphics, Vol. 8682, *Springer LNCS*, 2015.

ISBN: 978-3-319-14905-9

This book constitutes the thoroughly refereed post-conference proceedings of the Third International Workshop on Spatio-temporal Image Analysis for Longitudinal and Time-Series Image Data, STIA 2014, held in conjunction with MICCAI 2014 in Boston, MA, USA, in September 2014.

The 7 papers presented in this volume were carefully reviewed and selected from 15 submissions. They are organized in topical sections named: longitudinal registration and shape modeling, longitudinal modeling, reconstruction from longitudinal data, and 4D image processing.

SCI Institute.
Note: *FluoRender: An interactive rendering tool for confocal microscopy data visualization. Scientific Computing and Imaging Institute (SCI) Download from: http://www.fluorender.org*, 2015.

Note: *FusionView: Problem Solving Environment for MHD Visualization. Scientific Computing and Imaging Institute (SCI), Download from: http://www.scirun.org*, 2015.

Y. Gao, L. Zhu, J. Cates, R. S. MacLeod, S. Bouix, A. Tannenbaum.
**“A Kalman Filtering Perspective for Multiatlas Segmentation,”** In *SIAM J. Imaging Sciences*, Vol. 8, No. 2, pp. 1007-1029. 2015.

DOI: 10.1137/130933423

In multiatlas segmentation, one typically registers several atlases to the novel image, and their respective segmented label images are transformed and fused to form the final segmentation. In this work, we provide a new dynamical system perspective for multiatlas segmentation, inspired by the following fact: The transformation that aligns the current atlas to the novel image can be not only computed by direct registration but also inferred from the transformation that aligns the previous atlas to the image together with the transformation between the two atlases. This process is similar to the global positioning system on a vehicle, which gets position by inquiring from the satellite and by employing the previous location and velocity—neither answer in isolation being perfect. To solve this problem, a dynamical system scheme is crucial to combine the two pieces of information; for example, a Kalman filtering scheme is used. Accordingly, in this work, a Kalman multiatlas segmentation is proposed to stabilize the global/affine registration step. The contributions of this work are twofold. First, it provides a new dynamical systematic perspective for standard independent multiatlas registrations, and it is solved by Kalman filtering. Second, with very little extra computation, it can be combined with most existing multiatlas segmentation schemes for better registration/segmentation accuracy.
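The fusion idea in this abstract can be illustrated with a textbook scalar Kalman update. This is only a minimal sketch and not the authors' method: it assumes a single transformation parameter (say, a translation along one axis), and the function name `kalman_fuse`, the variances, and the numbers are hypothetical. The "predicted" value stands for the transformation inferred from the previous atlas, and the "measured" value stands for the result of direct registration.

```python
def kalman_fuse(predicted, predicted_var, measured, measured_var):
    """Standard scalar Kalman update: blend a prediction with a measurement.

    Both inputs are noisy estimates of the same quantity; the gain weights
    the measurement more heavily when the prediction is less certain.
    """
    gain = predicted_var / (predicted_var + measured_var)
    fused = predicted + gain * (measured - predicted)
    fused_var = (1.0 - gain) * predicted_var
    return fused, fused_var


# Hypothetical numbers: translation (in mm) inferred from the previous atlas
# versus the translation found by registering the current atlas directly.
predicted, predicted_var = 10.0, 4.0   # less certain
measured, measured_var = 12.0, 1.0     # more certain

fused, fused_var = kalman_fuse(predicted, predicted_var, measured, measured_var)
print(fused, fused_var)
```

The fused estimate lies between the two inputs, closer to the more certain one, and its variance is smaller than either input variance, which is the sense in which neither answer alone is as good as their combination.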

C. Gritton, M. Berzins, R. M. Kirby.
**“Improving Accuracy In Particle Methods Using Null Spaces and Filters,”** In *Proceedings of the IV International Conference on Particle-Based Methods - Fundamentals and Applications*, Barcelona, Spain, Edited by E. Onate, M. Bischoff, D.R.J. Owen, P. Wriggers, and T. Zohdi, CIMNE, pp. 202-213. September, 2015.

ISBN: 978-84-944244-7-2

While particle-in-cell type methods, such as MPM, have been very successful in providing solutions to many challenging problems, there are some important issues that remain to be resolved with regard to their analysis. One such challenge relates to the difference in dimensionality between the particles and the grid points to which they are mapped. There exists a non-trivial null space of the linear operator that maps particle values onto nodal values. In other words, there are non-zero particle values that, when mapped to the nodes, are zero there. Given positive mapping weights, such null space values are oscillatory in nature. The null space may be viewed as a more general form of the ringing instability identified by Brackbill for PIC methods. It will be shown that it is possible to remove these null-space values from the solution, and so to improve the accuracy of PIC methods, using a matrix SVD approach. The expense of doing this is prohibitive for real problems, and so a local method is developed for doing this.
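The null space described in this abstract can be seen on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it uses a 1D grid with linear hat-function weights, an unnormalized mapping matrix `S`, and a global SVD (the approach the abstract calls too expensive for real problems, and which the paper replaces with a local method). The names `hat_weights` and `remove_null_space` are hypothetical.

```python
import numpy as np


def hat_weights(xp, nodes):
    """Particle-to-node mapping matrix with linear hat-function weights.

    S[i, p] is the non-negative weight of particle p at node i
    (uniform node spacing assumed).
    """
    h = nodes[1] - nodes[0]
    return np.maximum(0.0, 1.0 - np.abs(nodes[:, None] - xp[None, :]) / h)


def remove_null_space(S, values, tol=1e-12):
    """Split off the part of `values` that S maps to zero, using an SVD.

    Returns the filtered particle values and a basis of the null space.
    """
    _, s, Vt = np.linalg.svd(S, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))
    row_basis = Vt[:rank].T        # directions that the nodes can "see"
    null_basis = Vt[rank:].T       # directions that map to zero at the nodes
    filtered = row_basis @ (row_basis.T @ values)
    return filtered, null_basis


nodes = np.linspace(0.0, 1.0, 5)             # 5 grid nodes
xp = (np.arange(16) + 0.5) / 16              # 16 particles, 4 per cell
S = hat_weights(xp, nodes)

rng = np.random.default_rng(0)
values = np.sin(2 * np.pi * xp) + 0.1 * rng.standard_normal(xp.size)

filtered, null_basis = remove_null_space(S, values)
print(null_basis.shape)                      # one column per null-space direction
print(np.max(np.abs(S @ filtered - S @ values)))
```

With more particles than nodes, the mapping cannot be injective, so some particle-value patterns are invisible at the nodes. Filtering leaves the nodal values unchanged while discarding those patterns; because the weights are non-negative, every such pattern must change sign across particles, which matches the oscillatory behavior the abstract mentions.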

A. V. P. Grosset, M. Prasad, C. Christensen, A. Knoll, C. Hansen.
**“TOD-Tree: Task-Overlapped Direct send Tree Image Compositing for Hybrid MPI Parallelism,”** In *Eurographics Symposium on Parallel Graphics and Visualization (2015)*, Edited by C. Dachsbacher, P. Navrátil, 2015.

Modern supercomputers have very powerful multi-core CPUs. The programming model on these supercomputers is switching from pure MPI to MPI for inter-node communication, and shared memory and threads for intra-node communication. Consequently, the bottleneck in most systems is no longer computation but communication between nodes. In this paper, we present a new compositing algorithm for hybrid MPI parallelism that focuses on communication avoidance and overlapping communication with computation at the expense of evenly balancing the workload. The algorithm has three stages: a direct send stage where nodes are arranged in groups and exchange regions of an image, followed by a tree compositing stage and a gather stage. We compare our algorithm with radix-k and binary-swap from the IceT library in a hybrid OpenMP/MPI setting, show strong scaling results, and explain how we generally achieve better performance than these two algorithms.

A. Gyulassy, A. Knoll, K. C. Lau, Bei Wang, P. T. Bremer, M. E. Papka, L. A. Curtiss, V. Pascucci.
**“Morse-Smale Analysis of Ion Diffusion for DFT Battery Materials Simulations,”** *Topology-Based Methods in Visualization (TopoInVis)*, 2015.

*Ab initio* molecular dynamics (AIMD) simulations are increasingly useful in modeling, optimizing and synthesizing materials in energy sciences. In solving Schrödinger's equation, they generate the electronic structure of the simulated atoms as a scalar field. However, methods for analyzing these volume data are not yet common in molecular visualization. The Morse-Smale complex is a proven, versatile tool for topological analysis of scalar fields. In this paper, we apply the discrete Morse-Smale complex to analysis of first-principles battery materials simulations. We consider a carbon nanosphere structure used in battery materials research, and employ Morse-Smale decomposition to determine the possible lithium ion diffusion paths within that structure. Our approach is novel in that it uses the wavefunction itself as opposed to distance fields, and that we analyze the 1-skeleton of the Morse-Smale complex to reconstruct our diffusion paths. Furthermore, it is the first application where specific motifs in the graph structure of the complete 1-skeleton define features, namely carbon rings with specific valence. We compare our analysis of DFT data with that of a distance field approximation, and discuss implications on larger classical molecular dynamics simulations.

A. Gyulassy, A. Knoll, K. C. Lau, Bei Wang, P. T. Bremer, M. E. Papka, L. A. Curtiss, V. Pascucci.
**“Interstitial and Interlayer Ion Diffusion Geometry Extraction in Graphitic Nanosphere Battery Materials,”** In *Proceedings IEEE Visualization Conference*, 2015.

Large-scale molecular dynamics (MD) simulations are commonly used for simulating the synthesis and ion diffusion of battery materials. A good battery anode material is determined by its capacity to store ions or other diffusers. However, modeling of ion diffusion dynamics and transport properties at large length and long time scales would be impossible with current MD codes. To analyze the fundamental properties of these materials, therefore, we turn to geometric and topological analysis of their structure. In this paper, we apply a novel technique inspired by discrete Morse theory to the Delaunay triangulation of the simulated geometry of a thermally annealed carbon nanosphere. We utilize our computed structures to drive further geometric analysis to extract the interstitial diffusion structure as a single mesh. Our results provide a new approach to analyze the geometry of the simulated carbon nanosphere, and new insights into the role of carbon defect size and distribution in determining the charge capacity and charge dynamics of these carbon-based battery materials.

J. K. Holmen, A. Humphrey, M. Berzins.
**“Exploring Use of the Reserved Core,”** In *High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches, Vol. 2*, Edited by J. Reinders and J. Jeffers, Elsevier, 2015.

In this chapter, we illustrate benefits of thinking in terms of thread management techniques when using a centralized scheduler model along with interoperability of MPI and PThreads. This is facilitated through an exploration of thread placement strategies for an algorithm modeling radiative heat transfer with special attention to the 61st core. This algorithm plays a key role within the Uintah Computational Framework (UCF) and current efforts taking place at the University of Utah to model next-generation, large-scale clean coal boilers. In such simulations, this algorithm models the dominant form of heat transfer and consumes a large portion of compute time. Exemplified by a real-world example, this chapter presents our early efforts in porting a key portion of a scalability-centric codebase to the Intel® Xeon Phi™ coprocessor. Specifically, this chapter presents results from our experiments profiling the native execution of a reverse Monte-Carlo ray tracing-based radiation model on a single coprocessor. These results demonstrate that our fastest run configurations utilized the 61st core and that performance was not profoundly impacted when explicitly over-subscribing the coprocessor operating system thread. Additionally, this chapter presents a portion of radiation model source code, a MIC-centric UCF cross-compilation example, and less conventional thread management techniques for developers utilizing the PThreads threading model.

A. Humphrey, T. Harman, M. Berzins, P. Smith.
**“A Scalable Algorithm for Radiative Heat Transfer Using Reverse Monte Carlo Ray Tracing,”** In *High Performance Computing*, Lecture Notes in Computer Science, Vol. 9137, Edited by Kunkel, Julian M. and Ludwig, Thomas, Springer International Publishing, pp. 212-230. 2015.

ISBN: 978-3-319-20118-4

DOI: 10.1007/978-3-319-20119-1_16

Radiative heat transfer is an important mechanism in a class of challenging engineering and research problems. A direct all-to-all treatment of these problems is prohibitively expensive on large core counts due to pervasive all-to-all MPI communication. The massive heat transfer problem arising from the next generation of clean coal boilers being modeled by the Uintah framework has radiation as a dominant heat transfer mode. Reverse Monte Carlo ray tracing (RMCRT) can be used to solve for the radiative-flux divergence while accounting for the effects of participating media. The ray tracing approach used here replicates the geometry of the boiler on a multi-core node and then uses an all-to-all communication phase to distribute the results globally. The cost of this all-to-all is reduced by using an adaptive mesh approach in which a fine mesh is only used locally, and a coarse mesh is used elsewhere. A model for communication and computation complexity is used to predict performance of this new method. We show this model is consistent with observed results and demonstrate excellent strong scaling to 262K cores on the DOE Titan system on problem sizes that were previously computationally intractable.

**Keywords:** Uintah; Radiation modeling; Parallel; Scalability; Adaptive mesh refinement; Simulation science; Titan

CIBC.
Note: *ImageVis3D: An interactive visualization software system for large-scale volume data. Scientific Computing and Imaging Institute (SCI), Download from: http://www.imagevis3d.org*, 2015.