I. Rodero, M. Parashar, A.G. Landge, S. Kumar, V. Pascucci, P.T. Bremer. Evaluation of in-situ analysis strategies at scale for power efficiency and scalability, In Cluster, Cloud and Grid Computing (CCGrid), 2016 16th IEEE/ACM International Symposium on, IEEE, pp. 156--164. 2016.
The increasing gap between available compute power and I/O capabilities is resulting in simulation pipelines running on leadership computing facilities being reformulated. In particular, in-situ processing is complementing conventional post-process analysis; however, it can be performed by using the same compute resources as the simulation or using secondary dedicated resources.
In this paper, we focus on three different in-situ analysis strategies, which use the same compute resources as the ongoing simulation but different data movement strategies. We evaluate the costs incurred by these strategies in terms of run time, scalability and power/energy consumption. Furthermore, we extrapolate power behavior to peta-scale and investigate different design choices through projections. Experimental evaluation at full machine scale on Titan supports that using fewer cores per node for in-situ analysis is the optimum choice in terms of scalability. Hence, further research effort should be devoted towards developing in-situ analysis techniques following this strategy in future high-end systems.
W. Usher, I. Wald, A. Knoll, M. Papka, V. Pascucci.
In Situ Exploration of Particle Simulations with CPU Ray Tracing, In Supercomputing Frontiers and Innovations, Vol. 3, No. 4, 2016.
We present a system for interactive in situ visualization of large particle simulations, suitable for general CPU-based HPC architectures. As simulations grow in scale, in situ methods are needed to alleviate IO bottlenecks and visualize data at full spatio-temporal resolution. We use a lightweight loosely-coupled layer serving distributed data from the simulation to a data-parallel renderer running in separate processes. Leveraging the OSPRay ray tracing framework for visualization and balanced P-k-d trees, we can render simulation data in real-time, as they arrive, with negligible memory overhead. This flexible solution allows users to perform exploratory in situ visualization on the same computational resources as the simulation code, on dedicated visualization clusters or remote workstations, via a standalone rendering client that can be connected or disconnected as needed. We evaluate this system on simulations with up to 227M particles in the LAMMPS and Uintah computational frameworks, and show that our approach provides many of the advantages of tightly-coupled systems, with the flexibility to render on a wide variety of remote and co-processing resources.
Topological and Statistical Methods for Complex Data, Subtitled Tackling Large-Scale, High-Dimensional, and Multivariate Data Spaces, Mathematics and Visualization, Springer Berlin Heidelberg, 2015.
This book contains papers presented at the Workshop on the Analysis of Large-scale,
High-Dimensional, and Multi-Variate Data Using Topology and Statistics, held in Le Barp,
France, June 2013. It features the work of some of the most prominent and recognized
leaders in the field who examine challenges as well as detail solutions to the analysis of
extreme scale data.
The book presents new methods that leverage the mutual strengths of both topological
and statistical techniques to support the management, analysis, and visualization
of complex data. It covers both theory and application and provides readers with an
overview of important key concepts and the latest research trends.
Coverage in the book includes multi-variate and/or high-dimensional analysis techniques,
feature-based statistical methods, combinatorial algorithms, scalable statistics algorithms,
scalar and vector field topology, and multi-scale representations. In addition, the book
details algorithms that are broadly applicable and can be used by application scientists to
glean insight from a wide range of complex data sets.
P. T. Bremer, D. Maljovec, A. Saha, Bei Wang, J. Gaffney, B. K. Spears, V. Pascucci. ND2AV: N-Dimensional Data Analysis and Visualization -- Analysis for the National Ignition Campaign, In Computing and Visualization in Science, 2015.
J. Edwards, S. Kumar, V. Pascucci. Big data from scientific simulations, In Big Data and High Performance Computing, Vol. 26, IOS Press, pp. 32. 2015.
Scientific simulations often generate massive amounts of data used for debugging, restarts, and scientific analysis and discovery. Challenges that practitioners face using these types of big data are unique. Of primary importance is speed of writing data during a simulation, but this need for fast I/O is at odds with other priorities, such as data access time for visualization and analysis, efficient storage, and portability across a variety of supercomputer topologies, configurations, file systems, and storage devices. The computational power of high-performance computing systems continues to increase according to Moore's law, but the same is not true for I/O subsystems, creating a performance gap between computation and I/O. This chapter explores these issues, as well as possible optimization strategies, the use of in situ analytics, and a case study using the PIDX I/O library in a typical simulation.
J. Edwards, E. Daniel, V. Pascucci, C. Bajaj.
Approximating the Generalized Voronoi Diagram of Closely Spaced Objects, In Computer Graphics Forum, Vol. 34, No. 2, Wiley-Blackwell, pp. 299-309. May, 2015.
Generalized Voronoi Diagrams (GVDs) have far-reaching applications in robotics, visualization, graphics, and simulation. However, while the ordinary Voronoi Diagram has mature and efficient algorithms for its computation, the GVD is difficult to compute in general, and in fact, has only approximation algorithms for anything but the simplest of datasets. Our work is focused on developing algorithms to compute the GVD efficiently and with bounded error on the most difficult of datasets -- those with objects that are extremely close to each other.
A. Gyulassy, A. Knoll, K. C. Lau, Bei Wang, P. T. Bremer, M. E. Papka, L. A. Curtiss, V. Pascucci. Morse-Smale Analysis of Ion Diffusion for DFT Battery Materials Simulations, In Topology-Based Methods in Visualization (TopoInVis), 2015.
Ab initio molecular dynamics (AIMD) simulations are increasingly useful in modeling, optimizing and synthesizing materials in energy sciences. In solving Schrödinger's equation, they generate the electronic structure of the simulated atoms as a scalar field. However, methods for analyzing these volume data are not yet common in molecular visualization. The Morse-Smale complex is a proven, versatile tool for topological analysis of scalar fields. In this paper, we apply the discrete Morse-Smale complex to analysis of first-principles battery materials simulations. We consider a carbon nanosphere structure used in battery materials research, and employ Morse-Smale decomposition to determine the possible lithium ion diffusion paths within that structure. Our approach is novel in that it uses the wavefunction itself as opposed to distance fields, and that we analyze the 1-skeleton of the Morse-Smale complex to reconstruct our diffusion paths. Furthermore, it is the first application where specific motifs in the graph structure of the complete 1-skeleton define features, namely carbon rings with specific valence. We compare our analysis of DFT data with that of a distance field approximation, and discuss implications on larger classical molecular dynamics simulations.
A. Gyulassy, A. Knoll, K. C. Lau, Bei Wang, P. T. Bremer, M. E. Papka, L. A. Curtiss, V. Pascucci. Interstitial and Interlayer Ion Diffusion Geometry Extraction in Graphitic Nanosphere Battery Materials, In Proceedings IEEE Visualization Conference, 2015.
S. Liu, D. Maljovec, Bei Wang, P. T. Bremer, V. Pascucci. Visualizing High-Dimensional Data: Advances in the Past Decade, In State of The Art Report, Eurographics Conference on Visualization (EuroVis), 2015.
S. Liu, Bei Wang, J. J. Thiagarajan, P. T. Bremer, V. Pascucci.
Visual Exploration of High-Dimensional Data through Subspace Analysis and Dynamic Projections, In Computer Graphics Forum, Vol. 34, No. 3, Wiley-Blackwell, pp. 271--280. June, 2015.
B. Summa, A. A. Gooch, G. Scorzelli, V. Pascucci.
Paint and Click: Unified Interactions for Image Boundaries, In Computer Graphics Forum, Vol. 34, No. 2, Wiley-Blackwell, pp. 385--393. May, 2015.
Image boundaries are a fundamental component of many interactive digital photography techniques, enabling applications such as segmentation, panoramas, and seamless image composition. Interactions for image boundaries often rely on two complementary but separate approaches: editing via painting or clicking constraints. In this work, we provide a novel, unified approach for interactive editing of pairwise image boundaries that combines the ease of painting with the direct control of constraints. Rather than a sequential coupling, this new formulation allows full use of both interactions simultaneously, giving users unprecedented flexibility for fast boundary editing. To enable this new approach, we provide technical advancements. In particular, we detail a reformulation of image boundaries as a problem of finding cycles, expanding and correcting limitations of the previous work. Our new formulation provides boundary solutions for painted regions with performance on par with state-of-the-art specialized, paint-only techniques. In addition, we provide instantaneous exploration of the boundary solution space with user constraints. Finally, we provide examples of common graphics applications impacted by our new approach.
H. Bhatia, V. Pascucci, R.M. Kirby, P.-T. Bremer.
Extracting Features from Time-Dependent Vector Fields Using Internal Reference Frames, In Computer Graphics Forum, Vol. 33, No. 3, pp. 21--30. June, 2014.
S. Kumar, C. Christensen, P.-T. Bremer, E. Brugger, V. Pascucci, J. Schmidt, M. Berzins, H. Kolla, J. Chen, V. Vishwanath, P. Carns, R. Grout. Fast Multi-Resolution Reads of Massive Simulation Datasets, In Proceedings of the International Supercomputing Conference ISC'14, Leipzig, Germany, June, 2014.
S. Kumar, J. Edwards, P.-T. Bremer, A. Knoll, C. Christensen, V. Vishwanath, P. Carns, J.A. Schmidt, V. Pascucci.
Efficient I/O and storage of adaptive-resolution data, In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE Press, pp. 413--423. 2014.
The ever increasing amount of data generated by scientific simulations coupled with system I/O constraints are fueling a need for in-situ analysis techniques. Of particular interest are approaches that produce reduced data representations while maintaining the ability to redefine, extract, and study features in a post-process to obtain scientific insights.
This paper presents two variants of in-situ feature extraction techniques using segmented merge trees, which encode a wide range of threshold based features. The first approach is a fast, low communication cost technique that generates an exact solution but has limited scalability. The second is a scalable, local approximation that nevertheless is guaranteed to correctly extract all features up to a predefined size. We demonstrate both variants using some of the largest combustion simulations available on leadership class supercomputers. Our approach allows state-of-the-art, feature-based analysis to be performed in-situ at significantly higher frequency than currently possible and with negligible impact on the overall simulation runtime.
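The abstract above describes merge trees as encoding "a wide range of threshold based features". The following is a minimal serial sketch of that idea on a 1D scalar field, not the paper's segmented, distributed method. The function names `merge_events` and `count_features` and the toy data are hypothetical and chosen only for illustration. The sweep visits samples from high to low value and records where superlevel-set components are born and where they merge; those events are enough to count features at any threshold afterwards.

```python
def merge_events(values):
    """Sweep a 1D field from high to low; record component births and merges."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    parent = {}  # union-find forest over the samples visited so far

    def find(i):
        # Follow parent links to the component representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    events = []
    for i in order:
        parent[i] = i
        # Components of already-visited neighbours (left and right in 1D).
        roots = {find(n) for n in (i - 1, i + 1) if n in parent}
        if not roots:
            events.append(("birth", i, values[i]))   # local maximum
        elif len(roots) == 2:
            events.append(("merge", i, values[i]))   # two components join here
        for r in roots:
            parent[r] = i
    return events


def count_features(events, threshold):
    """Number of connected components of {x : f(x) >= threshold}."""
    births = sum(1 for kind, _, v in events if kind == "birth" and v >= threshold)
    merges = sum(1 for kind, _, v in events if kind == "merge" and v >= threshold)
    return births - merges


# Toy field (assumed data, not from the paper).
field = [1, 5, 2, 4, 0, 3]
events = merge_events(field)
n_high = count_features(events, 2.5)
n_mid = count_features(events, 1.5)
n_low = count_features(events, 0)
```

Because the events are computed once, the threshold can be changed in a post-process without revisiting the field, which is the property the abstract highlights for reduced data representations.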
Shusen Liu, Bei Wang, J.J. Thiagarajan, P.-T. Bremer, V. Pascucci. Visual Exploration of High-Dimensional Data: Subspace Analysis through Dynamic Projections, SCI Technical Report, No. UUSCI-2014-003, SCI Institute, University of Utah, 2014.
Understanding high-dimensional data is rapidly becoming a central challenge in many areas of science and engineering. Most current techniques either rely on manifold learning based techniques which typically create a single embedding of the data or on subspace selection to find subsets of the original attributes that highlight the structure. However, the former creates a single, difficult-to-interpret view and assumes the data to be drawn from a single manifold, while the latter is limited to axis-aligned projections with restrictive viewing angles. Instead, we introduce ideas based on subspace clustering that can faithfully represent more complex data than the axis-aligned projections, yet do not assume the data to lie on a single manifold. In particular, subspace clustering assumes that the data can be represented by a union of low-dimensional subspaces, which can subsequently be used for analysis and visualization. In this paper, we introduce new techniques to reliably estimate both the intrinsic dimension and the linear basis of a mixture of subspaces extracted through subspace clustering. We show that the resulting bases represent the high-dimensional structures more reliably than traditional approaches. Subsequently, we use the bases to define different “viewpoints”, i.e., different projections onto pairs of basis vectors, from which to visualize the data. While more intuitive than non-linear projections, interpreting linear subspaces in terms of the original dimensions can still be challenging. To address this problem, we present new, animated transitions between different views to help the user navigate and explore the high-dimensional space. More specifically, we introduce the view transition graph which contains nodes for each subspace viewpoint and edges for potential transition between views. 
The transition graph enables users to explore both the structure within a subspace and the relations between different subspaces, for better understanding of the data. Using a number of case studies on well-known reference datasets, we demonstrate that the interactive exploration through such dynamic projections provides additional insights not readily available from existing tools.
Keywords: High-dimensional data, Subspace, Dynamic projection
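As a rough illustration of the "viewpoints" described in the abstract above (projections onto pairs of basis vectors) and of a view transition graph, the following sketch uses plain Python. The basis is assumed to be given rather than estimated by subspace clustering, and the edge rule (connect two views when they share a basis vector) is a hypothetical choice, since the abstract does not state how potential transitions are defined. All function names are illustrative.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def viewpoints(basis):
    """All unordered pairs of basis-vector indices; each pair is one 2D view."""
    return list(combinations(range(len(basis)), 2))


def project(points, basis, view):
    """Project each point onto the two basis vectors selected by `view`."""
    i, j = view
    return [(dot(p, basis[i]), dot(p, basis[j])) for p in points]


def transition_graph(views):
    """Assumed rule: two views are adjacent when they share one basis vector."""
    return {v: [w for w in views if w != v and set(v) & set(w)] for v in views}


# Toy input (assumed): an orthonormal basis of a 3D subspace of R^4.
basis = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
points = [(1.0, 2.0, 3.0, 4.0), (0.5, -1.0, 0.0, 2.0)]

views = viewpoints(basis)
graph = transition_graph(views)
coords = project(points, basis, views[0])
```

Each node of `graph` is one 2D scatterplot of the data. Under the assumed edge rule, moving along an edge swaps only one axis, which keeps successive views related.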