E. Laughton, V. Zala, A. Narayan, R. M. Kirby, D. Moxey. Fast Barycentric-Based Evaluation Over Spectral/hp Elements, arXiv preprint arXiv:2103.03594, 2021.
As the use of spectral/hp element methods, and high-order finite element methods in general, continues to spread, community efforts to create efficient, optimized algorithms associated with fundamental high-order operations have grown. Core tasks such as solution expansion evaluation at quadrature points, stiffness and mass matrix generation, and matrix assembly have received tremendous attention. With the expansion of the types of problems to which high-order methods are applied, and correspondingly the growth in types of numerical tasks accomplished through high-order methods, the number and types of these core operations broaden. This work focuses on solution expansion evaluation at arbitrary points within an element. This operation is core to many postprocessing applications such as evaluation of streamlines and pathlines, as well as to field projection techniques such as mortaring. We expand barycentric interpolation techniques developed on an interval to 2D (triangles and quadrilaterals) and 3D (tetrahedra, prisms, pyramids, and hexahedra) spectral/hp element methods. We provide efficient algorithms for their implementations, and demonstrate their effectiveness using the spectral/hp element library Nektar++.
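For orientation, the interval technique that this abstract builds on can be sketched in a few lines. The snippet below is a minimal illustration of the standard one-dimensional barycentric formula only; it is not the paper's 2D/3D algorithm and does not use Nektar++. The node choice (Chebyshev points of the second kind), the test polynomial, and all function names are assumptions made for the example.

```python
import math

def barycentric_weights(nodes):
    """Weights w_j = 1 / prod_{k != j} (x_j - x_k) for distinct nodes."""
    weights = []
    for j, xj in enumerate(nodes):
        prod = 1.0
        for k, xk in enumerate(nodes):
            if k != j:
                prod *= (xj - xk)
        weights.append(1.0 / prod)
    return weights

def barycentric_eval(nodes, weights, values, x):
    """Evaluate the interpolant at x with the second barycentric form:
    p(x) = sum_j (w_j f_j / (x - x_j)) / sum_j (w_j / (x - x_j))."""
    num = 0.0
    den = 0.0
    for xj, wj, fj in zip(nodes, weights, values):
        if x == xj:
            # x coincides with a node: return the nodal value directly
            return fj
        t = wj / (x - xj)
        num += t * fj
        den += t
    return num / den

# Illustrative setup (assumed): 9 Chebyshev points on [-1, 1] and a cubic.
n = 8
nodes = [math.cos(math.pi * j / n) for j in range(n + 1)]
weights = barycentric_weights(nodes)
values = [xj**3 - 2.0 * xj for xj in nodes]

# A cubic is reproduced exactly (up to rounding) by a degree-8 interpolant.
approx = barycentric_eval(nodes, weights, values, 0.3)
exact = 0.3**3 - 2.0 * 0.3
```

Once the weights are precomputed, each evaluation costs O(n) operations, which is what makes the formula attractive for evaluating at arbitrary points.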
Z. Li, H. Menon, K. Mohror, P. T. Bremer, Y. Livnat, V. Pascucci. Understanding a program's resiliency through error propagation, In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, pp. 362-373. 2021.
Aggressive technology scaling trends have worsened the transient fault problem in high-performance computing (HPC) systems. Some faults are benign, but others can lead to silent data corruption (SDC), a serious problem in which a fault introduces an error into an HPC simulation that is not readily detected. Due to the insidious nature of SDCs, researchers have worked to understand their impact on applications. Previous studies have relied on expensive fault injection campaigns with uniform sampling to provide overall SDC rates, but this solution does not provide any feedback on the code regions without samples.
S. Li, Z. Wang, A. Narayan, R. Kirby, S. Zhe. Meta-Learning with Adjoint Methods, arXiv preprint arXiv:2110.08432, 2021.
Model Agnostic Meta-Learning (MAML) is widely used to find a good initialization for a family of tasks. Despite its success, a critical challenge in MAML is to calculate the gradient w.r.t the initialization of a long training trajectory for the sampled tasks, because the computation graph can rapidly explode and the computational cost is very expensive. To address this problem, we propose Adjoint MAML (A-MAML). We view gradient descent in the inner optimization as the evolution of an Ordinary Differential Equation (ODE). To efficiently compute the gradient of the validation loss w.r.t the initialization, we use the adjoint method to construct a companion, backward ODE. To obtain the gradient w.r.t the initialization, we only need to run the standard ODE solver twice -- one is forward in time that evolves a long trajectory of gradient flow for the sampled task; the other is backward and solves the adjoint ODE. We need not create or expand any intermediate computational graphs, adopt aggressive approximations, or impose proximal regularizers in the training loss. Our approach is cheap, accurate, and adaptable to different trajectory lengths. We demonstrate the advantage of our approach in both synthetic and real-world meta-learning tasks.
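The forward/backward structure described above can be illustrated on a toy problem. The sketch below is not the authors' A-MAML code: it assumes a hypothetical scalar quadratic training loss and validation loss, a hand-written RK4 integrator, and a constant Hessian (so the backward solve does not need the forward trajectory, which a general implementation would). It only shows that one forward solve of the gradient flow plus one backward solve of the adjoint ODE reproduces the gradient of the validation loss with respect to the initialization.

```python
import math

def rk4(f, y0, t0, t1, steps):
    """Classical RK4 for a scalar ODE y' = f(t, y).
    Also integrates backward in time when t1 < t0 (negative step)."""
    h = (t1 - t0) / steps
    y, t = y0, t0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

# Hypothetical toy problem (all constants are assumptions for illustration):
#   training loss   L_train(theta) = 0.5 * A * (theta - C)^2
#   validation loss L_val(theta)   = 0.5 * (theta - D)^2
A, C, D, T = 2.0, 1.0, 0.25, 1.5

def train_grad(theta):
    return A * (theta - C)

def train_hess(theta):
    return A  # constant for a quadratic loss

def val_grad(theta):
    return theta - D

def meta_gradient(theta0, steps=200):
    """d L_val(theta(T)) / d theta0 via one forward and one backward solve."""
    # Forward solve: gradient flow d(theta)/dt = -grad L_train(theta).
    theta_T = rk4(lambda t, th: -train_grad(th), theta0, 0.0, T, steps)
    # Terminal condition for the adjoint variable.
    lam_T = val_grad(theta_T)
    # Backward solve from t = T to t = 0: d(lam)/dt = Hessian * lam.
    lam_0 = rk4(lambda t, lam: train_hess(theta_T) * lam, lam_T, T, 0.0, steps)
    return lam_0

# Closed form for this toy problem, used only as a check.
theta0 = 3.0
theta_T_exact = C + (theta0 - C) * math.exp(-A * T)
exact = (theta_T_exact - D) * math.exp(-A * T)
computed = meta_gradient(theta0)
```

No computation graph through the inner optimization is stored: memory use is independent of the trajectory length, which is the property the abstract emphasizes.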
Z. Liu, A. Narayan. On the computation of recurrence coefficients for univariate orthogonal polynomials, arXiv preprint arXiv:2101.11963, 2021.
Associated to a finite measure on the real line with finite moments are recurrence coefficients in a three-term formula for orthogonal polynomials with respect to this measure. These recurrence coefficients are frequently inputs to modern computational tools that facilitate evaluation and manipulation of polynomials with respect to the measure, and such tasks are foundational in numerical approximation and quadrature. Although the recurrence coefficients for classical measures are known explicitly, those for nonclassical measures must typically be numerically computed. We survey and review existing approaches for computing these recurrence coefficients for univariate orthogonal polynomial families and propose a novel "predictor-corrector" algorithm for a general class of continuous measures. We combine the predictor-corrector scheme with a stabilized Lanczos procedure for a new hybrid algorithm that computes recurrence coefficients for a fairly wide class of measures that can have both continuous and discrete parts. We evaluate the new algorithms against existing methods in terms of accuracy and efficiency.
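To make the three-term formula concrete, the sketch below computes recurrence coefficients for a discrete measure with the classical Stieltjes procedure. This is one of the standard existing approaches, not the predictor-corrector or hybrid algorithm proposed in the paper. The orthonormal convention x p_n = b_{n+1} p_{n+1} + a_n p_n + b_n p_{n-1}, the function name, and the midpoint discretization of the uniform measure on [-1, 1] are assumptions chosen for the example; for that measure the known Legendre values are a_n = 0 and b_n = n / sqrt(4 n^2 - 1).

```python
import math

def stieltjes(nodes, weights, n_coeffs):
    """Recurrence coefficients of the discrete measure sum_i w_i * delta(x_i).

    Returns (a, b) for orthonormal polynomials satisfying
        x p_n = b[n+1] p_{n+1} + a[n] p_n + b[n] p_{n-1},
    with b[0] = sqrt(total mass) and p_{-1} = 0.
    """
    b0 = math.sqrt(sum(weights))
    p_prev = [0.0] * len(nodes)          # p_{-1} at the nodes
    p_curr = [1.0 / b0] * len(nodes)     # p_0 at the nodes
    a, b = [], [b0]
    for _ in range(n_coeffs):
        # a_n = <x p_n, p_n>
        a_n = sum(w * x * p * p for x, w, p in zip(nodes, weights, p_curr))
        # Unnormalized next polynomial: (x - a_n) p_n - b_n p_{n-1}
        q = [(x - a_n) * pc - b[-1] * pp
             for x, pc, pp in zip(nodes, p_curr, p_prev)]
        # b_{n+1} is the norm of q
        b_next = math.sqrt(sum(w * v * v for w, v in zip(weights, q)))
        a.append(a_n)
        b.append(b_next)
        p_prev, p_curr = p_curr, [v / b_next for v in q]
    return a, b

# Assumed test measure: uniform probability measure on [-1, 1],
# approximated by M equally weighted midpoint nodes.
M = 2000
nodes = [-1.0 + (2 * i + 1) / M for i in range(M)]
weights = [1.0 / M] * M
a, b = stieltjes(nodes, weights, 5)

legendre_b = [n / math.sqrt(4 * n * n - 1) for n in range(1, 6)]
```

The Stieltjes procedure is simple but can lose accuracy at high degree, which is part of the motivation for stabilized alternatives such as the Lanczos-based methods mentioned above.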
C. Ly, C. Nizinski, C. Vachet, L. McDonald, T. Tasdizen. Learning to Estimate the Composition of a Mixture with Synthetic Data, In Microscopy and Microanalysis, 2021.
Identifying the precise composition of a mixed material is important in various applications. For instance, in nuclear forensics analysis, knowing the process history of unknown or illicitly trafficked nuclear materials when they are discovered is desirable to prevent future losses or theft of material from the processing facilities. Motivated by this open problem, we describe a novel machine learning approach to determine the composition of a mixture from SEM images. In machine learning, the training data distribution should reflect the distribution of the data the model is expected to make predictions for, which can pose a hurdle. However, a key advantage of our proposed framework is that it requires reference images of pure material samples only. Removing the need for reference samples of various mixed material compositions reduces the time and monetary cost associated with reference sample preparation and imaging. Moreover, our proposed framework can determine the composition of a mixture composed of chemically similar materials, whereas other elemental analysis tools such as powder X-ray diffraction (p-XRD) have trouble doing so. For example, p-XRD is unable to discern mixtures composed of triuranium octoxide (U3O8) synthesized from different synthetic routes such as uranyl peroxide (UO4) and ammonium diuranate (ADU). In contrast, our proposed framework can easily determine the composition of a uranium oxide mixture synthesized from different synthetic routes, as we illustrate in the experiments.
Determining the composition of a mixed material is an open problem that has attracted the interest of researchers in many fields. In our recent work, we proposed a novel approach to determine the composition of a mixed material using convolutional neural networks (CNNs). In machine learning, a model “learns” a specific task for which it is designed through data. Hence, obtaining a dataset of mixed materials is required to develop CNNs for the task of estimating the composition. However, the proposed method instead creates the synthetic data of mixed materials generated from using only images of pure materials present in those mixtures. Thus, it eliminates the prohibitive cost and tedious process of collecting images of mixed materials. The motivation for this study is to provide mathematical details of the proposed approach in addition to extensive experiments and analyses. We examine the approach on two datasets to demonstrate the ease of extending the proposed approach to any mixtures. We perform experiments to demonstrate that the proposed approach can accurately determine the presence of the materials, and sufficiently estimate the precise composition of a mixed material. Moreover, we provide analyses to strengthen the validation and benefits of the proposed approach.
N. Marshak, P. Grosset, A. Knoll, J. P. Ahrens, C. R. Johnson. Evaluation of GPU Volume Rendering in PyTorch Using Data-Parallel Primitives, In Eurographics Symposium on Parallel Graphics and Visualization (EGPGV), 2021.
Data-parallel programming (DPP) has attracted considerable interest from the visualization community, fostering major software initiatives such as VTK-m. However, there has been relatively little recent investigation of data-parallel APIs in higher-level languages such as Python, which could help developers sidestep the need for low-level application programming in C++ and CUDA. Moreover, machine learning frameworks exposing data-parallel primitives, such as PyTorch and TensorFlow, have exploded in popularity, making them attractive platforms for parallel visualization and data analysis. In this work, we benchmark data-parallel primitives in PyTorch, and investigate its application to GPU volume rendering using two distinct DPP formulations: a parallel scan and reduce over the entire volume, and repeated application of data-parallel operators to an array of rays. We find that most relevant DPP primitives exhibit performance similar to a native CUDA library. However, our volume rendering implementation reveals that PyTorch is limited in expressiveness when compared to other DPP APIs. Furthermore, while render times are sufficient for an early "proof of concept", memory usage acutely limits scalability.
T. McDonald, R. Shrestha, X. Yi, H. Bhatia, D. Chen, D. Goswami, V. Pascucci, T. Turbyville, P‐T Bremer. Leveraging Topological Events in Tracking Graphs for Understanding Particle Diffusion, In Computer Graphics Forum, Vol. 40, No. 3, pp. 251-262. 2021.
Single particle tracking (SPT) of fluorescent molecules provides significant insights into the diffusion and relative motion of tagged proteins and other structures of interest in biology. However, despite the latest advances in high-resolution microscopy, individual particles are typically not distinguished from clusters of particles. This lack of resolution obscures potential evidence for how merging and splitting of particles affect their diffusion and any implications on the biological environment. The particle tracks are typically decomposed into individual segments at observed merge and split events, and analysis is performed without knowing the true count of particles in the resulting segments. Here, we address the challenges in analyzing particle tracks in the context of cancer biology. In particular, we study the tracks of KRAS protein, which is implicated in nearly 20% of all human cancers, and whose clustering and aggregation have been linked to the signaling pathway leading to uncontrolled cell growth. We present a new analysis approach for particle tracks by representing them as tracking graphs and using topological events (merging and splitting) to disambiguate the tracks. Using this analysis, we infer a lower bound on the count of particles as they cluster and create conditional distributions of diffusion speeds before and after merge and split events. Using thousands of time-steps of simulated and in-vitro SPT data, we demonstrate the efficacy of our method, as it offers biologists a new, detailed look into the relationship between KRAS clustering and diffusion speeds.
R. A. Moore, A. Narayan. Adaptive Density Tracking by Quadrature for Stochastic Differential Equations, arXiv preprint arXiv:2105.08148, 2021.
Density tracking by quadrature (DTQ) is a numerical procedure for computing solutions to Fokker-Planck equations that describe probability densities for stochastic differential equations (SDEs). In this paper, we extend existing tensorized DTQ procedures by utilizing a flexible quadrature rule that allows for unstructured, adaptive meshes. We propose and describe the procedure for arbitrary dimensions, and demonstrate that the resulting adaptive procedure is significantly more efficient than a tensorized approach. Although we consider two-dimensional examples, all our computational procedures are extendable to higher dimensional problems.
N. Morrical, J. Tremblay, Y. Lin, S. Tyree, S. Birchfield, V. Pascucci, I. Wald. NViSII: A Scriptable Tool for Photorealistic Image Generation, arXiv preprint arXiv:2105.13962, 2021.
We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth maps, normal maps, material properties, and optical flow vectors, can also be generated. In this work, we discuss design goals, architecture, and performance. We demonstrate the use of data generated by path tracing for training an object detector and pose estimator, showing improved performance in sim-to-real transfer in situations that are difficult for traditional raster-based renderers. We offer this tool as an easy-to-use, performant, high-quality renderer for advancing research in synthetic data generation and deep learning.
A. Narayan, L. Yan, T. Zhou. Optimal design for kernel interpolation: Applications to uncertainty quantification, In Journal of Computational Physics, Vol. 430, Academic Press, pp. 110094. 2021.
The paper is concerned with classic kernel interpolation methods, in addition to approximation methods that are augmented by gradient measurements. To apply kernel interpolation using radial basis functions (RBFs) in a stable way, we propose a type of quasi-optimal interpolation points, searching from a large set of candidate points, using a procedure similar to designing Fekete points or power function maximizing points that uses pivots from a Cholesky decomposition. The proposed quasi-optimal points result in a smaller condition number, and thus mitigate the instability of the interpolation procedure when the number of points becomes large. Applications to parametric uncertainty quantification are presented, and it is shown that the proposed interpolation method can outperform sparse grid methods in many interesting cases. We also demonstrate the new procedure can be applied to constructing gradient-enhanced Gaussian process emulators.
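The idea of choosing interpolation points through Cholesky pivots can be sketched as a greedy selection. The snippet below is a generic pivoted-Cholesky point selection, which at each step picks the candidate where the remaining kernel variance (the squared power function) is largest; it is offered as an illustration of that family of procedures and may differ from the paper's quasi-optimal design. The Gaussian kernel, its scale, the candidate grid, and the function names are assumptions made for the example.

```python
import math

def gaussian_kernel(x, y, scale=0.5):
    """Gaussian RBF kernel on the real line (scale is an assumed parameter)."""
    return math.exp(-((x - y) / scale) ** 2)

def select_points(candidates, kernel, n_points):
    """Greedy point selection via a partial pivoted Cholesky factorization
    of the kernel matrix on the candidate set."""
    m = len(candidates)
    # Residual diagonal: squared power function at each candidate.
    diag = [kernel(x, x) for x in candidates]
    factors = []   # columns of the partial Cholesky factor
    chosen = []    # indices of selected candidates, in selection order
    for _ in range(n_points):
        # Pivot: candidate with the largest remaining residual.
        pivot = max(range(m), key=lambda i: diag[i])
        root = math.sqrt(diag[pivot])
        col = []
        for i in range(m):
            v = kernel(candidates[i], candidates[pivot])
            v -= sum(f[i] * f[pivot] for f in factors)
            col.append(v / root)
        factors.append(col)
        chosen.append(pivot)
        # Downdate the residual diagonal; clamp tiny negative rounding errors.
        diag = [max(d - c * c, 0.0) for d, c in zip(diag, col)]
    return [candidates[i] for i in chosen]

# Assumed candidate set: 101 equispaced points on [-1, 1].
candidates = [-1.0 + 2.0 * i / 100 for i in range(101)]
points = select_points(candidates, gaussian_kernel, 5)
```

Because each new point is placed where the current interpolant is least constrained, the selected points tend to spread out, which is what keeps the kernel matrix better conditioned than it would be for arbitrary points.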
Built environments can affect health, but data in many geographic areas are limited. We used a big data source to create national indicators of neighborhood quality and assess their associations with health.
Computational Image Techniques for Analyzing Lanthanide and Actinide Morphology, In Rare Earth Elements and Actinides: Progress in Computational Science Applications, Ch. 6, pp. 133-155. 2021.
This chapter introduces computational image analysis techniques and how they may be used for material characterization as it pertains to lanthanide and actinide chemistry. Specifically, the underlying theory behind particle segmentation, texture analysis, and convolutional neural networks for material characterization is briefly summarized. The variety of particle segmentation techniques that have been used to effectively measure the size and shape of morphological features from scanning electron microscope images will be discussed. In addition, the extraction of image texture features via gray-level co-occurrence matrices and angle measurement techniques is described and demonstrated. To conclude, the application of convolutional neural networks to lanthanide and actinide materials science challenges is described, with applications for image classification, feature extraction, and predicting a material's morphology discussed.
Quantifying user performance with metrics such as time and accuracy does not show the whole picture when researchers evaluate complex, interactive visualization tools. In such systems, performance is often influenced by different analysis strategies that statistical analysis methods cannot account for. To remedy this lack of nuance, we propose a novel analysis methodology for evaluating complex interactive visualizations at scale. We implement our analysis methods in reVISit, which enables analysts to explore participant interaction performance metrics and responses in the context of users' analysis strategies. Replays of participant sessions can aid in identifying usability problems during pilot studies and make individual analysis processes salient. To demonstrate the applicability of reVISit to visualization studies, we analyze participant data from two published crowdsourced studies. Our findings show that reVISit can be used to reveal and describe novel interaction patterns, to analyze performance differences between different analysis strategies, and to validate or challenge design decisions.
A. Nouri, P.E. Davis, P. Subedi, M. Parashar. Scalable Graph Embedding Learning On A Single GPU, arXiv preprint arXiv:2110.06991, 2021.
Graph embedding techniques have attracted growing interest since they convert graph data into a continuous, low-dimensional space. Effective graph analytics provide users a deeper understanding of what is behind the data and thus can benefit a variety of machine learning tasks. With the current scale of real-world applications, most graph analytic methods suffer high computation and space costs. These methods and systems can process a network with thousands to a few million nodes. However, scaling to large-scale networks remains a challenge. The complexity of training a graph embedding system requires the use of existing accelerators such as GPUs. In this paper, we introduce a hybrid CPU-GPU framework that addresses the challenges of learning embeddings of large-scale graphs. The performance of our method is compared qualitatively and quantitatively with the existing embedding systems on common benchmarks. We also show that our system can scale training to datasets an order of magnitude larger than a single machine's total memory capacity. The effectiveness of the learned embedding is evaluated within multiple downstream applications. The experimental results indicate the effectiveness of the learned embedding in terms of performance and accuracy.
A. Nouri, P.E. Davis, P. Subedi, M. Parashar. Exploring the Role of Machine Learning in Scientific Workflows: Opportunities and Challenges, arXiv preprint arXiv:2110.13999, 2021.
In this survey, we discuss the challenges of executing scientific workflows as well as existing Machine Learning (ML) techniques to alleviate those challenges. We provide the context and motivation for applying ML to each step of the execution of these workflows. Furthermore, we provide recommendations on how to extend ML techniques to unresolved challenges in the execution of scientific workflows. Moreover, we discuss the possibility of using ML techniques for in-situ operations. We explore the challenges of in-situ workflows and provide suggestions for improving the performance of their execution using ML techniques.
M. Penwarden, S. Zhe, A. Narayan, R. M. Kirby. Multifidelity Modeling for Physics-Informed Neural Networks (PINNs), arXiv preprint arXiv:2106.13361, 2021.
Multifidelity simulation methodologies are often used in an attempt to judiciously combine low-fidelity and high-fidelity simulation results in an accuracy-increasing, cost-saving way. Candidates for this approach are simulation methodologies for which there are fidelity differences connected with significant computational cost differences. Physics-informed Neural Networks (PINNs) are candidates for these types of approaches due to the significant difference in training times required when different fidelities (expressed in terms of architecture width and depth as well as optimization criteria) are employed. In this paper, we propose a particular multifidelity approach applied to PINNs that exploits low-rank structure. We demonstrate that width, depth, and optimization criteria can be used as parameters related to model fidelity, and show numerical justification of cost differences in training due to fidelity parameter choices. We test our multifidelity scheme on various canonical forward PDE models that have been presented in the emerging PINNs literature.
M. Penwarden, S. Zhe, A. Narayan, R. M. Kirby. Physics-Informed Neural Networks (PINNs) for Parameterized PDEs: A Metalearning Approach, arXiv preprint arXiv:2110.13361, 2021.
Physics-informed neural networks (PINNs) as a means of discretizing partial differential equations (PDEs) are garnering much attention in the Computational Science and Engineering (CS&E) world. At least two challenges exist for PINNs at present: an understanding of accuracy and convergence characteristics with respect to tunable parameters and identification of optimization strategies that make PINNs as efficient as other computational science tools. The cost of PINNs training remains a major challenge of Physics-informed Machine Learning (PiML) – and, in fact, machine learning (ML) in general. This paper is meant to move towards addressing the latter through the study of PINNs for parameterized PDEs. Following the ML world, we introduce metalearning of PINNs for parameterized PDEs. By introducing metalearning and transfer learning concepts, we can greatly accelerate the PINNs optimization process. We present a survey of model-agnostic metalearning, and then discuss our model-aware metalearning applied to PINNs. We provide theoretically motivated and empirically backed assumptions that make our metalearning approach possible. We then test our approach on various canonical forward parameterized PDEs that have been presented in the emerging PINNs literature.
R. Pulch, A. Narayan, T. Stykel. Sensitivity analysis of random linear differential–algebraic equations using system norms, In Journal of Computational and Applied Mathematics, North-Holland, pp. 113666. 2021.
We consider linear dynamical systems composed of differential–algebraic equations (DAEs), where a quantity of interest (QoI) is assigned as output. Physical parameters of a system are modelled as random variables to quantify uncertainty, and we investigate a variance-based sensitivity analysis of the random QoI. Based on expansions via generalised polynomial chaos, the stochastic Galerkin method yields a new deterministic system of DAEs of high dimension. We define sensitivity measures by system norms, i.e., the H∞-norm of the transfer function associated with the Galerkin system for different combinations of outputs. To ameliorate the enormous computational effort required to compute norms of high-dimensional systems, we apply balanced truncation, a particular method of model order reduction (MOR), to obtain a low-dimensional linear dynamical system that produces approximations of system norms …
E. Qian, J.M. Tabeart, C. Beattie, S. Gugercin, J. Jiang, P. Kramer, A. Narayan. Model Reduction of Linear Dynamical Systems via Balancing for Bayesian Inference, arXiv preprint arXiv:2111.13246, 2021.
We consider the Bayesian approach to the linear Gaussian inference problem of inferring the initial condition of a linear dynamical system from noisy output measurements taken after the initial time. In practical applications, the large dimension of the dynamical system state poses a computational obstacle to computing the exact posterior distribution. Model reduction offers a variety of computational tools that seek to reduce this computational burden. In particular, balanced truncation is a system-theoretic approach to model reduction which obtains an efficient reduced-dimension dynamical system by projecting the system operators onto state directions which trade off the reachability and observability of state directions as expressed through the associated Gramians. We introduce Gramian definitions relevant to the inference setting and propose a balanced truncation approach based on these inference Gramians that yields a reduced dynamical system that can be used to cheaply approximate the posterior mean and covariance. Our definitions exploit natural connections between (i) the reachability Gramian and the prior covariance and (ii) the observability Gramian and the Fisher information. The resulting reduced model then inherits stability properties and error bounds from system-theoretic considerations, and in some settings yields an optimal posterior covariance approximation. Numerical demonstrations on two benchmark problems in model reduction show that our method can yield near-optimal posterior covariance approximations with order-of-magnitude state dimension reduction.