V. Keshavarzzadeh, R. M. Kirby, A. Narayan. Multilevel Designed Quadrature for Partial Differential Equations with Random Inputs, In SIAM Journal on Scientific Computing, Vol. 43, No. 2, Society for Industrial and Applied Mathematics, pp. A1412-A1440. 2021.
We introduce a numerical method, multilevel designed quadrature for computing the statistical solution of partial differential equations with random input data. Similar to multilevel Monte Carlo methods, our method relies on hierarchical spatial approximations in addition to a parametric/stochastic sampling strategy. A key ingredient in multilevel methods is the relationship between the spatial accuracy at each level and the number of stochastic samples required to achieve that accuracy. Our sampling is based on flexible quadrature points that are designed for a prescribed accuracy, which can yield less overall computational cost compared to alternative multilevel methods. We propose a constrained optimization problem that determines the number of samples to balance the approximation error with the computational budget. We further show that the optimization problem is convex and derive analytic formulas for the optimal number of points at each level. We validate the theoretical estimates and the performance of our multilevel method via numerical examples on a linear elasticity and a steady state heat diffusion problem.
Robust topology optimization with low rank approximation using artificial neural networks, In Computational Mechanics, 2021.
We present a low rank approximation approach for topology optimization of parametrized linear elastic structures. The parametrization is considered on loading and stiffness of the structure. The low rank approximation is achieved by identifying a parametric connection among coarse finite element models of the structure (associated with different design iterates) and is used to inform the high fidelity finite element analysis. We build an Artificial Neural Network (ANN) map between low resolution design iterates and their corresponding interpolative coefficients (obtained from low rank approximations) and use this surrogate to perform high resolution parametric topology optimization. We demonstrate our approach on robust topology optimization with compliance constraints/objective functions and develop error bounds for the the parametric compliance computations. We verify these parametric computations with more challenging quantities of interest such as the p-norm of von Mises stress. To conclude, we use our approach on a 3D robust topology optimization and show significant reduction in computational cost via quantitative measures.
V. Keshavarzzadeh, S. Zhe, R.M. Kirby, A. Narayan. GP-HMAT: Scalable, $O(n\log (n)) $ Gaussian Process Regression with Hierarchical Low-Rank Matrices, Subtitled arXiv preprint arXiv:2201.00888, 2021.
A Gaussian process (GP) is a powerful and widely used regression technique. The main building block of a GP regression is the covariance kernel, which characterizes the relationship between pairs in the random field. The optimization to find the optimal kernel, however, requires several large-scale and often unstructured matrix inversions. We tackle this challenge by introducing a hierarchical matrix approach, named HMAT, which effectively decomposes the matrix structure, in a recursive manner, into significantly smaller matrices where a direct approach could be used for inversion. Our matrix partitioning uses a particular aggregation strategy for data points, which promotes the low-rank structure of off-diagonal blocks in the hierarchical kernel matrix. We employ a randomized linear algebra method for matrix reduction on the low-rank off-diagonal blocks without factorizing a large matrix. We provide analytical error and cost estimates for the inversion of the matrix, investigate them empirically with numerical computations, and demonstrate the application of our approach on three numerical examples involving GP regression for engineering problems and a large-scale real dataset. We provide the computer implementation of GP-HMAT, HMAT adapted for GP likelihood and derivative computations, and the implementation of the last numerical example on a real dataset. We demonstrate superior scalability of the HMAT approach compared to built-in operator in MATLAB for large-scale linear solves Ax=y via a repeatable and verifiable empirical study. An extension to hierarchical semiseparable (HSS) matrices is discussed as future research.
R. Kirby, K. Nottingham, R. Roy, S. Godil, B. Catanzaro. Guiding Global Placement With Reinforcement Learning, Subtitled arXiv preprint arXiv:2109.02631, 2021.
Recent advances in GPU accelerated global and detail placement have reduced the time to solution by an order of magnitude. This advancement allows us to leverage data driven optimization (such as Reinforcement Learning) in an effort to improve the final quality of placement results. In this work we augment state-of-the-art, force-based global placement solvers with a reinforcement learning agent trained to improve the final detail placed Half Perimeter Wire Length (HPWL). We propose novel control schemes with either global or localized control of the placement process. We then train reinforcement learning agents to use these controls to guide placement to improved solutions. In both cases, the augmented optimizer finds improved placement solutions. Our trained agents achieve an average 1% improvement in final detail place HPWL across a range of academic benchmarks and more than 1% in global place HPWL on real industry designs.
We present a method for the browsing of hierarchical 3D models in which we combine the typical navigation of hierarchical structures in a 2D environment---using clicks on nodes, links, or icons---with a 3D spatial data visualization. Our approach is motivated by large molecular models, for which the traditional single-scale navigational metaphors are not suitable. Multi-scale phenomena, e. g., in astronomy or geography, are complex to navigate due to their large data spaces and multi-level organization. Models from structural biology are in addition also densely crowded in space and scale. Cutaways are needed to show individual model subparts. The camera has to support exploration on the level of a whole virus, as well as on the level of a small molecule. We address these challenges by employing HyperLabels: active labels that---in addition to their annotational role---also support user interaction. Clicks on HyperLabels select the next structure to be explored. Then, we adjust the visualization to showcase the inner composition of the selected subpart and enable further exploration. Finally, we use a breadcrumbs panel for orientation and as a mechanism to traverse upwards in the model hierarchy. We demonstrate our concept of hierarchical 3D model browsing using two exemplary models from meso-scale biology.
A.S. Krishnapriyan, A. Gholami, S. Zhe, R.M. Kirby, M.W. Mahoney. Characterizing possible failure modes in physics-informed neural networks, Subtitled arXiv preprint arXiv:2109.01050, 2021.
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models. The typical approach is to incorporate physical domain knowledge as soft constraints on an empirical loss function and use existing machine learning methodologies to train the model. We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs. In particular, we analyze several distinct situations of widespread physical interest, including learning differential equations with convection, reaction, and diffusion operators. We provide evidence that the soft regularization in PINNs, which involves differential operators, can introduce a number of subtle problems, including making the problem ill-conditioned. Importantly, we show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize. We then describe two promising solutions to address these failure modes. The first approach is to use curriculum regularization, where the PINN's loss term starts from a simple PDE regularization, and becomes progressively more complex as the NN gets trained. The second approach is to pose the problem as a sequence-to-sequence learning task, rather than learning to predict the entire space-time at once. Extensive testing shows that we can achieve up to 1-2 orders of magnitude lower error with these methods as compared to regular PINN training.
L. Kühnel, T. Fletcher, S. Joshi, S. Sommer. Latent Space Geometric Statistics, In Pattern Recognition. ICPR International Workshops and Challenges: Virtual Event, January 10–15, 2021, Proceedings, Part VI, Springer International Publishing, pp. 163-178. 2021.
Deep generative models, e.g., variational autoencoders and generative adversarial networks, result in latent representation of observed data. The low dimensionality of the latent space provides an ideal setting for analysing high-dimensional data that would otherwise often be infeasible to handle statistically. The linear Euclidean geometry of the high-dimensional data space pulls back to a nonlinear Riemannian geometry on latent space where classical linear statistical techniques are no longer applicable. We show how analysis of data in their latent space representation can be performed using techniques from the field of geometric statistics. Geometric statistics provide generalisations of Euclidean statistical notions including means, principal component analysis, and maximum likelihood estimation of parametric distributions. Introduction to estimation procedures on latent space are considered, and the …
D. Lange, E. Polanco, R. Judson-Torres, T. Zangle, A. Lex. Loon: Using Exemplars to Visualize Large Scale Microscopy Data, In OSF Preprints, 2021.
Which drug is most promising for a cancer patient? This is a question a new microscopy-based approach for measuring the mass of individual cancer cells treated with different drugs promises to answer in only a few hours. However, the analysis pipeline for extracting data from these images is still far from complete automation: human intervention is necessary for quality control for preprocessing steps such as segmentation, to adjust filters, and remove noise, and for the analysis of the result. To address this workflow, we developed Loon, a visualization tool for analyzing drug screening data based on quantitative phase microscopy imaging. Loon visualizes both, derived data such as growth rates, and imaging data. Since the images are collected automatically at a large scale, manual inspection of images and segmentations is infeasible. However, reviewing representative samples of cells is essential, both for quality control and for data analysis. We introduce a new approach of choosing and visualizing representative exemplar cells that retain a close connection to the low-level data. By tightly integrating the derived data visualization capabilities with the novel exemplar visualization and providing selection and filtering capabilities, Loon is well suited for making decisions about which drugs are suitable for a specific patient.
R. B. Lanfredi, M. Zhang, W. F. Auffermann, J. Chan, P. T. Duong, V. Srikumar, T. Drew, J. D. Schroeder, T. Tasdizen. REFLACX, a dataset of reports and eye-tracking data for localization of abnormalities in chest x-rays, Subtitled arXiv:2109.14187, 2021.
Deep learning has shown recent success in classifying anomalies in chest x-rays, but datasets are still small compared to natural image datasets. Supervision of abnormality localization has been shown to improve trained models, partially compensating for dataset sizes. However, explicitly labeling these anomalies requires an expert and is very time-consuming. We propose a method for collecting implicit localization data using an eye tracker to capture gaze locations and a microphone to capture a dictation of a report, imitating the setup of a reading room, and potentially scalable for large datasets. The resulting REFLACX (Reports and Eye-Tracking Data for Localization of Abnormalities in Chest X-rays) dataset was labeled by five radiologists and contains 3,032 synchronized sets of eye-tracking data and timestamped report transcriptions. We also provide bounding boxes around lungs and heart and validation labels consisting of ellipses localizing abnormalities and image-level labels. Furthermore, a small subset of the data contains readings from all radiologists, allowing for the calculation of inter-rater scores.
R.B. Lanfredi, A. Arora, T. Drew, J.D. Schroeder, T. Tasdizen. Comparing radiologists’ gaze and saliency maps generated by interpretability methods for chest x-rays, Subtitled arXiv:2112.11716v1, 2021.
The interpretability of medical image analysis models is considered a key research field. We use a dataset of eye-tracking data from five radiologists to compare the outputs of interpretability methods against the heatmaps representing where radiologists looked. We conduct a class-independent analysis of the saliency maps generated by two methods selected from the literature: Grad-CAM and attention maps from an attention-gated model. For the comparison, we use shuffled metrics, which avoid biases from fixation locations. We achieve scores comparable to an interobserver baseline in one shuffled metric, highlighting the potential of saliency maps from Grad-CAM to mimic a radiologist’s attention over an image. We also divide the dataset into subsets to evaluate in which cases similarities are higher.
E. Laughton, V. Zala, A. Narayan, R. M. Kirby, D. Moxey. Fast Barycentric-Based Evaluation Over Spectral/hp Elements, Subtitled arXiv preprint arXiv:2103.03594, 2021.
As the use of spectral/hp element methods, and high-order finite element methods in general, continues to spread, community efforts to create efficient, optimized algorithms associated with fundamental high-order operations have grown. Core tasks such as solution expansion evaluation at quadrature points, stiffness and mass matrix generation, and matrix assembly have received tremendousattention. With the expansion of the types of problems to which high-order methods are applied, and correspondingly the growth in types of numerical tasks accomplished through high-order methods, the number and types of these core operations broaden. This work focuses on solution expansion evaluation at arbitrary points within an element. This operation is core to many postprocessing applications such as evaluation of streamlines and pathlines, as well as to field projection techniques such as mortaring. We expand barycentric interpolation techniques developed on an interval to 2D (triangles and quadrilaterals) and 3D (tetrahedra, prisms, pyramids, and hexahedra) spectral/hp element methods. We provide efficient algorithms for their implementations, and demonstrate their effectiveness using the spectral/hp element library Nektar++.
Z. Li, H. Menon, K. Mohror, P. T. Bremer, Y. Livant, V. Pascucci. Understanding a program's resiliency through error propagation, In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, pp. 362-373. 2021.
Aggressive technology scaling trends have worsened the transient fault problem in high-performance computing (HPC) systems. Some faults are benign, but others can lead to silent data corruption (SDC), which represents a serious problem; a fault introducing an error that is not readily detected nto an HPC simulation. Due to the insidious nature of SDCs, researchers have worked to understand their impact on applications. Previous studies have relied on expensive fault injection campaigns with uniform sampling to provide overall SDC rates, but this solution does not provide any feedback on the code regions without samples.
S. Li, Z. Wang, A. Narayan, R. Kirby, S. Zhe. Meta-Learning with Adjoint Methods, Subtitled arXiv preprint arXiv:2110.08432, 2021.
Model Agnostic Meta-Learning (MAML) is widely used to find a good initialization for a family of tasks. Despite its success, a critical challenge in MAML is to calculate the gradient w.r.t the initialization of a long training trajectory for the sampled tasks, because the computation graph can rapidly explode and the computational cost is very expensive. To address this problem, we propose Adjoint MAML (A-MAML). We view gradient descent in the inner optimization as the evolution of an Ordinary Differential Equation (ODE). To efficiently compute the gradient of the validation loss w.r.t the initialization, we use the adjoint method to construct a companion, backward ODE. To obtain the gradient w.r.t the initialization, we only need to run the standard ODE solver twice -- one is forward in time that evolves a long trajectory of gradient flow for the sampled task; the other is backward and solves the adjoint ODE. We need not create or expand any intermediate computational graphs, adopt aggressive approximations, or impose proximal regularizers in the training loss. Our approach is cheap, accurate, and adaptable to different trajectory lengths. We demonstrate the advantage of our approach in both synthetic and real-world meta-learning tasks.
Z. Liu, A. Narayan. On the computation of recurrence coefficients for univariate orthogonal polynomials, Subtitled arXiv preprint arXiv:2101.11963, 2021.
Associated to a finite measure on the real line with finite moments are recurrence coefficients in a three-term formula for orthogonal polynomials with respect to this measure. These recurrence coefficients are frequently inputs to modern computational tools that facilitate evaluation and manipulation of polynomials with respect to the measure, and such tasks are foundational in numerical approximation and quadrature. Although the recurrence coefficients for classical measures are known explicitly, those for nonclassical measures must typically be numerically computed. We survey and review existing approaches for computing these recurrence coefficients for univariate orthogonal polynomial families and propose a novel" predictor-corrector" algorithm for a general class of continuous measures. We combine the predictor-corrector scheme with a stabilized Lanczos procedure for a new hybrid algorithm that computes recurrence coefficients for a fairly wide class of measures that can have both continuous and discrete parts. We evaluate the new algorithms against existing methods in terms of accuracy and efficiency.
C. Ly, C. Nizinski, C. Vachet, L. McDonald, T. Tasdizen. Learning to Estimate the Composition of a Mixture with Synthetic Data, In Microscopy and Microanalysis, 2021.
Identifying the precise composition of a mixed material is important in various applications. For instance, in nuclear forensics analysis, knowing the process history of unknown or illicitly trafficked nuclear materials when they are discovered is desirable to prevent future losses or theft of material from the processing facilities. Motivated by this open problem, we describe a novel machine learning approach to determine the composition of a mixture from SEM images. In machine learning, the training data distribution should reflect the distribution of the data the model is expected to make predictions for, which can pose a hurdle. However, a key advantage of our proposed framework is that it requires reference images of pure material samples only. Removing the need for reference samples of various mixed material compositions reduces the time and monetary cost associated with reference sample preparation and imaging. Moreover, our proposed framework can determine the composition of a mixture composed of chemically similar materials, whereas other elemental analysis tools such as powder X-ray diffraction (p-XRD) have trouble doing so. For example, p-XRD is unable to discern mixtures composed of triuranium octoxide (U3O8) synthesized from different synthetic routes such as uranyl peroxide (UO4) and ammonium diuranate (ADU) . In contrast, our proposed framework can easily determine the composition of uranium oxides mixture synthesized from different synthetic routes, as we illustrate in the experiments.
Determining the composition of a mixed material is an open problem that has attracted the interest of researchers in many fields. In our recent work, we proposed a novel approach to determine the composition of a mixed material using convolutional neural networks (CNNs). In machine learning, a model “learns” a specific task for which it is designed through data. Hence, obtaining a dataset of mixed materials is required to develop CNNs for the task of estimating the composition. However, the proposed method instead creates the synthetic data of mixed materials generated from using only images of pure materials present in those mixtures. Thus, it eliminates the prohibitive cost and tedious process of collecting images of mixed materials. The motivation for this study is to provide mathematical details of the proposed approach in addition to extensive experiments and analyses. We examine the approach on two datasets to demonstrate the ease of extending the proposed approach to any mixtures. We perform experiments to demonstrate that the proposed approach can accurately determine the presence of the materials, and sufficiently estimate the precise composition of a mixed material. Moreover, we provide analyses to strengthen the validation and benefits of the proposed approach.
N. Marshak, P. Grosset, A. Knoll, J. P. Ahrens, C. R. Johnson. Evaluation of GPU Volume Rendering in PyTorch Using Data-Parallel Primitives, In Eurographics Symposium on Parallel Graphics and Visualization (EGPGV), 2021.
Data-parallel programming (DPP) has attracted considerable interest from the visualization community, fostering major software initiatives such as VTK-m. However, there has been relatively little recent investigation of data-parallel APIs in higherlevel languages such as Python, which could help developers sidestep the need for low-level application programming in C++ and CUDA. Moreover, machine learning frameworks exposing data-parallel primitives, such as PyTorch and TensorFlow, have exploded in popularity, making them attractive platforms for parallel visualization and data analysis. In this work, we benchmark data-parallel primitives in PyTorch, and investigate its application to GPU volume rendering using two distinct DPP formulations: a parallel scan and reduce over the entire volume, and repeated application of data-parallel operators to an array of rays. We find that most relevant DPP primitives exhibit performance similar to a native CUDA library. However, our volume rendering implementation reveals that PyTorch is limited in expressiveness when compared to other DPP APIs. Furthermore, while render times are sufficient for an early ''proof of concept'', memory usage acutely limits scalability.
T. McDonald, R. Shrestha, X. Yi, H. Bhatia, D. Chen, D. Goswami, V. Pascucci, T. Turbyville, P‐T Bremer. Leveraging Topological Events in Tracking Graphs for Understanding Particle Diffusion, In Computer Graphics Forum, Vol. 40, No. 3, pp. 251-262. 2021.
Single particle tracking (SPT) of fluorescent molecules provides significant insights into the diffusion and relative motion of tagged proteins and other structures of interest in biology. However, despite the latest advances in high-resolution microscopy, individual particles are typically not distinguished from clusters of particles. This lack of resolution obscures potential evidence for how merging and splitting of particles affect their diffusion and any implications on the biological environment. The particle tracks are typically decomposed into individual segments at observed merge and split events, and analysis is performed without knowing the true count of particles in the resulting segments. Here, we address the challenges in analyzing particle tracks in the context of cancer biology. In particular, we study the tracks of KRAS protein, which is implicated in nearly 20% of all human cancers, and whose clustering and aggregation have been linked to the signaling pathway leading to uncontrolled cell growth. We present a new analysis approach for particle tracks by representing them as tracking graphs and using topological events – merging and splitting, to disambiguate the tracks. Using this analysis, we infer a lower bound on the count of particles as they cluster and create conditional distributions of diffusion speeds before and after merge and split events. Using thousands of time-steps of simulated and in-vitro SPT data, we demonstrate the efficacy of our method, as it offers the biologists a new, detailed look into the relationship between KRAS clustering and diffusion speeds.
R. A. Moore, A. Narayan. Adaptive Density Tracking by Quadrature for Stochastic Differential Equations, Subtitled arXiv preprint arXiv:2105.08148, 2021.
Density tracking by quadrature (DTQ) is a numerical procedure for computing solutions to Fokker-Planck equations that describe probability densities for stochastic differential equations (SDEs). In this paper, we extend upon existing tensorized DTQ procedures by utilizing a flexible quadrature rule that allows for unstructured, adaptive meshes. We propose and describe the procedure for -dimensions, and demonstrate that the resulting adaptive procedure is significantly more efficient than a tensorized approach. Although we consider two-dimensional examples, all our computational procedures are extendable to higher dimensional problems.
N. Morrical, J. Tremblay, Y. Lin, S. Tyree, S. Birchfield, V. Pascucci, I. Wald. NViSII: A Scriptable Tool for Photorealistic Image Generation, Subtitled arXiv preprint arXiv:2105.13962, 2021.
We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth maps, normal maps, material properties, and optical flow vectors, can also be generated. In this work, we discuss design goals, architecture, and performance. We demonstrate the use of data generated by path tracing for training an object detector and pose estimator, showing improved performance in sim-to-real transfer in situations that are difficult for traditional raster-based renderers. We offer this tool as an easy-to-use, performant, high-quality renderer for advancing research in synthetic data generation and deep learning.