R. Pulch, A. Narayan, T. Stykel. Sensitivity analysis of random linear differential–algebraic equations using system norms, In Journal of Computational and Applied Mathematics, North-Holland, pp. 113666. 2021.
We consider linear dynamical systems composed of differential–algebraic equations (DAEs), where a quantity of interest (QoI) is assigned as output. Physical parameters of a system are modelled as random variables to quantify uncertainty, and we investigate a variance-based sensitivity analysis of the random QoI. Based on expansions via generalised polynomial chaos, the stochastic Galerkin method yields a new deterministic system of DAEs of high dimension. We define sensitivity measures by system norms, i.e., the H∞-norm of the transfer function associated with the Galerkin system for different combinations of outputs. To ameliorate the enormous computational effort required to compute norms of high-dimensional systems, we apply balanced truncation, a particular method of model order reduction (MOR), to obtain a low-dimensional linear dynamical system that produces approximations of system norms …
E. Qian, J.M. Tabeart, C. Beattie, S. Gugercin, J. Jiang, P. Kramer, A. Narayan. Model Reduction of Linear Dynamical Systems via Balancing for Bayesian Inference, Subtitled arXiv preprint arXiv:2111.13246, 2021.
We consider the Bayesian approach to the linear Gaussian inference problem of inferring the initial condition of a linear dynamical system from noisy output measurements taken after the initial time. In practical applications, the large dimension of the dynamical system state poses a computational obstacle to computing the exact posterior distribution. Model reduction offers a variety of computational tools that seek to reduce this computational burden. In particular, balanced truncation is a system-theoretic approach to model reduction which obtains an efficient reduced-dimension dynamical system by projecting the system operators onto state directions which trade off the reachability and observability of state directions as expressed through the associated Gramians. We introduce Gramian definitions relevant to the inference setting and propose a balanced truncation approach based on these inference Gramians that yields a reduced dynamical system that can be used to cheaply approximate the posterior mean and covariance. Our definitions exploit natural connections between (i) the reachability Gramian and the prior covariance and (ii) the observability Gramian and the Fisher information. The resulting reduced model then inherits stability properties and error bounds from system theoretic considerations, and in some settings yields an optimal posterior covariance approximation. Numerical demonstrations on two benchmark problems in model reduction show that our method can yield near-optimal posterior covariance approximations with order-of-magnitude state dimension reduction.
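For orientation, the classical quantities that balanced truncation builds on can be written as follows. This is a sketch of the standard (not inference-specific) definitions, assuming a stable system \dot{x} = A x + B u, y = C x with transfer function G; the symbols A, B, C, P, Q, G_r, and r are notation chosen here and are not taken from the abstract, whose inference Gramians replace P and Q with quantities tied to the prior covariance and the Fisher information.

```latex
% Reachability Gramian P and observability Gramian Q (standard Lyapunov equations)
\begin{align*}
  A P + P A^{\top} + B B^{\top} &= 0, &
  A^{\top} Q + Q A + C^{\top} C &= 0, \\
% Hankel singular values and the classical error bound for a reduced model G_r of order r
  \sigma_i &= \sqrt{\lambda_i(P Q)}, &
  \| G - G_r \|_{\mathcal{H}_\infty} &\le 2 \sum_{i > r} \sigma_i .
\end{align*}
```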
Y. Qin, A. Narayan, K. Cheng, P. Wang. An efficient method of calculating composition-dependent inter-diffusion coefficients based on compressed sensing method, In Computational Materials Science, Vol. 188, Elsevier, pp. 110145. 2021.
Composition-dependent inter-diffusion coefficients are key parameters in many physical processes. Due to the under-determinedness of the governing diffusion equations, numerical methods either impose strict physical conditions on the samples or require a computationally onerous amount of data. To address such problems, we propose a novel inverse framework to recover the diffusion coefficients using a compressed sensing method, which in principle can be extended to alloy systems with an arbitrary number of species. Compared to conventional methods, the new approach does not impose any a priori assumptions on the functional relationship between diffusion coefficients and concentrations, nor any preference on the locations of the samples, as long as they are in the diffused zone. It also requires much less data compared to least-squares approaches. Through a few numerical examples of ternary and quaternary systems, we demonstrate the accuracy and robustness of the new method.
With the growing number and increasing availability of shared-use instruments and observatories, observational data is becoming an essential part of application workflows and a contributor to scientific discoveries in a range of disciplines. However, the corresponding growth in the number of users accessing these facilities, coupled with the expansion in the scale and variety of the data, is making it challenging for these facilities to ensure their data can be accessed, integrated, and analyzed in a timely manner, and is resulting in significant demands on their cyberinfrastructure (CI). In this paper, we present the design of a push-based data delivery framework that leverages emerging in-network capabilities, along with data pre-fetching techniques based on a hybrid data management model. Specifically, we analyze data access traces for two large-scale observatories, Ocean Observatories Initiative (OOI) and Geodetic Facility for the Advancement of Geoscience (GAGE), to identify typical user access patterns and to develop a model that can be used for data pre-fetching. Furthermore, we evaluate our data pre-fetching model and the proposed framework using a simulation of the Virtual Data Collaboratory (VDC) platform that provides in-network data staging and processing capabilities. The results demonstrate the ability of the framework to significantly improve data delivery performance and reduce network traffic at the observatories’ facilities.
Large-scale multiuser scientific facilities, such as geographically distributed observatories, remote instruments, and experimental platforms, represent some of the largest national investments and can enable dramatic advances across many areas of science. Recent examples of such advances include the detection of gravitational waves and the imaging of a black hole’s event horizon. However, as the number of such facilities and their users grow, along with the complexity, diversity, and volumes of their data products, finding and accessing relevant data is becoming increasingly challenging, limiting the potential impact of facilities. These challenges are further amplified as scientists and application workflows increasingly try to integrate facilities’ data from diverse domains. In this paper, we leverage concepts underlying recommender systems, which are extremely effective in e-commerce, to address these data-discovery and data-access challenges for large-scale distributed scientific facilities. We first analyze data from facilities and identify and model user-query patterns in terms of facility location and spatial localities, domain-specific data models, and user associations. We then use this analysis to generate a knowledge graph and develop the collaborative knowledge-aware graph attention network (CKAT) recommendation model, which leverages graph neural networks (GNNs) to explicitly encode the collaborative signals through propagation and combine them with knowledge associations. Moreover, we integrate a knowledge-aware neural attention mechanism to enable the CKAT to pay more attention to key information while reducing irrelevant noise, thereby increasing the accuracy of the recommendations. We apply the proposed model on two real-world facility datasets and empirically demonstrate that the CKAT can effectively facilitate data discovery, significantly outperforming several compelling state-of-the-art baseline models.
Y. Qin, I. Rodero, M. Parashar. Toward Democratizing Access to Facilities Data: A Framework for Intelligent Data Discovery and Delivery, Subtitled arXiv:2112.06479, 2021.
Data collected by large-scale instruments, observatories, and sensor networks are key enablers of scientific discoveries in many disciplines. However, ensuring that these data can be accessed, integrated, and analyzed in a democratized and timely manner remains a challenge. In this article, we explore how state-of-the-art techniques for data discovery and access can be adapted to facility data and develop a conceptual framework for intelligent data access and discovery.
A.S. Rababah, L.R. Bear, Y.S. Dogrusoz, W. Good, J. Bergquist, J. Stoks, R. MacLeod, K. Rjoob, M. Jennings, J. Mclaughlin, D. D. Finlay. Reducing Line-of-block Artifacts in Cardiac Activation Maps Estimated Using ECG Imaging: A Comparison of Source Models and Estimation Methods, In Computers in Biology and Medicine, Vol. 136, pp. 104666. 2021.
Electrocardiographic imaging is an imaging modality that has been introduced recently to help in visualizing the electrical activity of the heart and consequently guide the ablation therapy for ventricular arrhythmias. One of the main challenges of this modality is that the electrocardiographic signals recorded at the torso surface are contaminated with noise from different sources. Low amplitude leads are more affected by noise due to their low peak-to-peak amplitude. In this paper, we have studied 6 datasets from two torso tank experiments (Bordeaux and Utah experiments) to investigate the impact of removing or interpolating these low amplitude leads on the inverse reconstruction of cardiac electrical activity. Body surface potential maps used were calculated by using the full set of recorded leads, removing 1, 6, 11, 16, or 21 low amplitude leads, or interpolating 1, 6, 11, 16, or 21 low amplitude leads using one of the three interpolation methods – Laplacian interpolation, hybrid interpolation, or the inverse-forward interpolation. The epicardial potential maps and activation time maps were computed from these body surface potential maps and compared with those recorded directly from the heart surface in the torso tank experiments. There was no significant change in the potential maps and activation time maps after the removal of up to 11 low amplitude leads. Laplacian interpolation and hybrid interpolation improved the inverse reconstruction in some datasets and worsened it in the rest. The inverse forward interpolation of low amplitude leads improved it in two out of 6 datasets and at least remained the same in the other datasets. It was noticed that after doing the inverse-forward interpolation, the selected lambda value was closer to the optimum lambda value that gives the inverse solution best correlated with the recorded one.
Detection and segmentation in microscopy images, In Computer Vision for Microscopy Image Analysis, Academic Press, pp. 43-71. 2021.
The plethora of heterogeneous data generated using modern microscopy imaging techniques eliminates the possibility of manual image analysis for biologists. Consequently, reliable and robust computerized techniques are critical to analyze microscopy data. Detection problems in microscopy images focus on accurately identifying the objects of interest in an image that can be used to investigate hypotheses about developmental or pathological processes and can be indicative of prognosis in patients. Detection is also considered to be the preliminary step for solving subsequent problems, such as segmentation and tracking for various biological applications. Segmentation of the desired structures and regions in microscopy images requires pixel-level labels to uniquely identify the individual structures and regions with contours for morphological and physiological analysis. Distributions of features extracted from the segmented regions can be used to compare normal versus disease or normal versus wild-type populations. Segmentation can be considered as a precursor for solving classification, reconstruction, and tracking problems in microscopy images. In this chapter, we discuss how the field of microscopic image analysis has progressed over the years, starting with traditional approaches and then followed by the study of learning algorithms. Because there is a lot of variability in microscopy data, it is essential to study learning algorithms that can adapt to these changes. We focus on deep learning approaches with convolutional neural networks (CNNs), as well as hierarchical methods for segmentation and detection in optical and electron microscopy images. Limitation of training data is one of the significant problems; hence, we explore solutions to learn better models with minimal user annotations.
M. Rasouli, R. M. Kirby, H. Sundar. A Compressed, Divide and Conquer Algorithm for Scalable Distributed Matrix-Matrix Multiplication, In The International Conference on High Performance Computing in Asia-Pacific Region, pp. 110-119. 2021.
Matrix-matrix multiplication (GEMM) is a widely used linear algebra primitive common in scientific computing and data sciences. While several highly-tuned libraries and implementations exist, these typically target either sparse or dense matrices. The performance of these tuned implementations on unsupported types can be poor, and this is critical in cases where the structure of the computations is associated with varying degrees of sparsity. One such example is Algebraic Multigrid (AMG), a popular solver and preconditioner for large sparse linear systems. In this work, we present a new divide and conquer sparse GEMM, that is also highly performant and scalable when the matrix becomes dense, as in the case of AMG matrix hierarchies. In addition, we implement a lossless data compression method to reduce the communication cost. We combine this with an efficient communication pattern during distributed-memory GEMM to provide 2.24 times (on average) better performance than the state-of-the-art library PETSc. Additionally, we show that the performance and scalability of our method surpass PETSc even more when the density of the matrix increases. We demonstrate the efficacy of our methods by comparing our GEMM with PETSc on a wide range of matrices.
A. Rathore, N. Chalapathi, S. Palande, Bei Wang. TopoAct: Visually Exploring the Shape of Activations in Deep Learning, In Computer Graphics Forum, Vol. 40, No. 1, pp. 382-397. 2021.
Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we aim to obtain a global view of what neurons detect by studying their activations. In particular, we develop visualizations that show the shape of the activation space, the organizational principle behind neuron activations, and the relationships of these activations within a layer. Applying tools from topological data analysis, we present TopoAct, a visual exploration system to study topological summaries of activation vectors. We present exploration scenarios using TopoAct that provide valuable insights into learned representations of neural networks. We expect TopoAct to give a topological perspective that enriches the current toolbox of neural network analysis, and to provide a basis for network architecture diagnosis and data anomaly detection.
Many biological tissues contain an underlying fibrous microstructure that is optimized to suit a physiological function. The fiber architecture dictates physical characteristics such as stiffness, diffusivity, and electrical conduction. Abnormal deviations of fiber architecture are often associated with disease. Thus, it is useful to characterize fiber network organization from image data in order to better understand pathological mechanisms. We devised a method to quantify distributions of fiber orientations based on the Fourier transform and the Qball algorithm from diffusion MRI. The Fourier transform was used to decompose images into directional components, while the Qball algorithm efficiently converted the directional data from the frequency domain to the orientation domain. The representation in the orientation domain does not require any particular functional representation, and thus the method is nonparametric. The algorithm was verified to demonstrate its reliability and used on datasets from microscopy to show its applicability. This method increases the ability to extract information of microstructural fiber organization from experimental data that will enhance our understanding of structure-function relationships and enable accurate representation of material anisotropy in biological tissues.
M. Razi, M. Kirby, A. Narayan. Kernel optimization for Low-Rank Multi-Fidelity Algorithms, In International Journal for Uncertainty Quantification, Begel House Inc., pp. 31-54. 2021.
One of the major challenges for low-rank multi-fidelity (MF) approaches is the assumption that low-fidelity (LF) and high-fidelity (HF) models admit "similar" low-rank kernel representations. Low-rank MF methods have traditionally attempted to exploit low-rank representations of linear kernels. However, such linear kernels may not be able to capture low-rank behavior, and they may admit LF and HF kernels that are not similar. Such a situation renders a naive approach to low-rank MF procedures ineffective. In this paper, we propose a novel approach for the selection of a near-optimal kernel function for use in low-rank MF methods. The proposed framework is a two-step strategy wherein: (1) hyperparameters of a library of kernel functions are optimized, and (2) a particular combination of the optimized kernels is selected, through either a convex mixture (Additive Kernel Approach) or through a data-driven …
In this work, we present a reinforcement learning (RL) based approach to designing parallel prefix circuits such as adders or priority encoders that are fundamental to high-performance digital design. Unlike prior methods, our approach designs solutions tabula rasa purely through learning with synthesis in the loop. We design a grid-based state-action representation and an RL environment for constructing legal prefix circuits. Deep Convolutional RL agents trained on this environment produce prefix adder circuits that Pareto-dominate existing baselines with up to 16.0% and 30.2% lower area for the same delay in the 32b and 64b settings respectively. We observe that agents trained with open-source synthesis tools and cell library can design adder circuits that achieve lower area and delay than commercial tool adders in an industrial cell library.
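For readers unfamiliar with parallel prefix circuits, the sketch below simulates a Kogge-Stone adder in software. Kogge-Stone is one classical prefix structure of the kind such work is typically compared against; it is not the RL-designed circuit from the abstract. The function name `kogge_stone_add` and the `width` parameter are illustrative choices, and the loop over `dist` stands in for the logarithmic number of hardware levels.

```python
def kogge_stone_add(a: int, b: int, width: int = 32) -> int:
    """Add two unsigned integers modulo 2**width with a Kogge-Stone prefix network."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Per-bit generate (both inputs are 1) and propagate (exactly one input is 1).
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    p_bit = p[:]  # original propagate bits, needed later for the sum bits

    # Each pass is one level of the prefix network. The combining operator is
    # (g, p) o (g', p') = (g | (p & g'), p & p'), applied at distance 1, 2, 4, ...
    dist = 1
    while dist < width:
        new_g, new_p = g[:], p[:]
        for i in range(dist, width):
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2

    # g[i] is now the carry out of bit i, so the carry into bit i is g[i - 1].
    result = 0
    for i in range(width):
        carry_in = g[i - 1] if i > 0 else 0
        result |= (p_bit[i] ^ carry_in) << i
    return result
```

Calling `kogge_stone_add(5, 7, 8)` returns 12. Different legal prefix networks compute the same carries with different node counts and depths, which is the area and delay trade-off that prefix circuit design explores.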
Damodar Sahasrabudhe. Enhancing Asynchronous Many-Task Runtime Systems for Next-Generation Architectures and Exascale Supercomputers, School of Computing, University of Utah, Salt Lake City, UT, USA, 2021.
Exascale supercomputers capable of computing 10^18 double-precision floating point operations per second are expected to be operational around 2022/23. The complexity and diversity of the proposed exascale machines pose new challenges for the software applications, namely, 1) implementing efficient data management; 2) having programming systems to exploit locality and multimillion parallelism; 3) developing efficient algorithms to leverage new architectures; 4) ensuring resiliency; and 5) improving scientific productivity on diverse architectures. Due to data-driven scheduling and asynchronous execution, Asynchronous Many-Task (AMT) runtime systems show promise to handle these exascale challenges.
One such AMT, the Uintah Computational Framework, maintains two distinct layers for the application and underlying runtime infrastructure. This distinction allows Uintah users to concentrate on the application while the Uintah infrastructure handles communication, data coherency, multithreading, and architecture-specific complexities.
This dissertation addresses some of the exascale challenges and also integrates the individual solutions under the single umbrella of Uintah. The resiliency approach handles node failure faster than the traditional checkpointing method and helps to address challenge (4). A potential solution for challenges (2) and (3) can be the new asynchronous scheduler designed for the Sunway Taihulight supercomputer that shows the benefits of asynchronous execution. The novel portable Single Instruction Multiple Data (SIMD) primitive provides a prospective approach to handle (2) and (5), which achieves near-ideal vectorization on Central Processing Units (CPUs) along with Graphics Processing Unit (GPU) portability provided by the CUDA back end. The newly developed threading model using MPI endpoints shows performance improvements over the MPI-everywhere version, which can be one of the solutions to tackle challenges (2) and (3). Finally, this work enhances the heterogeneous scheduler, contributes to the ongoing portability drive, and successfully runs a simulation using portable AMT tasks on thousands of CPUs and GPUs. These enhancements are important to answer challenges (2), (3), and (5). As a result, this research takes Uintah closer to exascale readiness. Using Uintah as an example, this work demonstrates how AMTs, third-party libraries, and applications can be enhanced to benefit from the next-generation architectures.
Salinet et al. Electrocardiographic Imaging for Atrial Fibrillation treatment guidance (for example, localization of AF triggers and sustaining mechanisms), and we discuss the technological requirements and validation. We address experimental and clinical results, limitations, and future challenges for fruitful application of ECGI for AF understanding and management. We pay attention to existing techniques and clinical application, to computer models and (animal or human) experiments, to challenges of methodological and clinical validation. The overall objective of the study is to provide a consensus on valuable directions that ECGI research may take to provide future improvements in AF characterization and treatment guidance.
J. Sandhu, T. Bidone, R. D. Rabbitt. Prestin Generates Instantaneous Force in Outer Hair Cell Membranes, In Biophysical Journal, Vol. 120, No. 3, 2021.
Hearing occurs from sound reaching the inner ear cochlea, where electromotile Outer Hair Cells (OHCs) amplify vibrations by elongating and contracting rapidly in response to auditory frequency changes in membrane potential. OHCs can generate force cycle-by-cycle at frequencies exceeding 50kHz, but precisely how this is achieved is unclear. Electromotility requires expression of the transmembrane protein, prestin, which facilitates the electromechanical conversion through action of the Coulomb force acting on the anion Cl- bound at the core of the protein. However, recent experimental data suggests the charge displacement is too slow to support sound amplification at auditory frequencies. As a consequence, prestin electromechanics remain unclear at the molecular level. We hypothesize that prestin instantaneously transmits stress to the membrane, which subsequently drives charge displacement, membrane deformation, and OHC shape changes. To test the hypothesis, we examined the conformational dynamics of prestin and its effects on the motion of lipids under: (1) isometric conditions and (2) constant force conditions in order to mimic different regimes of membrane loading. All-atom molecular dynamics simulations of the prestin dimer embedded in POPC membranes were run and the trajectories analyzed. We discovered that under isometric conditions, the presence of a chloride ion in the electric field increased residue fluctuations. This trend was not observed under constant force conditions, supporting the idea that isometric conditions cause instantaneous force to be generated in the membrane. The analysis allowed us to identify the molecular mechanisms by which prestin allows electromechanical amplification by OHCs in the cochlea.
S. Sane, T. Athawale, C.R. Johnson. Visualization of Uncertain Multivariate Data via Feature Confidence Level-Sets, In EuroVis 2021, 2021.
Recent advancements in multivariate data visualization have opened new research opportunities for the visualization community. In this paper, we propose an uncertain multivariate data visualization technique called feature confidence level-sets. Conceptually, feature level-sets refer to level-sets of multivariate data. Our proposed technique extends the existing idea of univariate confidence isosurfaces to multivariate feature level-sets. Feature confidence level-sets are computed by considering the trait for a specific feature, a confidence interval, and the distribution of data at each grid point in the domain. Using uncertain multivariate data sets, we demonstrate the utility of the technique to visualize regions with uncertainty in relation to the specific trait or feature, and the ability of the technique to provide secondary feature structure visualization based on uncertainty.
S. Sane, A. Yenpure, R. Bujack, M. Larsen, K. Moreland, C. Garth, C. R. Johnson, H. Childs. Scalable In Situ Computation of Lagrangian Representations via Local Flow Maps, In Eurographics Symposium on Parallel Graphics and Visualization, The Eurographics Association, 2021.
In situ computation of Lagrangian flow maps to enable post hoc time-varying vector field analysis has recently become an active area of research. However, the current literature is largely limited to theoretical settings and lacks a solution to address scalability of the technique in distributed memory. To improve scalability, we propose and evaluate the benefits and limitations of a simple, yet novel, performance optimization. Our proposed optimization is a communication-free model resulting in local Lagrangian flow maps, requiring no message passing or synchronization between processes, intrinsically improving scalability, and thereby reducing overall execution time and alleviating the encumbrance placed on simulation codes from communication overheads. To evaluate our approach, we computed Lagrangian flow maps for four time-varying simulation vector fields and investigated how execution time and reconstruction accuracy are impacted by the number of GPUs per compute node, the total number of compute nodes, particles per rank, and storage intervals. Our study consisted of experiments computing Lagrangian flow maps with up to 67M particle trajectories over 500 cycles and used as many as 2048 GPUs across 512 compute nodes. In all, our study contributes an evaluation of a communication-free model as well as a scalability study of computing distributed Lagrangian flow maps at scale using in situ infrastructure on a modern supercomputer.
Investigating In Situ Reduction via Lagrangian Representations for Cosmology and Seismology Applications, In Computational Science -- ICCS 2021, Springer International Publishing, pp. 436--450. 2021.
Although many types of computational simulations produce time-varying vector fields, subsequent analysis is often limited to single time slices due to excessive costs. Fortunately, a new approach using a Lagrangian representation can enable time-varying vector field analysis while mitigating these costs. With this approach, a Lagrangian representation is calculated while the simulation code is running, and the result is explored after the simulation. Importantly, the effectiveness of this approach varies based on the nature of the vector field, requiring in-depth investigation for each application area. With this study, we evaluate the effectiveness for previously unexplored cosmology and seismology applications. We do this by considering encumbrance (on the simulation) and accuracy (of the reconstructed result). To inform encumbrance, we integrated in situ infrastructure with two simulation codes, and evaluated on representative HPC environments, performing Lagrangian in situ reduction using GPUs as well as CPUs. To inform accuracy, our study conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. In all, we demonstrate effectiveness for both cosmology and seismology—time-varying vector fields from these domains can be reduced to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 80% of our experiments.
A. Singh, M. Bauer, S. Joshi. Physics Informed Convex Artificial Neural Networks (PICANNs) for Optimal Transport based Density Estimation, Subtitled arXiv, 2021.
Optimal Mass Transport (OMT) is a well-studied problem with a variety of applications in a diverse set of fields ranging from Physics to Computer Vision and in particular Statistics and Data Science. Since the original formulation of Monge in 1781, significant theoretical progress has been made on the existence, uniqueness and properties of the optimal transport maps. The actual numerical computation of the transport maps, particularly in high dimensions, remains a challenging problem. By Brenier's theorem, the continuous OMT problem can be reduced to that of solving a non-linear PDE of Monge-Ampere type whose solution is a convex function. In this paper, building on recent developments of input convex neural networks and physics informed neural networks for solving PDEs, we propose a Deep Learning approach to solve the continuous OMT problem.
To demonstrate the versatility of our framework, we focus on the ubiquitous density estimation and generative modeling tasks in statistics and machine learning. Finally, as an example, we show how our framework can be incorporated with an autoencoder to estimate an effective probabilistic generative model.
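The reduction via Brenier's theorem mentioned in this abstract can be stated compactly as follows. This is a sketch of the standard statement for quadratic cost, assuming source and target densities written here as f and g; the symbols f, g, u, and T are notation chosen for illustration and do not come from the abstract.

```latex
% Brenier: the optimal map is the gradient of a convex potential u
\begin{align*}
  T(x) &= \nabla u(x), \qquad u \ \text{convex}, \\
% Mass conservation under T gives a Monge-Ampere type equation for u
  \det\!\big( D^2 u(x) \big) &= \frac{f(x)}{g\big( \nabla u(x) \big)} .
\end{align*}
```

The convexity constraint on u is what makes an input convex network a natural parametrization, and the second equation is the kind of PDE residual a physics informed loss could penalize.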