Designed especially for neurobiologists, FluoRender is an interactive tool for multi-channel fluorescence microscopy data visualization and analysis.
Deep brain stimulation
BrainStimulator is a set of networks that are used in SCIRun to perform simulations of brain stimulation such as transcranial direct current stimulation (tDCS) and magnetic transcranial stimulation (TMS).
Developing software tools for science has always been a central vision of the SCI Institute.

SCI Publications

2021


K. Gadhave, J. Görtler, Z. Cutler, C. Nobre, O. Deussen, M. Meyer, J.M. Phillips, A. Lex. “Predicting intent behind selections in scatterplot visualizations,” In Information Visualization, Vol. 20, No. 4, pp. 207-228. 2021.
DOI: 10.1177/14738716211038604

ABSTRACT

Predicting and capturing an analyst’s intent behind a selection in a data visualization is valuable in two scenarios: First, a successful prediction of a pattern an analyst intended to select can be used to auto-complete a partial selection which, in turn, can improve the correctness of the selection. Second, knowing the intent behind a selection can be used to improve recall and reproducibility. In this paper, we introduce methods to infer analyst’s intents behind selections in data visualizations, such as scatterplots. We describe intents based on patterns in the data, and identify algorithms that can capture these patterns. Upon an interactive selection, we compare the selected items with the results of a large set of computed patterns, and use various ranking approaches to identify the best pattern for an analyst’s selection. We store annotations and the metadata to reconstruct a selection, such as the type of algorithm and its parameterization, in a provenance graph. We present a prototype system that implements these methods for tabular data and scatterplots. Analysts can select a prediction to auto-complete partial selections and to seamlessly log their intents. We discuss implications of our approach for reproducibility and reuse of analysis workflows. We evaluate our approach in a crowd-sourced study, where we show that auto-completing selection improves accuracy, and that we can accurately capture pattern-based intent.



K. Gadhave, Z.T. Cutler, A. Lex. “Reusing Interactive Analysis Workflows,” Subtitled “OSF Preprints,” 2021.

ABSTRACT

Interactive visual analysis has many advantages, but has the disadvantage that analysis processes and workflows cannot be easily stored and reused, which is in contrast to scripted analysis workflows using a programming language such as Python. In this paper, we introduce methods to semantically capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. We design these workflows to be robust to updates in the dataset by capturing the semantics of underlying interactions, and, hence, they can be applied to updated datasets. We demonstrate this specification using a prototype that visualizes the data, shows interaction provenance, and allows generating workflows from this provenance. Finally, we introduce a Python library that can consume the workflow and apply it to the datasets, providing a seamless bridge between computational workflows and interactive visualization tools. We demonstrate our techniques using our UI prototype and Jupyter notebooks.



W. W. Good, B. Zenger, J. A. Bergquist, L. C. Rupp, K. K. Gillette, M. A.F. Gsell, G. Plank, R. S. MacLeod. “Quantifying the spatiotemporal influence of acute myocardial ischemia on volumetric conduction velocity,” In Journal of Electrocardiology, Vol. 66, Churchill Livingstone, pp. 86-94. 2021.

ABSTRACT

Introduction
Acute myocardial ischemia occurs when coronary perfusion to the heart is inadequate, which can perturb the highly organized electrical activation of the heart and can result in adverse cardiac events including sudden cardiac death. Ischemia is known to influence the ST and repolarization phases of the ECG, but it also has a marked effect on propagation (QRS); however, studies investigating propagation during ischemia have been limited.

Methods
We estimated conduction velocity (CV) and ischemic stress prior to and throughout 20 episodes of experimentally induced ischemia in order to quantify the progression and correlation of volumetric conduction changes during ischemia. To estimate volumetric CV, we 1) reconstructed the activation wavefront; 2) calculated the elementwise gradient to approximate propagation direction; and 3) estimated conduction speed (CS) with an inverse-gradient technique.
Results
We found that acute ischemia induces significant conduction slowing, reducing the global median speed by 20 cm/s. We observed a biphasic response in CS (acceleration then deceleration) early in some ischemic episodes. Furthermore, we noted a high temporal correlation between ST-segment changes and CS slowing; however, when comparing these changes over space, we found only moderate correlation (corr. = 0.60).
Discussion
This study is the first to report volumetric CS changes (acceleration and slowing) during episodes of acute ischemia in the whole heart. We showed that while CS changes progress in a similar time course to ischemic stress (measured by ST-segment shifts), the spatial overlap is complex and variable, showing extreme conduction slowing both in and around regions experiencing severe ischemia.



W. W. Good, K. Gillette, B. Zenger, J. Bergquist, L. C. Rupp, J. D. Tate, D. Anderson, M. Gsell, G. Plank, R. S. Macleod. “Estimation and validation of cardiac conduction velocity and wavefront reconstruction using epicardial and volumetric data,” In IEEE Transactions on Biomedical Engineering, IEEE, 2021.
DOI: 10.1109/TBME.2021.3069792

ABSTRACT

Objective: In this study, we have used whole heart simulations parameterized with large animal experiments to validate three techniques (two from the literature and one novel) for estimating epicardial and volumetric conduction velocity (CV). Methods: We used an eikonal-based simulation model to generate ground truth activation sequences with prescribed CVs. Using the sampling density achieved experimentally we examined the accuracy with which we could reconstruct the wavefront, and then examined the robustness of three CV estimation techniques to reconstruction related error. We examined a triangulation-based, inverse-gradient-based, and streamline-based techniques for estimating CV cross the surface and within the volume of the heart. Results: The reconstructed activation times agreed closely with simulated values, with 50-70% of the volumetric nodes and 97-99% of the epicardial nodes were within 1 ms of the ground truth. We found close agreement between the CVs calculated using reconstructed versus ground truth activation times, with differences in the median estimated CV on the order of 3-5% volumetrically and 1-2% superficially, regardless of what technique was used. Conclusion: Our results indicate that the wavefront reconstruction and CV estimation techniques are accurate, allowing us to examine changes in propagation induced by experimental interventions such as acute ischemia, ectopic pacing, or drugs. Significance: We implemented, validated, and compared the performance of a number of CV estimation techniques. The CV estimation techniques implemented in this study produce accurate, high-resolution CV fields that can be used to study propagation in the heart experimentally and clinically.



A. A. Gooch, S. Petruzza, A. Gyulassy, G. Scorzelli, V. Pascucci, L. Rantham, W. Adcock, C. Coopmans. “Lessons learned towards the immediate delivery of massive aerial imagery to farmers and crop consultants,” In Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping VI, Vol. 11747, International Society for Optics and Photonics, pp. 22 -- 34. 2021.
DOI: 10.1117/12.2587694

ABSTRACT

In this paper, we document lessons learned from using ViSOAR Ag Explorer™ in the fields of Arkansas and Utah in the 2018-2020 growing seasons. Our insights come from creating software with fast reading and writing of 2D aerial image mosaics for platform-agnostic collaborative analytics and visualization. We currently enable stitching in the field on a laptop without the need for an internet connection. The full resolution result is then available for instant streaming visualization and analytics via Python scripting. While our software, ViSOAR Ag Explorer™ removes the time and labor software bottleneck in processing large aerial surveys, enabling a cost-effective process to deliver actionable information to farmers, we learned valuable lessons with regard to the acquisition, storage, viewing, analysis, and planning stages of aerial data surveys. Additionally, with the ultimate goal of stitching thousands of images in minutes on board a UAV at the time of data capture, we performed preliminary tests for on-board, real-time stitching and analysis on USU AggieAir sUAS using lightweight computational resources. This system is able to create a 2D map while flying and allow interactive exploration of the full resolution data as soon as the platform has landed or has access to a network. This capability further speeds up the assessment process on the field and opens opportunities for new real-time photogrammetry applications. Flying and imaging over 1500-2000 acres per week provides up-to-date maps that give crop consultants a much broader scope of the field in general as well as providing a better view into planting and field preparation than could be observed from field level. Ultimately, our software and hardware could provide a much better understanding of weed presence and intensity or lack thereof.



W. W. Good, B. Zenger, J. A. Bergquist, L. C. Rupp, K. Gillett, N. Angel, D. Chou, G. Plank, R. S. MacLeod. “Combining endocardial mapping and electrocardiographic imaging (ECGI) for improving PVC localization: A feasibility study,” In Journal of Electrocardiology, 2021.
ISSN: 0022-0736
DOI: https://doi.org/10.1016/j.jelectrocard.2021.08.013

ABSTRACT

Introduction

Accurate reconstruction of cardiac activation wavefronts is crucial for clinical diagnosis, management, and treatment of cardiac arrhythmias. Furthermore, reconstruction of activation profiles within the intramural myocardium has long been impossible because electrical mapping was only performed on the endocardial surface. Recent advancements in electrocardiographic imaging (ECGI) have made endocardial and epicardial activation mapping possible. We propose a novel approach to use both endocardial and epicardial mapping in a combined approach to reconstruct intramural activation times.

Objective

To implement and validate a combined epicardial/endocardial intramural activation time reconstruction technique.
Methods

We used 11 simulations of ventricular activation paced from sites throughout myocardial wall and extracted endocardial and epicardial activation maps at approximate clinical resolution. From these maps, we interpolated the activation times through the myocardium using thin-plate-spline radial basis functions. We evaluated activation time reconstruction accuracy using root-mean-squared error (RMSE) of activation times and the percent of nodes within 1 ms of the ground truth.
Results

Reconstructed intramural activation times showed an RMSE and percentage of nodes within 1 ms of the ground truth simulations of 3 ms and 70%, respectively. In the worst case, the RMSE and percentage of nodes were 4 ms and 60%, respectively.
Conclusion

We showed that a simple, yet effective combination of clinical endocardial and epicardial activation maps can accurately reconstruct intramural wavefronts. Furthermore, we showed that this approach provided robust reconstructions across multiple intramural stimulation sites.



J. K. Holmen, D. Sahasrabudhe, M. Berzins. “A Heterogeneous MPI+PPL Task Scheduling Approach for Asynchronous Many-Task Runtime Systems,” In Proceedings of the Practice and Experience in Advanced Research Computing 2021 on Sustainability, Success and Impact (PEARC21), ACM, 2021.

ABSTRACT

Asynchronous many-task runtime systems and MPI+X hybrid parallelism approaches have shown promise for helping manage the increasing complexity of nodes in current and emerging high performance computing (HPC) systems, including those for exascale. The increasing architectural diversity, however, poses challenges for large legacy runtime systems emphasizing broad support for major HPC systems. Performance portability layers (PPL) have shown promise for helping manage this diversity. This paper describes a heterogeneous MPI+PPL task scheduling approach for combining these promising solutions with additional consideration for parallel third party libraries facing similar challenges to help prepare such a runtime for the diverse heterogeneous systems accompanying exascale computing. This approach is demonstrated using a heterogeneous MPI+Kokkos task scheduler and the accompanying portable abstractions [15] implemented in the Uintah Computational Framework, an asynchronous many-task runtime system, with additional consideration for hypre, a parallel third party library. Results are shown for two challenging problems executing workloads representative of typical Uintah applications. These results show performance improvements up to 4.4x when using this scheduler and the accompanying portable abstractions [15] to port a previously MPI-Only problem to Kokkos::OpenMP and Kokkos::CUDA to improve multi-socket, multi-device node use. Good strong-scaling to 1,024 NVIDIA V100 GPUs and 512 IBM POWER9 processor are also shown using MPI+Kokkos::OpenMP+Kokkos::CUDA at scale.



J. K. Holmen, D. Sahasrabudhe, M. Berzins, A. Bardakoff, T. J. Blattner, . Keyrouz. “Uintah+Hedgehog: Combining Parallelism Models for End-to-End Large-Scale Simulation Performance,” Scientific Computing and Imaging Institute, 2021.

ABSTRACT

The complexity of heterogeneous nodes near and at exascale has increased the need for “heroic” programming efforts. To accommodate this complexity, significant investment is required for codes not yet optimizing for low-level architecture features (e.g., wide vector units) and/or running at large-scale. This paper describes ongoing efforts to combine two codes, Hedgehog and Uintah, lying at both extremes to ease programming efforts. The end goals of this effort are (1) to combine the two codes to make an asynchronous many-task runtime system specializing in both node-level and large-scale performance and (2) to further improve the accessibility of both with portable abstractions. A prototype adopting Hedgehog in Uintah and a prototype extending Hedgehog to support MPI+X hybrid parallelism are discussed. Results achieving ∼60% of NVIDIA V100 GPU peak performance for a distributed DGEMM problem are shown for a naive MPI+Hedgehog implementation before any attempt to optimize for performance.

Authors note: This is a refereed but unpublished report that was
submitted to, reviewed for and accepted in revised form for a presentation of the same material at the Hipar Workshop at Supercomputing 21



Z. Houmani, D. Balouek-Thomert, E. Caron, M. Parashar. “Enabling microservices management for Deep Learning applications across the Edge-Cloud Continuum,” In SBAC-PAD 2021 - IEEE 33rd International Symposium on Computer Architecture and High Performance Computing, October, 2021.

ABSTRACT

Deep Learning has shifted the focus of traditional batch workflows to data-driven feature engineering on streaming data. In particular, the execution of Deep Learning workflows presents expectations of near-real-time results with user-defined acceptable accuracy. Meeting the objectives of such applications across heterogeneous resources located at the edge of the network, the core, and in-between requires managing trade-offs between the accuracy and the urgency of the results. However, current data analysis rarely manages the entire Deep Learning pipeline along the data path, making it complex for developers to implement strategies in real-world deployments. Driven by an object detection use case, this paper presents an architecture for time-critical Deep Learning workflows by providing a data-driven scheduling approach to distribute the pipeline across Edge to Cloud resources. Furthermore, it adopts a data management strategy that reduces the resolution of incoming data when potential trade-off optimizations are available. We illustrate the system's viability through a performance evaluation of the object detection use case on the Grid'5000 testbed. We demonstrate that in a multiuser scenario, with a standard frame rate of 25 frames per second, the system speed-up data analysis up to 54.4% compared to a Cloud-only-based scenario with an analysis accuracy higher than a fixed threshold.



X. Huang, P. Klacansky, S. Petruzza, A. Gyulassy, P.T. Bremer, V. Pascucci. “Distributed merge forest: a new fast and scalable approach for topological analysis at scale,” In Proceedings of the ACM International Conference on Supercomputing, pp. 367-377. 2021.

ABSTRACT

Topological analysis is used in several domains to identify and characterize important features in scientific data, and is now one of the established classes of techniques of proven practical use in scientific computing. The growth in parallelism and problem size tackled by modern simulations poses a particular challenge for these approaches. Fundamentally, the global encoding of topological features necessitates inter process communication that limits their scaling. In this paper, we extend a new topological paradigm to the case of distributed computing, where the construction of a global merge tree is replaced by a distributed data structure, the merge forest, trading slower individual queries on the structure for faster end-to-end performance and scaling. Empirically, the queries that are most negatively affected also tend to have limited practical use. Our experimental results demonstrate the scalability of both the merge forest construction and the parallel queries needed in scientific workflows, and contrast this scalability with the two established alternatives that construct variations of a global tree.



M. H. Jensen, S. Joshi, S. Sommer. “Bridge Simulation and Metric Estimation on Lie Groups,” Subtitled “arXiv preprint arXiv:2106.03431,” 2021.

ABSTRACT

We present a simulation scheme for simulating Brownian bridges on complete and connected Lie groups. We show how this simulation scheme leads to absolute continuity of the Brownian bridge measure with respect to the guided process measure. This result generalizes the Euclidean result of Delyon and Hu to Lie groups. We present numerical results of the guided process in the Lie group $\SO(3)$. In particular, we apply importance sampling to estimate the metric on $\SO(3)$ using an iterative maximum likelihood method.



M. Højgaard Jensen, L. Hilgendorf, S. Joshi, S. Sommer. “Bridge Simulation on Lie Groups and Homogeneous Spaces with Application to Parameter Estimation,” Subtitled “arXiv:2112.00866,” 2021.



X. Jiang, J. C. Font, J. A. Bergquist, B. Zenger, W. W. Good, D. H. Brooks, R. S. MacLeod, L. Wang. “Deep Adaptive Electrocardiographic Imaging with Generative Forward Model for Error Reduction,” In Functional Imaging and Modeling of the Heart: 11th International Conference, In Functional Imaging and Modeling of the Heart: 11th International Conference, Vol. 12738, Springer Nature, pp. 471. 2021.

ABSTRACT

Accuracy of estimating the heart’s electrical activity with Electrocardiographic Imaging (ECGI) is challenging due to using an error-prone physics-based model (forward model). While getting better results than the traditional numerical methods following the underlying physics, modern deep learning approaches ignore the physics behind the electrical propagation in the body and do not allow the use of patientspecific geometry. We introduce a deep-learning-based ECGI framework capable of understanding the underlying physics, aware of geometry, and adjustable to patient-specific data. Using a variational autoencoder (VAE), we uncover the forward model’s parameter space, and when solving the inverse problem, these parameters will be optimized to reduce the errors in the forward model. In both simulation and real data experiments, we demonstrated the ability of the presented framework to provide accurate reconstruction of the heart’s electrical potentials and localization of the earliest activation sites.



C. R. Johnson. “Translational computer science at the scientific computing and imaging institute,” In Journal of Computational Science, Vol. 52, pp. 101217. 2021.
ISSN: 1877-7503
DOI: https://doi.org/10.1016/j.jocs.2020.101217

ABSTRACT

The Scientific Computing and Imaging (SCI) Institute at the University of Utah evolved from the SCI research group, started in 1994 by Professors Chris Johnson and Rob MacLeod. Over time, research centers funded by the National Institutes of Health, Department of Energy, and State of Utah significantly spurred growth, and SCI became a permanent interdisciplinary research institute in 2000. The SCI Institute is now home to more than 150 faculty, students, and staff. The history of the SCI Institute is underpinned by a culture of multidisciplinary, collaborative research, which led to its emergence as an internationally recognized leader in the development and use of visualization, scientific computing, and image analysis research to solve important problems in a broad range of domains in biomedicine, science, and engineering. A particular hallmark of SCI Institute research is the creation of open source software systems, including the SCIRun scientific problem-solving environment, Seg3D, ImageVis3D, Uintah, ViSUS, Nektar++, VisTrails, FluoRender, and FEBio. At this point, the SCI Institute has made more than 50 software packages broadly available to the scientific community under open-source licensing and supports them through web pages, documentation, and user groups. While the vast majority of academic research software is written and maintained by graduate students, the SCI Institute employs several professional software developers to help create, maintain, and document robust, tested, well-engineered open source software. The story of how and why we worked, and often struggled, to make professional software engineers an integral part of an academic research institute is crucial to the larger story of the SCI Institute’s success in translational computer science (TCS).



A. Junn, J. Dinis, S. C. Hauc, M. K. Bruce, K. E. Park, W. Tao, C. Christensen, R. Whitaker, J. A. Goldstein, M. Alperovich. “Validation of Artificial Intelligence Severity Assessment in Metopic Craniosynostosis,” In The Cleft Palate-Craniofacial Journal, SAGE Publications, 2021.
DOI: https://doi.org/10.1177/10.1177/10556656211061021

ABSTRACT

Objective
Several severity metrics have been developed for metopic craniosynostosis, including a recent machine learning-derived algorithm. This study assessed the diagnostic concordance between machine learning and previously published severity indices.

Design
Preoperative computed tomography (CT) scans of patients who underwent surgical correction of metopic craniosynostosis were quantitatively analyzed for severity. Each scan was manually measured to derive manual severity scores and also received a scaled metopic severity score (MSS) assigned by the machine learning algorithm. Regression analysis was used to correlate manually captured measurements to MSS. ROC analysis was performed for each severity metric and were compared to how accurately they distinguished cases of metopic synostosis from controls.
Results
In total, 194 CT scans were analyzed, 167 with metopic synostosis and 27 controls. The mean scaled MSS for the patients with metopic was 6.18 ± 2.53 compared to 0.60 ± 1.25 for controls. Multivariable regression analyses yielded an R-square of 0.66, with significant manual measurements of endocranial bifrontal angle (EBA) (P = 0.023), posterior angle of the anterior cranial fossa (p < 0.001), temporal depression angle (P = 0.042), age (P < 0.001), biparietal distance (P < 0.001), interdacryon distance (P = 0.033), and orbital width (P < 0.001). ROC analysis demonstrated a high diagnostic value of the MSS (AUC = 0.96, P < 0.001), which was comparable to other validated indices including the adjusted EBA (AUC = 0.98), EBA (AUC = 0.97), and biparietal/bitemporal ratio (AUC = 0.95).
Conclusions
The machine learning algorithm offers an objective assessment of morphologic severity that provides a reliable composite impression of severity. The generated score is comparable to other severity indices in ability to distinguish cases of metopic synostosis from controls.



R. Kamali, J. Kump, E. Ghafoori, M. Lange, N. Hu, T. J. Bunch, D. J. Dosdall, R. S. Macleod, R. Ranjan. “Area Available for Atrial Fibrillation to Propagate Is an Important Determinant of Recurrence After Ablation,” In JACC: Clinical Electrophysiology, Elsevier, 2021.

ABSTRACT

This study sought to evaluate atrial fibrillation (AF) ablation outcomes based on scar patterns and contiguous area available for AF wavefronts to propagate.



V. Keshavarzzadeh, M. Alirezaei, T. Tasdizen, R. M. Kirby. “Image-Based Multiresolution Topology Optimization Using Deep Disjunctive Normal Shape Model,” In Computer-Aided Design, Vol. 130, Elsevier, pp. 102947. 2021.

ABSTRACT

We present a machine learning framework for predicting the optimized structural topology design susing multiresolution data. Our approach primarily uses optimized designs from inexpensive coarse mesh finite element simulations for model training and generates high resolution images associated with simulation parameters that are not previously used. Our cost-efficient approach enables the designers to effectively search through possible candidate designs in situations where the design requirements rapidly change. The underlying neural network framework is based on a deep disjunctive normal shape model (DDNSM) which learns the mapping between the simulation parameters and segments of multi resolution images. Using this image-based analysis we provide a practical algorithm which enhances the predictability of the learning machine by determining a limited number of important parametric samples(i.e.samples of the simulation parameters)on which the high resolution training data is generated. We demonstrate our approach on benchmark compliance minimization problems including the 3D topology optimization where we show that the high-fidelity designs from the learning machine are close to optimal designs and can be used as effective initial guesses for the large-scale optimization problem.



V. Keshavarzzadeh, R. M. Kirby, A. Narayan. “Multilevel Designed Quadrature for Partial Differential Equations with Random Inputs,” In SIAM Journal on Scientific Computing, Vol. 43, No. 2, Society for Industrial and Applied Mathematics, pp. A1412-A1440. 2021.

ABSTRACT

We introduce a numerical method, multilevel designed quadrature for computing the statistical solution of partial differential equations with random input data. Similar to multilevel Monte Carlo methods, our method relies on hierarchical spatial approximations in addition to a parametric/stochastic sampling strategy. A key ingredient in multilevel methods is the relationship between the spatial accuracy at each level and the number of stochastic samples required to achieve that accuracy. Our sampling is based on flexible quadrature points that are designed for a prescribed accuracy, which can yield less overall computational cost compared to alternative multilevel methods. We propose a constrained optimization problem that determines the number of samples to balance the approximation error with the computational budget. We further show that the optimization problem is convex and derive analytic formulas for the optimal number of points at each level. We validate the theoretical estimates and the performance of our multilevel method via numerical examples on a linear elasticity and a steady state heat diffusion problem.



V. Keshavarzzadeh, R. M. Kirby, A. Narayan. “Robust topology optimization with low rank approximation using artificial neural networks,” In Computational Mechanics, 2021.
DOI: 10.1007/s00466-021-02069-3

ABSTRACT

We present a low rank approximation approach for topology optimization of parametrized linear elastic structures. The parametrization is considered on loading and stiffness of the structure. The low rank approximation is achieved by identifying a parametric connection among coarse finite element models of the structure (associated with different design iterates) and is used to inform the high fidelity finite element analysis. We build an Artificial Neural Network (ANN) map between low resolution design iterates and their corresponding interpolative coefficients (obtained from low rank approximations) and use this surrogate to perform high resolution parametric topology optimization. We demonstrate our approach on robust topology optimization with compliance constraints/objective functions and develop error bounds for the the parametric compliance computations. We verify these parametric computations with more challenging quantities of interest such as the p-norm of von Mises stress. To conclude, we use our approach on a 3D robust topology optimization and show significant reduction in computational cost via quantitative measures.



V. Keshavarzzadeh, S. Zhe, R.M. Kirby, A. Narayan. “GP-HMAT: Scalable, $O(n\log (n)) $ Gaussian Process Regression with Hierarchical Low-Rank Matrices,” Subtitled “arXiv preprint arXiv:2201.00888,” 2021.

ABSTRACT

A Gaussian process (GP) is a powerful and widely used regression technique. The main building block of a GP regression is the covariance kernel, which characterizes the relationship between pairs in the random field. The optimization to find the optimal kernel, however, requires several large-scale and often unstructured matrix inversions. We tackle this challenge by introducing a hierarchical matrix approach, named HMAT, which effectively decomposes the matrix structure, in a recursive manner, into significantly smaller matrices where a direct approach could be used for inversion. Our matrix partitioning uses a particular aggregation strategy for data points, which promotes the low-rank structure of off-diagonal blocks in the hierarchical kernel matrix. We employ a randomized linear algebra method for matrix reduction on the low-rank off-diagonal blocks without factorizing a large matrix. We provide analytical error and cost estimates for the inversion of the matrix, investigate them empirically with numerical computations, and demonstrate the application of our approach on three numerical examples involving GP regression for engineering problems and a large-scale real dataset. We provide the computer implementation of GP-HMAT, HMAT adapted for GP likelihood and derivative computations, and the implementation of the last numerical example on a real dataset. We demonstrate superior scalability of the HMAT approach compared to built-in operator in MATLAB for large-scale linear solves Ax=y via a repeatable and verifiable empirical study. An extension to hierarchical semiseparable (HSS) matrices is discussed as future research.