M. K. Ballard, R. Amici, V. Shankar, L. A. Ferguson, M. Braginsky, R. M. Kirby. Towards an Extrinsic, CG-XFEM Approach Based on Hierarchical Enrichments for Modeling Progressive Fracture, Subtitled arXiv preprint arXiv:2104.14704, 2021.
We propose an extrinsic, continuous-Galerkin (CG), extended finite element method (XFEM) that generalizes the work of Hansbo and Hansbo to allow multiple Heaviside enrichments within a single element in a hierarchical manner. This approach enables complex, evolving XFEM surfaces in 3D that cannot be captured using existing CG-XFEM approaches. We describe an implementation of the method for 3D static elasticity with linearized strain for modeling open cracks as a salient step towards modeling progressive fracture. The implementation includes a description of the finite element model, hybrid implicit/explicit representation of enrichments, numerical integration method, and novel degree-of-freedom (DoF) enumeration algorithm. This algorithm supports an arbitrary number of enrichments within an element, while simultaneously maintaining a CG solution across elements. Additionally, our approach easily allows an implementation suitable for distributed computing systems. Enabled by the DoF enumeration algorithm, the proposed method lays the groundwork for a computational tool that efficiently models progressive fracture. To facilitate a discussion of the complex enrichment hierarchies, we develop enrichment diagrams to succinctly describe and visualize the relationships between the enrichments (and the fields they create) within an element. This also provides a unified language for discussing extrinsic XFEM methods in the literature. We compare several methods, relying on the enrichment diagrams to highlight their nuanced differences.
M. Carlson, X. Zheng, H. Sundar, G. E. Karniadakis, R. M. Kirby. An open-source parallel code for computing the spectral fractional Laplacian on 3D complex geometry domains, In Computer Physics Communications, Vol. 261, North-Holland, pp. 107695. 2021.
We present a spectral element algorithm and open-source code for computing the fractional Laplacian defined by the eigenfunction expansion on finite 2D/3D complex domains with both homogeneous and nonhomogeneous boundaries. We demonstrate the scalability of the spectral element algorithm on large clusters by constructing the fractional Laplacian based on computed eigenvalues and eigenfunctions using up to thousands of CPUs. To demonstrate the accuracy of this eigen-based approach for computing the factional Laplacian, we approximate the solutions of the fractional diffusion equation using the computed eigenvalues and eigenfunctions on a 2D quadrilateral, and on a 3D cubic and cylindrical domain, and compare the results with the contrived solutions to demonstrate fast convergence. Subsequently, we present simulation results for a fractional diffusion equation on a hand-shaped domain discretized with 3D hexahedra, as well as on a domain constructed from the Hanford site geometry corresponding to nonzero Dirichlet boundary conditions. Finally, we apply the algorithm to solve the surface quasi-geostrophic (SQG) equation on a 2D square with periodic boundaries. Simulation results demonstrate the accuracy, efficiency, and geometric flexibility of our algorithm and that our algorithm can capture the subtle dynamics of anomalous diffusion modeled by the fractional Laplacian on complex geometry domains. The included open-source code is the first of its kind.
J. Chilleri, Y. He, D. Bedrov, R. M. Kirby. Optimal allocation of computational resources based on Gaussian process: Application to molecular dynamics simulations, In Computational Materials Science, Vol. 188, Elsevier, pp. 110178. 2021.
Simulation models have been utilized in a wide range of real-world applications for behavior predictions of complex physical systems or material designs of large structures. While extensive simulation is mathematically preferable, external limitations such as available resources are often necessary considerations. With a fixed computational resource (i.e., total simulation time), we propose a Gaussian process-based numerical optimization framework for optimal time allocation over simulations at different locations, so that a surrogate model with uncertainty estimation can be constructed to approximate the full simulation. The proposed framework is demonstrated first via two synthetic problems, and later using a real test case of a glass-forming system with divergent dynamic relaxations where a Gaussian process is constructed to estimate the diffusivity and its uncertainty with respect to the temperature.
V. Keshavarzzadeh, M. Alirezaei, T. Tasdizen, R. M. Kirby. Image-Based Multiresolution Topology Optimization Using Deep Disjunctive Normal Shape Model, In Computer-Aided Design, Vol. 130, Elsevier, pp. 102947. 2021.
We present a machine learning framework for predicting the optimized structural topology design susing multiresolution data. Our approach primarily uses optimized designs from inexpensive coarse mesh finite element simulations for model training and generates high resolution images associated with simulation parameters that are not previously used. Our cost-efficient approach enables the designers to effectively search through possible candidate designs in situations where the design requirements rapidly change. The underlying neural network framework is based on a deep disjunctive normal shape model (DDNSM) which learns the mapping between the simulation parameters and segments of multi resolution images. Using this image-based analysis we provide a practical algorithm which enhances the predictability of the learning machine by determining a limited number of important parametric samples(i.e.samples of the simulation parameters)on which the high resolution training data is generated. We demonstrate our approach on benchmark compliance minimization problems including the 3D topology optimization where we show that the high-fidelity designs from the learning machine are close to optimal designs and can be used as effective initial guesses for the large-scale optimization problem.
V. Keshavarzzadeh, R. M. Kirby, A. Narayan. Multilevel Designed Quadrature for Partial Differential Equations with Random Inputs, In SIAM Journal on Scientific Computing, Vol. 43, No. 2, Society for Industrial and Applied Mathematics, pp. A1412-A1440. 2021.
We introduce a numerical method, multilevel designed quadrature for computing the statistical solution of partial differential equations with random input data. Similar to multilevel Monte Carlo methods, our method relies on hierarchical spatial approximations in addition to a parametric/stochastic sampling strategy. A key ingredient in multilevel methods is the relationship between the spatial accuracy at each level and the number of stochastic samples required to achieve that accuracy. Our sampling is based on flexible quadrature points that are designed for a prescribed accuracy, which can yield less overall computational cost compared to alternative multilevel methods. We propose a constrained optimization problem that determines the number of samples to balance the approximation error with the computational budget. We further show that the optimization problem is convex and derive analytic formulas for the optimal number of points at each level. We validate the theoretical estimates and the performance of our multilevel method via numerical examples on a linear elasticity and a steady state heat diffusion problem.
E. Laughton, V. Zala, A. Narayan, R. M. Kirby, D. Moxey. Fast Barycentric-Based Evaluation Over Spectral/hp Elements, Subtitled arXiv preprint arXiv:2103.03594, 2021.
As the use of spectral/hp element methods, and high-order finite element methods in general, continues to spread, community efforts to create efficient, optimized algorithms associated with fundamental high-order operations have grown. Core tasks such as solution expansion evaluation at quadrature points, stiffness and mass matrix generation, and matrix assembly have received tremendousattention. With the expansion of the types of problems to which high-order methods are applied, and correspondingly the growth in types of numerical tasks accomplished through high-order methods, the number and types of these core operations broaden. This work focuses on solution expansion evaluation at arbitrary points within an element. This operation is core to many postprocessing applications such as evaluation of streamlines and pathlines, as well as to field projection techniques such as mortaring. We expand barycentric interpolation techniques developed on an interval to 2D (triangles and quadrilaterals) and 3D (tetrahedra, prisms, pyramids, and hexahedra) spectral/hp element methods. We provide efficient algorithms for their implementations, and demonstrate their effectiveness using the spectral/hp element library Nektar++.
M. Rasouli, R. M. Kirby, H. Sundar. A Compressed, Divide and Conquer Algorithm for Scalable Distributed Matrix-Matrix Multiplication, In The International Conference on High Performance Computing in Asia-Pacific Region, pp. 110-119. 2021.
Matrix-matrix multiplication (GEMM) is a widely used linear algebra primitive common in scientific computing and data sciences. While several highly-tuned libraries and implementations exist, these typically target either sparse or dense matrices. The performance of these tuned implementations on unsupported types can be poor, and this is critical in cases where the structure of the computations is associated with varying degrees of sparsity. One such example is Algebraic Multigrid (AMG), a popular solver and preconditioner for large sparse linear systems. In this work, we present a new divide and conquer sparse GEMM, that is also highly performant and scalable when the matrix becomes dense, as in the case of AMG matrix hierarchies. In addition, we implement a lossless data compression method to reduce the communication cost. We combine this with an efficient communication pattern during distributed-memory GEMM to provide 2.24 times (on average) better performance than the state-of-the-art library PETSc. Additionally, we show that the performance and scalability of our method surpass PETSc even more when the density of the matrix increases. We demonstrate the efficacy of our methods by comparing our GEMM with PETSc on a wide range of matrices.
W. W. Xing, A. A. Shah, P. Wang, S. Zhe, Q. Fu, R. M. Kirby. Residual Gaussian process: A tractable nonparametric Bayesian emulator for multi-fidelity simulations, In Applied Mathematical Modelling, Vol. 97, Elsevier, pp. 36-56. 2021.
Challenges in multi-fidelity modelling relate to accuracy, uncertainty estimation and high-dimensionality. A novel additive structure is introduced in which the highest fidelity solution is written as a sum of the lowest fidelity solution and residuals between the solutions at successive fidelity levels, with Gaussian process priors placed over the low fidelity solution and each of the residuals. The resulting model is equipped with a closed-form solution for the predictive posterior, making it applicable to advanced, high-dimensional tasks that require uncertainty estimation. Its advantages are demonstrated on univariate benchmarks and on three challenging multivariate problems. It is shown how active learning can be used to enhance the model, especially with a limited computational budget. Furthermore, error bounds are derived for the mean prediction in the univariate case.
W. W. Xing, R. M. Kirby, S. Zhe. Deep coregionalization for the emulation of simulation-based spatial-temporal fields, In Journal of Computational Physics, Academic Press, pp. 109984. 2021.
Data-driven surrogate models are widely used for applications such as design optimization and uncertainty quantification, where repeated evaluations of an expensive simulator are required. For most partial differential equation (PDE) simulations, the outputs of interest are often spatial or spatial-temporal fields, leading to very high-dimensional outputs. Despite the success of existing data-driven surrogates for high-dimensional outputs, most methods require a significant number of samples to cover the response surface in order to achieve a reasonable degree of accuracy. This demand makes the idea of surrogate models less attractive considering the high-computational cost to generate the data. To address this issue, we exploit the multifidelity nature of a PDE simulation and introduce deep coregionalization, a Bayesian nonparametric autoregressive framework for efficient emulation of spatial-temporal fields. To effectively extract the output correlations in the context of multifidelity data, we develop a novel dimension reduction technique, residual principal component analysis. Our model can simultaneously capture the rich output correlations and the fidelity correlations and make high-fidelity predictions with only a small number of expensive, high-fidelity simulation samples. We show the advantages of our model in three canonical PDE models and a fluid dynamics problem. The results show that the proposed method can not only approximate simulation results with significantly less cost (by bout 10%-25%) but also further improve model accuracy.
Y. Xu, V. Keshavarzzadeh, R. M. Kirby, A. Narayan. A bandit-learning approach to multifidelity approximation, Subtitled arXiv preprint arXiv:2103.15342, 2021.
Multifidelity approximation is an important technique in scientific computation and simulation. In this paper, we introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates of the parameters of interest. Under a linear model assumption, we formulate a multifidelity approximation as a modified stochastic bandit, and analyze the loss for a class of policies that uniformly explore each model before exploiting. Utilizing the estimated conditional mean-squared error, we propose a consistent algorithm, adaptive Explore-Then-Commit (AETC), and establish a corresponding trajectory-wise optimality result. These results are then extended to the case of vector-valued responses, where we demonstrate that the algorithm is efficient without the need to worry about estimating high-dimensional parameters. The main advantage of our approach is that we require neither hierarchical model structure nor\textit a priori knowledge of statistical information (eg, correlations) about or between models. Instead, the AETC algorithm requires only knowledge of which model is a trusted high-fidelity model, along with (relative) computational cost estimates of querying each model. Numerical experiments are provided at the end to support our theoretical findings.
T. A. J. Ouermi, R. M. Kirby, M. Berzins. Numerical Testing of a New Positivity-Preserving Interpolation Algorithm, Subtitled arXiv, 2020.
An important component of a number of computational modeling algorithms is an interpolation method that preserves the positivity of the function being interpolated. This report describes the numerical testing of a new positivity-preserving algorithm that is designed to be used when interpolating from a solution defined on one grid to different spatial grid. The motivating application is a numerical weather prediction (NWP) code that uses spectral elements as the discretization choice for its dynamics core and Cartesian product meshes for the evaluation of its physics routines. This combination of spectral elements, which use nonuniformly spaced quadrature/collocation points, and uniformly-spaced Cartesian meshes combined with the desire to maintain positivity when moving between these necessitates our work. This new approach is evaluated against several typical algorithms in use on a range of test problems in one or more space dimensions. The results obtained show that the new method is competitive in terms of observed accuracy while at the same time preserving the underlying positivity of the functions being interpolated.
The automatic inverse design of three-dimensional plasmonic nanoparticles enables scientists and engineers to explore a wide design space and to maximize a device's performance. However, due to the large uncertainty in the nanofabrication process, we may not be able to obtain a deterministic value of the objective, and the objective may vary dramatically with respect to a small variation in uncertain parameters. Therefore, we take into account the uncertainty in simulations and adopt a classical robust design model for a robust design. In addition, we propose an efficient numerical procedure for the robust design to reduce the computational cost of the process caused by the consideration of the uncertainty. Specifically, we use a global sensitivity analysis method to identify the important random variables and consider the non-important ones as deterministic, and consequently reduce the dimension of the stochastic space. In addition, we apply the generalized polynomial chaos expansion method for constructing computationally cheaper surrogate models to approximate and replace the full simulations. This efficient robust design procedure is performed by varying the particles' material among the most commonly used plasmonic materials such as gold, silver, and aluminum, to obtain different robust optimal shapes for the best enhancement of electric fields.
As the finite element method (FEM) and the finite volume method (FVM), both traditional and high-order variants, continue their proliferation into various applied engineering disciplines, it is important that the visualization techniques and corresponding data analysis tools that act on the results produced by these methods faithfully represent the underlying data. To state this in another way: the interpretation of data generated by simulation needs to be consistent with the numerical schemes that underpin the specific solver technology. As the verifiable visualization literature has demonstrated: visual artifacts produced by the introduction of either explicit or implicit data transformations, such as data resampling, can sometimes distort or even obfuscate key scientific features in the data. In this paper, we focus on the handling of elemental continuity, which is often only C0 continuous or piecewise discontinuous, when visualizing primary or derived fields from FEM or FVM simulations. We demonstrate that traditional data handling and visualization of these fields introduce visual errors. In addition, we show how the use of the recently proposed line-SIAC filter provides a way of handling elemental continuity issues in an accuracy-conserving manner with the added benefit of casting the data in a smooth context even if the representation is element discontinuous.
V. Keshavarzzadeh, R.M. Kirby, A. Narayan. Numerical integration in multiple dimensions with designed quadrature, In CoRR, 2018.
We present a systematic computational framework for generating positive quadrature rules in multiple dimensions on general geometries. A direct moment-matching formulation that enforces exact integration on polynomial subspaces yields nonlinear conditions and geometric constraints on nodes and weights. We use penalty methods to address the geometric constraints, and subsequently solve a quadratic minimization problem via the Gauss-Newton method. Our analysis provides guidance on requisite sizes of quadrature rules for a given polynomial subspace, and furnishes useful user-end stability bounds on error in the quadrature rule in the case when the polynomial moment conditions are violated by a small amount due to, e.g., finite precision limitations or stagnation of the optimization procedure. We present several numerical examples investigating optimal low-degree quadrature rules, Lebesgue constants, and 100-dimensional quadrature. Our capstone examples compare our quadrature approach to popular alternatives, such as sparse grids and quasi-Monte Carlo methods, for problems in linear elasticity and topology optimization.
Performance optimization in the petascale era and beyond in the exascale era has and will require modifications of legacy codes to take advantage of new architectures with large core counts and SIMD units. The Numerical Weather Prediction (NWP) physics codes considered here are optimized using thread-local structures of arrays (SOA). High-level and low-level optimization strategies are applied to the WRF Single-Moment 6-Class Microphysics Scheme (WSM6) and Global Forecast System (GFS) physics codes used in the NEPTUNE forecast code. By building on previous work optimizing WSM6 on the Intel Knights Landing (KNL), it is shown how to further optimize WMS6 and GFS physics, and GFS radiation on Intel KNL, Haswell, and potentially on future micro-architectures with many cores and SIMD vector units. The optimization techniques used herein employ thread-local structures of arrays (SOA), an OpenMP directive, OMP SIMD, and minor code transformations to enable better utilization of SIMD units, increase parallelism, improve locality, and reduce memory traffic. The optimized versions of WSM6, GFS physics, GFS radiation run 70, 27, and 23 faster (respectively) on KNL and 26, 18 and 30 faster (respectively) on Haswell than their respective original serial versions. Although this work targets WRF physics schemes, the findings are transferable to other performance optimization contexts and provide insight into the optimization of codes with complex physical models for present and near-future architectures with many core and vector units.
Spectral Element and hp Methods, In Encyclopedia of Computational Mechanics Second Edition, John Wiley & Sons, Ltd, pp. 1--43. 2018.Y. Yu, R.M. Kirby, G.E. Karniadakis.
Spectral/hp element methods provide high‐order discretization, which is essential in the longtime integration of advection–diffusion systems and for capturing dynamic instabilities in solids. In this chapter, we review the main formulations for simulations of incompressible and compressible viscous flows as well as for solid mechanics and present several examples with some emphasis on fluid–structure interactions and interfaces. The first generation of (nodal) spectral elements was limited to relatively simple geometries and smooth solutions. However, the new generation of spectral/hp elements, consisting of both nodal and modal forms, can handle very complex geometries using unstructured grids and can capture strong shocks by employing discontinuous Galerkin methods. New flexible formulations allow simulations of multiphysics problems including extremely complex geometries and multiphase flows. Several implementation strategies have also been developed on the basis of multilevel parallel algorithms that allow dynamic p ‐refinement at constant wall clock time. After three decades of intense developments, spectral element and hp methods are mature and efficient to be used effectively in applications of industrial complexity. They provide the capabilities that standard finite element and finite volume methods do, but, in addition, they exhibit high‐order accuracy and error control.
We present a new iterative technique based on radial basis function (RBF) interpolation and smoothing for the generation and smoothing of curvilinear meshes from straight-sided or other curvilinear meshes. Our technique approximates the coordinate deformation maps in both the interior and boundary of the curvilinear output mesh by using only scattered nodes on the boundary of the input mesh as data sites in an interpolation problem. Our technique produces high-quality meshes in the deformed domain even when the deformation maps are singular due to a new iterative algorithm based on modification of the RBF shape parameter. Due to the use of RBF interpolation, our technique is applicable to both 2D and 3D curvilinear mesh generation without significant modification.
A. Bhaduri, Y. He, M.D. Shields, L. Graham-Brady, R.M. Kirby. Stochastic collocation approach with adaptive mesh refinement for parametric uncertainty analysis, In CoRR, 2017.
Presence of a high-dimensional stochastic parameter space with discontinuities poses major computational challenges in analyzing and quantifying the effects of the uncertainties in a physical system. In this paper, we propose a stochastic collocation method with adaptive mesh refinement (SCAMR) to deal with high dimensional stochastic systems with discontinuities. Specifically, the proposed approach uses generalized polynomial chaos (gPC) expansion with Legendre polynomial basis and solves for the gPC coefficients using the least squares method. It also implements an adaptive mesh (element) refinement strategy which checks for abrupt variations in the output based on the second order gPC approximation error to track discontinuities or non-smoothness. In addition, the proposed method involves a criterion for checking possible dimensionality reduction and consequently, the decomposition of the full-dimensional problem to a number of lower-dimensional subproblems. Specifically, this criterion checks all the existing interactions between input dimensions of a specific problem based on the high-dimensional model representation (HDMR) method, and therefore automatically provides the subproblems which only involve interacting dimensions. The efficiency of the approach is demonstrated using both smooth and non-smooth function examples with input dimensions up to 300, and the approach is compared against other existing algorithms.
J. Docampo-Sánchez, J.K. Ryan, M. Mirzargar, R.M. Kirby.
Multi-Dimensional Filtering: Reducing the Dimension Through Rotation Read More: https://epubs.siam.org/doi/abs/10.1137/16M1097845, In SIAM Journal on Scientific Computing, Vol. 39, No. 5, SIAM, pp. A2179--A2200. Jan, 2017.
Over the past few decades there has been a strong effort toward the development of Smoothness-Increasing Accuracy-Conserving (SIAC) filters for discontinuous Galerkin (DG) methods, designed to increase the smoothness and improve the convergence rate of the DG solution through this postprocessor. These advantages can be exploited during flow visualization, for example, by applying the SIAC filter to DG data before streamline computations [M. Steffen, S. Curtis, R. M. Kirby, and J. K. Ryan, IEEE Trans. Vis. Comput. Graphics, 14 (2008), pp. 680--692]. However, introducing these filters in engineering applications can be challenging since a tensor product filter grows in support size as the field dimension increases, becoming computationally expensive. As an alternative, [D. Walfisch, J. K. Ryan, R. M. Kirby, and R. Haimes, J. Sci. Comput., 38 (2009), pp. 164--184] proposed a univariate filter implemented along the streamline curves. Until now, this technique remained a numerical experiment. In this paper we introduce the line SIAC filter and explore how the orientation, structure, and filter size affect the order of accuracy and global errors. We present theoretical error estimates showing how line filtering preserves the properties of traditional tensor product filtering, including smoothness and improvement in the convergence rate. Furthermore, numerical experiments are included, exhibiting how these filters achieve the same accuracy at significantly lower computational costs, becoming an attractive tool for the scientific visualization community.
C. Gritton, J. Guilkey, J. Hooper, D. Bedrov, R. M. Kirby, M. Berzins. Using the material point method to model chemical/mechanical coupling in the deformation of a silicon anode, In Modelling and Simulation in Materials Science and Engineering, Vol. 25, No. 4, pp. 045005. 2017.
The lithiation and delithiation of a silicon battery anode is modeled using the material point method (MPM). The main challenges in modeling this process using the MPM is to simulate stress dependent diffusion coupled with concentration dependent stress within a material that undergoes large deformations. MPM is chosen as the numerical method of choice because of its ability to handle large deformations. A method for modeling diffusion within MPM is described. A stress dependent model for diffusivity and three different constitutive models that fully couple the equations for stress with the equations for diffusion are considered. Verifications tests for the accuracy of the numerical implementations of the models and validation tests with experimental results show the accuracy of the approach. The application of the fully coupled stress diffusion model implemented in MPM is applied to modeling the lithiation and delithiation of silicon nanopillars.