Cyber Science and Engineering: A Report of the National Science Foundation Advisory Committee for Cyberinfrastructure Task Force on Grand Challenges|
J.T. Oden, O. Ghattas, J.L. King, B.I. Schneider, K. Bartschat, F. Darema, J. Drake, T. Dunning, D. Estep, S. Glotzer, M. Gurnis, C.R. Johnson, D.S. Katz, D. Keyes, S. Kiesler, S. Kim, J. Kinter, G. Klimeck, C.W. McCurdy, R. Moser, C. Ott, A. Patra, L. Petzold, T. Schlick, K. Schulten, V. Stodden, J. Tromp, M. Wheeler, S.J. Winter, C. Wu, K. Yelick. Note: NSF Report, 2011.
This document contains the findings and recommendations of the NSF – Advisory Committee for Cyberinfrastructure Task Force on Grand Challenges addressed by advances in Cyber Science and Engineering. The term Cyber Science and Engineering (CS&E) is introduced to describe the intellectual discipline that brings together core areas of science and engineering, computer science, and computational and applied mathematics in a concerted effort to use the cyberinfrastructure (CI) for scientific discovery and engineering innovations; CS&E is computational and data-based science and engineering enabled by CI. The report examines a host of broad issues faced in addressing the Grand Challenges of science and technology and explores how those can be met by advances in CI. Included in the report are recommendations for new programs and initiatives that will expand the portfolio of the Office of Cyberinfrastructure and that will be critical to advances in all areas of science and engineering that rely on the CI.
Advisory Committee for CyberInfrastructure Task Force on Software for Science and Engineering|
D. Keyes, V. Taylor, T. Hey, S. Feldman, G. Allen, P. Colella, P. Cummings, F. Darema, J. Dongarra, T. Dunning, M. Ellisman, I. Foster, W. Gropp, C.R. Johnson, C. Kamath, R. Madduri, M. Mascagni, S.G. Parker, P. Raghavan, A. Trefethen, S. Valcourt, A. Patra, F. Choudhury, C. Cooper, P. McCartney, M. Parashar, T. Russell, B. Schneider, J. Schopf, N. Sharp. Note: NSF Report, 2011.
The Software for Science and Engineering (SSE) Task Force commenced in June 2009 with a charge that consisted of the following three elements:Identify specific needs and opportunities across the spectrum of scientific software infrastructure. Characterize the specific needs and analyze technical gaps and opportunities for NSF to meet those needs through individual and systemic approaches. Design responsive approaches. Develop initiatives and programs led (or co-led) by NSF to grow, develop, and sustain the software infrastructure needed to support NSF’s mission of transformative research and innovation leading to scientific leadership and technological competitiveness. Address issues of institutional barriers. Anticipate, analyze and address both institutional and exogenous barriers to NSF’s promotion of such an infrastructure.
The SSE Task Force members participated in bi-weekly telecons to address the given charge. The telecons often included additional distinguished members of the scientific community beyond the task force membership engaged in software issues, as well as personnel from federal agencies outside of NSF who manage software programs. It was quickly acknowledged that a number of reports loosely and tightly related to SSE existed and should be leveraged. By September 2009, the task formed had formed three subcommittees focused on the following topics: (1) compute-intensive science, (2) data-intensive science, and (3) software evolution.
Sensitivity Analysis for the Optimization of Radiofrequency Ablation in the Presence of Material Parameter Uncertainty|
I. Altrogge, T. Preusser, T. Kroeger, S. Haase, T. Paetz, R.M. Kirby. In International Journal for Uncertainty Quantification, 2011.
We present a sensitivity analysis of the optimization of the probe placement in radiofrequency (RF) ablation which takes the uncertainty associated with bio-physical tissue properties (electrical and thermal conductivity) into account. Our forward simulation of RF ablation is based upon a system of partial differential equations (PDEs) that describe the electric potential of the probe and the steady state of the induced heat. The probe placement is optimized by minimizing a temperature-based objective function such that the volume of destroyed tumor tissue is maximized. The resulting optimality system is solved with a multi-level gradient descent approach. By evaluating the corresponding optimality system for certain realizations of tissue parameters (i.e. at certain, well-chosen points in the stochastic space) the sensitivity of the system can be analyzed with respect to variations in the tissue parameters. For the interpolation in the stochastic space we use a stochastic finite element approach with piecewise multilinear ansatz functions on adaptively refined, hierarchical grids. We underscore the significance of the approach by applying the optimization to CT data obtained from a real RF ablation case.
Keywords: netl, stochastic sensitivity analysis, stochastic partial dierential equations, stochastic nite element method, adaptive sparse grid, heat transfer, multiscale modeling, representation of uncertainty
Implementation and Verification of a Nodally-Integrated Tetrahedral Element in FEBio|
SCI Technical Report, S.A. Maas, B.J. Ellis, D.S. Rawlins, L.T. Edgar, C.R. Henak, J.A. Weiss. No. UUSCI-2011-007, SCI Institute, University of Utah, 2011.
Finite element simulations in computational biomechanics commonly require the discretization of extremely complicated geometries. Creating meshes for these complex geometries can be very difficult and time consuming using hexahedral elements. Automatic meshing algorithms exist for tetrahedral elements, but these elements often have numerical problems that discourage their use in complex finite element models. To overcome these problems we have implemented a stabilized, nodally-integrated tetrahedral element formulation in FEBio, our in-house developed finite element code, allowing researchers to use linear tetrahedral elements in their models and still obtain accurate solutions. In addition to facilitating automatic mesh generation, this also allows researchers to use mesh refinement algorithms which are fairly well developed for tetrahedral elements but not so much for hexahedral elements. In this document, the implementation of the stabilized, nodallyintegrated, tetrahedral element, named the \"UT4 element\", is described. Two slightly different variations of the nodally integrated tetrahedral element are considered. In one variation the entire virtual work is stabilized and in the other one the stabilization is only applied to the isochoric part of the virtual work. The implementation of both formulations has been verified and the convergence behavior illustrated using the patch test and three verification problems. Also, a model from our laboratory with very complex geometry is discretized and analyzed using the UT4 element to show its utility for a problem from the biomechanics literature. The convergence behavior of the UT4 element does vary depending on problem, tetrahedral mesh structure and choice of formulation parameters, but the results from the verification problems should assure analysts that a converged solution using the UT4 element can be obtained that is more accurate than the solution from a classical linear tetrahedral formulation.
Dark Regions of No-Reflow on Late Gadolinium Enhancement Magnetic Resonance Imaging Result in Scar Formation After Atrial Fibrillation Ablation|
C.J. McGann, E.G. Kholmovski, J.J. Blauer, S. Vijayakumar, T.S. Haslam, J.E. Cates, E.V. DiBella, N.S. Burgon, B. Wilson, A.J. Alexander, M.W. Prastawa, M. Daccarett, G. Vergara, N.W. Akoum, D.L. Parker, R.S. MacLeod, N.F. Marrouche. In Journal of the American College of Cardiology, Vol. 58, No. 2, pp. 177--185. 2011.
PubMed ID: 21718914
Objectives: The aim of this study was to assess acute ablation injuries seen on late gadolinium enhancement (LGE) magnetic resonance imaging (MRI) immediately post-ablation (IPA) and the association with permanent scar 3 months post-ablation (3moPA).
Background: Success rates for atrial fibrillation catheter ablation vary significantly, in part because of limited information about the location, extent, and permanence of ablation injury at the time of procedure. Although the amount of scar on LGE MRI months after ablation correlates with procedure outcomes, early imaging predictors of scar remain elusive.
Methods: Thirty-seven patients presenting for atrial fibrillation ablation underwent high-resolution MRI with a 3-dimensional LGE sequence before ablation, IPA, and 3moPA using a 3-T scanner. The acute left atrial wall injuries on IPA scans were categorized as hyperenhancing (HE) or nonenhancing (NE) and compared with scar 3moPA.
Results: Heterogeneous injuries with HE and NE regions were identified in all patients. Dark NE regions in the left atrial wall on LGE MRI demonstrate findings similar to the \"no-reflow\" phenomenon. Although the left atrial wall showed similar amounts of HE, NE, and normal tissue IPA (37.7 ± 13\%, 34.3 ± 14\%, and 28.0 ± 11\%, respectively; p = NS), registration of IPA injuries with 3moPA scarring demonstrated that 59.0 ± 19\% of scar resulted from NE tissue, 30.6 ± 15\% from HE tissue, and 10.4 ± 5\% from tissue identified as normal. Paired t-test comparisons were all statistically significant among NE, HE, and normal tissue types (p less than 0.001). Arrhythmia recurrence at 1-year follow-up correlated with the degree of wall enhancement 3moPA (p = 0.02).
Conclusion: Radiofrequency ablation results in heterogeneous injury on LGE MRI with both HE and NE wall lesions. The NE lesions demonstrate no-reflow characteristics and reveal a better predictor of final scar at 3 months. Scar correlates with procedure outcomes, further highlighting the importance of early scar prediction. (J Am Coll Cardiol 2011;58:177–85) © 2011 by the American College of Cardiology Foundation
Smoothness-Increasing Accuracy-Conserving (SIAC) Postprocessing for Discontinuous Galerkin Solutions Over Structured Triangular Meshes|
H. Mirzaee, Liangyue, J.K. Ryan, R.M. Kirby. In SIAM Journal of Numerical Analysis, Vol. 49, No. 5, pp. 1899--1920. 2011.
Theoretically and computationally, it is possible to demonstrate that the order of accuracy of a discontinuous Galerkin (DG) solution for linear hyperbolic equations can be improved from order k+1 to 2k+1 through the use of smoothness-increasing accuracy-conserving (SIAC) filtering. However, it is a computationally complex task to perform this in an efficient manner, which becomes an even greater issue considering nonquadrilateral mesh structures. In this paper, we present an extension of this SIAC filter to structured triangular meshes. The basic theoretical assumption in the previous implementations of the postprocessor limits the use to numerical solutions solved over a quadrilateral mesh. However, this assumption is restrictive, which in turn complicates the application of this postprocessing technique to general tessellations. Additionally, moving from quadrilateral meshes to triangulated ones introduces more complexity in the calculations as the number of integrations required increases. In this paper, we extend the current theoretical results to variable coefficient hyperbolic equations over structured triangular meshes and demonstrate the effectiveness of the application of this postprocessor to structured triangular meshes as well as exploring the effect of using inexact quadrature. We show that there is a direct theoretical extension to structured triangular meshes for hyperbolic equations with bounded variable coefficients. This is a challenging first step toward implementing SIAC filters for unstructured tessellations. We show that by using the usual B-spline implementation, we are able to improve on the order of accuracy as well as decrease the magnitude of the errors. These results are valid regardless of whether exact or inexact integration is used. The results here demonstrate that it is still possible, both theoretically and computationally, to improve to 2k+1 over the DG solution itself for structured triangular meshes.
Efficient Implementation of Smoothness-Increasing Accuracy-Conserving (SIAC) Filters for Discontinuous Galerkin Solutions|
H. Mirzaee, J.K. Ryan, R.M. Kirby. In Journal of Scientific Computing, pp. (in press). 2011.
The discontinuous Galerkin (DG) methods provide a high-order extension of the finite volume method in much the same way as high-order or spectral/hp elements extend standard finite elements. However, lack of inter-element continuity is often contrary to the smoothness assumptions upon which many post-processing algorithms such as those used in visualization are based. Smoothness-increasing accuracy-conserving (SIAC) filters were proposed as a means of ameliorating the challenges introduced by the lack of regularity at element interfaces by eliminating the discontinuity between elements in a way that is consistent with the DG methodology; in particular, high-order accuracy is preserved and in many cases increased. The goal of this paper is to explicitly define the steps to efficient computation of this filtering technique as applied to both structured triangular and quadrilateral meshes. Furthermore, as the SIAC filter is a good candidate for parallelization, we provide, for the first time, results that confirm anticipated performance scaling when parallelized on a shared-memory multi-processor machine.
Numerical Solution of Linear Volterra Integral Equations of the Second Kind with Sharp Gradients|
S.A. Isaacson, R.M. Kirby. In Journal of Computational and Applied Mathematics, Vol. 235, No. 14, pp. 4283--4301. 2011.
Collocation methods are a well-developed approach for the numerical solution of smooth and weakly singular Volterra integral equations. In this paper, we extend these methods through the use of partitioned quadrature based on the qualocation framework, to allow the efficient numerical solution of linear, scalar Volterra integral equations of the second kind with smooth kernels containing sharp gradients. In this case, the standard collocation methods may lose computational efficiency despite the smoothness of the kernel. We illustrate how the qualocation framework can allow one to focus computational effort where necessary through improved quadrature approximations, while keeping the solution approximation fixed. The computational performance improvement introduced by our new method is examined through several test examples. The final example we consider is the original problem that motivated this work: the problem of calculating the probability density associated with a continuous-time random walk in three dimensions that may be killed at a fixed lattice site. To demonstrate how separating the solution approximation from quadrature approximation may improve computational performance, we also compare our new method to several existing Gregory, Sinc, and global spectral methods, where quadrature approximation and solution approximation are coupled.
Towards the Development on an h-p-Refinement Strategy Based Upon Error Estimate Sensitivity|
P.K. Jimack, R.M. Kirby. In Computers and Fluids, Vol. 46, No. 1, pp. 277--281. 2011.
The use of (a posteriori) error estimates is a fundamental tool in the application of adaptive numerical methods across a range of fluid flow problems. Such estimates are incomplete however, in that they do not necessarily indicate where to refine in order to achieve the most impact on the error, nor what type of refinement (for example h-refinement or p-refinement) will be best. This paper extends preliminary work of the authors (Comm Comp Phys, 2010;7:631–8), which uses adjoint-based sensitivity estimates in order to address these questions, to include application with p-refinement to arbitrary order and the use of practical a posteriori estimates. Results are presented which demonstrate that the proposed approach can guide both the h-refinement and the p-refinement processes, to yield improvements in the adaptive strategy compared to the use of more orthodox criteria.
To CG or to HDG: A Comparative Study|
R.M. Kirby, B. Cockburn, S.J. Sherwin. In Journal of Scientific Computing, Note: published online, 2011.
Hybridization through the border of the elements (hybrid unknowns) combined with a Schur complement procedure (often called static condensation in the context of continuous Galerkin linear elasticity computations) has in various forms been advocated in the mathematical and engineering literature as a means of accomplishing domain decomposition, of obtaining increased accuracy and convergence results, and of algorithm optimization. Recent work on the hybridization of mixed methods, and in particular of the discontinuous Galerkin (DG) method, holds the promise of capitalizing on the three aforementioned properties; in particular, of generating a numerical scheme that is discontinuous in both the primary and flux variables, is locally conservative, and is computationally competitive with traditional continuous Galerkin (CG) approaches. In this paper we present both implementation and optimization strategies for the Hybridizable Discontinuous Galerkin (HDG) method applied to two dimensional elliptic operators. We implement our HDG approach within a spectral/hp element framework so that comparisons can be done between HDG and the traditional CG approach.
We demonstrate that the HDG approach generates a global trace space system for the unknown that although larger in rank than the traditional static condensation system in CG, has significantly smaller bandwidth at moderate polynomial orders. We show that if one ignores set-up costs, above approximately fourth-degree polynomial expansions on triangles and quadrilaterals the HDG method can be made to be as efficient as the CG approach, making it competitive for time-dependent problems even before taking into consideration other properties of DG schemes such as their superconvergence properties and their ability to handle hp-adaptivity.
Formal Specification of MPI 2.0: Case Study in Specifying a Practical Concurrent Programming API|
G. Li, R. Palmer, M. DeLisi, G. Gopalakrishnan, R.M. Kirby. In Science of Computer Programming, Vol. 76, pp. 65--81. 2011.
We describe the first formal specification of a non-trivial subset of MPI, the dominant communication API in high performance computing. Engineering a formal specification for a non-trivial concurrency API requires the right combination of rigor, executability, and traceability, while also serving as a smooth elaboration of a pre-existing informal specification. It also requires the modularization of reusable specification components to keep the length of the specification in check. Long-lived APIs such as MPI are not usually 'textbook minimalistic' because they support a diverse array of applications, a diverse community of users, and have efficient implementations over decades of computing hardware. We choose the TLA+ notation to write our specifications, and describe how we organized the specification of around 200 of the 300 MPI 2.0 functions. We detail a handful of these functions in this paper, and assess our specification with respect to the aforementioned requirements. We close with a description of possible approaches that may help render the act of writing, understanding, and validating the specifications of concurrency APIs much more productive.
Direct Isosurface Visualization of Hex-Based High-Order Geometry and Attribute Representations|
T. Martin, E. Cohen, R.M. Kirby. In IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. PP, No. 99, pp. 1--14. 2011.
In this paper, we present a novel isosurface visualization technique that guarantees the accuarate visualization of isosurfaces with complex attribute data defined on (un-)structured (curvi-)linear hexahedral grids. Isosurfaces of high-order hexahedralbased finite element solutions on both uniform grids (including MRI and CT scans) and more complex geometry represent a domain of interest that can be rendered using our algorithm. Additionally, our technique can be used to directly visualize solutions and attributes in isogeometric analysis, an area based on trivariate high-order NURBS (Non-Uniform Rational B-splines) geometry and attribute representations for the analysis. Furthermore, our technique can be used to visualize isosurfaces of algebraic functions. Our approach combines subdivision and numerical root-finding to form a robust and efficient isosurface visualization algorithm that does not miss surface features, while finding all intersections between a view frustum and desired isosurfaces. This allows the use of view-independent transparency in the rendering process. We demonstrate our technique through a straightforward CPU implementation on both complexstructured and complex-unstructured geometry with high-order simulation solutions, isosurfaces of medical data sets, and isosurfaces of algebraic functions.
Finite element implementation of mechanochemical phenomena in neutral deformable porous media under finite deformation|
G.A. Ateshian, M.B. Albro, S.A. Maas, J.A. Weiss. In Journal of Biomechanical Engineering, Vol. 133, No. 8, 2011.
Biological soft tissues and cells may be subjected to mechanical as well as chemical (osmotic) loading under their natural physiological environment or various experimental conditions. The interaction of mechanical and chemical effects may be very significant under some of these conditions, yet the highly nonlinear nature of the set of governing equations describing these mechanisms poses a challenge for the modeling of such phenomena. This study formulated and implemented a finite element algorithm for analyzing mechanochemical events in neutral deformable porous media under finite deformation. The algorithm employed the framework of mixture theory to model the porous permeable solid matrix and interstitial fluid, where the fluid consists of a mixture of solvent and solute. A special emphasis was placed on solute-solid matrix interactions, such as solute exclusion from a fraction of the matrix pore space (solubility) and frictional momentum exchange that produces solute hindrance and pumping under certain dynamic loading conditions. The finite element formulation implemented full coupling of mechanical and chemical effects, providing a framework where material properties and response functions may depend on solid matrix strain as well as solute concentration. The implementation was validated using selected canonical problems for which analytical or alternative numerical solutions exist. This finite element code includes a number of unique features that enhance the modeling of mechanochemical phenomena in biological tissues. The code is available in the public domain, open source finite element program FEBio (http://mrl.sci.utah.edu/software). [DOI: 10.1115/1.4004810]
From h to p Efficiently: Selecting the Optimal Spectral/hp Discretisation in Three Dimensions|
C.D. Cantwell, S.J. Sherwin, R.M. Kirby, P.H.J. Kelly. In Mathematical Modelling of Natural Phenomena, Vol. 6, No. 3, pp. 84--96. 2011.
Finding consistent strain distributions in the glenohumeral capsule between two subjects: Implications for development of physical examinations|
N.J. Drury, B.J. Ellis, J.A. Weiss, P.J. McMahon, R.E. Debski. In Journal of Biomechanics, Vol. 44, No. 4, pp. 607-613. February, 2011.
The anterior-inferior glenohumeral capsule is the primary passive stabilizer to the glenohumeral joint during anterior dislocation. Physical examinations following dislocation are crucial for proper diagnosis of capsule pathology; however,they are not standardized for joint position which may lead to misdiagnoses and poor outcomes. To suggest joint positions for physical examinations where the stability provided by the capsule may be consistent among patients, the objective of this study was to evaluate the distribution of maximum principal strain on the anterior-inferior capsule using two validated subject-specific finite element models of the glenohumeral joint at clinically relevant joint positions. The joint positions with 25 N anterior load applied at 60° of glenohumeral abduction and 10°, 20°, 30° and 40° of external rotation resulted in distributions of strain that were similar between shoulders(r2≥0.7). Furthermore, those positions with 20-40° of external rotation resulted in capsule strains on the glenoid side of the anterior band of the inferior glenohumeral ligament that were significantly greater than in all other capsule regions. These findings suggest that anterior stability provided by the anterior-inferior capsule may be consistent among subjects at joint positions with 60° of glenohumeral abduction and a mid-range (20-40°) of external rotation, and that the glenoid side has the greatest contribution to stability at these joint positions. Therefore, it may be possible to establish standard joint positions for physical examinations that clinicians can use to effectively diagnose pathology in the anterior-inferior capsule following dislocation and lead to improved outcomes.
The capsule's contribution to total hip construct stability - a finite element analysis|
J.M. Elkins, J.S. Stroud, M.J. Rudert, Y. Tochigi, D.R. Pedersen, B.J. Ellis, J.J. Callaghan, J.A. Weiss, T.D. Brown. In Journal of Orthopedic Research, Vol. 29, No. 11, Note: William Harris, MD Award, pp. 1642--1648. November, 2011.
Instability is a significant concern in total hip arthroplasty (THA), particularly when there is structural compromise of the capsule due to pre-existing pathology or due to necessities of surgical approach. An experimentally grounded fiber-direction-based finite element model of the hip capsule was developed, and was integrated with an established three-dimensional model of impingement/dislocation. Model validity was established by close similarity to results from a cadaveric experiment in a servohydraulic hip simulator. Parametric computational runs explored effects of graded levels of capsule thickness, of regional detachment from the capsule's femoral or acetabular insertions, of surgical incisions of capsule substance, and of capsule defect repairs. Depending strongly upon the specific site, localized capsule defects caused varying degrees of construct stability compromise, with several specific situations involving over 60\% decrement in dislocation resistance. Construct stability was returned substantially toward intact-capsule levels following well-conceived repairs, although the suture sites involved were often at substantial risk of failure. These parametric model results underscore the importance of retaining or robustly repairing capsular structures in THA, in order to maximize overall construct stability. © 2011 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res 29:1642–1648, 2011
Defect Sampling in Global Error Estimation for ODEs and Method-Of-Lines PDEs Using Adjoint Methods|
SCI Technical Report, L.T. Tran, M. Berzins. No. UUSCI-2011-006, SCI Institute, University of Utah, 2011.
A Conservered Developmental Patterning Network Produces Quantitatively Different Output in Multiple Species of Drosophila|
C. Fowlkes, K. Eckenrode, M. Bragdon, M.D. Meyer, Z. Wunderlich, L. Simirenko, C. Luengo, S. Keranen, C. Henriquez, D. Knowles, M. Biggin, M. Eisen, A. DePace. In PLoS Genetics, Vol. 7, No. 10:e1002346, pp. 17 pages. October, 2011.
Differences in the level, timing, or location of gene expression can contribute to alternative phenotypes at the molecular and organismal level. Understanding the origins of expression differences is complicated by the fact that organismal morphology and gene regulatory networks could potentially vary even between closely related species. To assess the scope of such changes, we used high-resolution imaging methods to measure mRNA expression in blastoderm embryos of Drosophila yakuba and Drosophila pseudoobscura and assembled these data into cellular resolution atlases, where expression levels for 13 genes in the segmentation network are averaged into species-specific, cellular resolution morphological frameworks. We demonstrate that the blastoderm embryos of these species differ in their morphology in terms of size, shape, and number of nuclei. We present an approach to compare cellular gene expression patterns between species, while accounting for varying embryo morphology, and apply it to our data and an equivalent dataset for Drosophila melanogaster. Our analysis reveals that all individual genes differ quantitatively in their spatio-temporal expression patterns between these species, primarily in terms of their relative position and dynamics. Despite many small quantitative differences, cellular gene expression profiles for the whole set of genes examined are largely similar. This suggests that cell types at this stage of development are conserved, though they can differ in their relative position by up to 3-4 cell widths and in their relative proportion between species by as much as 5-fold. Quantitative differences in the dynamics and relative level of a subset of genes between corresponding cell types may reflect altered regulatory functions between species. Our results emphasize that transcriptional networks can diverge over short evolutionary timescales and that even small changes can lead to distinct output in terms of the placement and number of equivalent cells.
A fast iterative method for solving the Eikonal equation on triangulated surfaces|
Z. Fu, W.-K. Jeong, Y. Pan, R.M. Kirby, R.T. Whitaker. In SIAM Journal of Scientific Computing, Vol. 33, No. 5, pp. 2468--2488. 2011.
PubMed Central ID: PMC3360588
This paper presents an efficient, fine-grained parallel algorithm for solving the Eikonal equation on triangular meshes. The Eikonal equation, and the broader class of Hamilton–Jacobi equations to which it belongs, have a wide range of applications from geometric optics and seismology to biological modeling and analysis of geometry and images. The ability to solve such equations accurately and efficiently provides new capabilities for exploring and visualizing parameter spaces and for solving inverse problems that rely on such equations in the forward model. Efficient solvers on state-of-the-art, parallel architectures require new algorithms that are not, in many cases, optimal, but are better suited to synchronous updates of the solution. In previous work [W. K. Jeong and R. T. Whitaker, SIAM J. Sci. Comput., 30 (2008), pp. 2512–2534], the authors proposed the fast iterative method (FIM) to efficiently solve the Eikonal equation on regular grids. In this paper we extend the fast iterative method to solve Eikonal equations efficiently on triangulated domains on the CPU and on parallel architectures, including graphics processors. We propose a new local update scheme that provides solutions of first-order accuracy for both architectures. We also propose a novel triangle-based update scheme and its corresponding data structure for efficient irregular data mapping to parallel single-instruction multiple-data (SIMD) processors. We provide detailed descriptions of the implementations on a single CPU, a multicore CPU with shared memory, and SIMD architectures with comparative results against state-of-the-art Eikonal solvers.
Formal Analysis of MPI-Based Parallel Programs: Present and Future|
G. Gopalakrishnan, R.M. Kirby, S. Siegel, R. Thakur, W. Gropp, E. Lusk, B.R. de Supinski, M. Schultz, G. Bronevetsky. In Communications of the ACM, pp. (accepted). 2011.