Martin BerzinsParallel ComputingGPUs |
Mike KirbyFinite Element MethodsUncertainty Quantification GPUs |
Valerio PascucciScientific Data Management |
Chris JohnsonProblem Solving Environments |
Ross WhitakerGPUs |
Chuck HansenGPUs |
Smoothness-Increasing Accuracy-Conserving (SIAC) Filtering for discontinuous Galerkin Solutions: Improved Errors Versus Higher-Order Accuracy J. King, H. Mirzaee, J.K. Ryan, R.M. Kirby. In Journal of Scientific Computing, Vol. 53, pp. 129--149. 2012. DOI: 10.1007/s10915-012-9593-8 Smoothness-increasing accuracy-conserving (SIAC) filtering has demonstrated its effectiveness in raising the convergence rate of discontinuous Galerkin solutions from order k + 1/2 to order 2k + 1 for specific types of translation invariant meshes (Cockburn et al. in Math. Comput. 72:577–606, 2003; Curtis et al. in SIAM J. Sci. Comput. 30(1):272– 289, 2007; Mirzaee et al. in SIAM J. Numer. Anal. 49:1899–1920, 2011). Additionally, it improves the weak continuity in the discontinuous Galerkin method to k - 1 continuity. Typically this improvement has a positive impact on the error quantity in the sense that it also reduces the absolute errors. However, not enough emphasis has been placed on the difference between superconvergent accuracy and improved errors. This distinction is particularly important when it comes to understanding the interplay introduced through meshing, between geometry and filtering. The underlying mesh over which the DG solution is built is important because the tool used in SIAC filtering—convolution—is scaled by the geometric mesh size. This heavily contributes to the effectiveness of the post-processor. In this paper, we present a study of this mesh scaling and how it factors into the theoretical errors. To accomplish the large volume of post-processing necessary for this study, commodity streaming multiprocessors were used; we demonstrate for structured meshes up to a 50× speed up in the computational time over traditional CPU implementations of the SIAC filter. |
Multiscale Modeling of High Explosives for Transportation Accidents J.R. Peterson, J.C. Beckvermit, T. Harman, M. Berzins, C.A. Wight. In Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond, 2012. DOI: 10.1145/2335755.2335828 The development of a reaction model to simulate the accidental detonation of a large array of seismic boosters in a semi-truck subject to fire is considered. To test this model large scale simulations of explosions and detonations were performed by leveraging the massively parallel capabilities of the Uintah Computational Framework and the XSEDE computational resources. Computed stress profiles in bulk-scale explosive materials were validated using compaction simulations of hundred micron scale particles and found to compare favorably with experimental data. A validation study of reaction models for deflagration and detonation showed that computational grid cell sizes up to 10 mm could be used without loss of fidelity. The Uintah Computational Framework shows linear scaling up to 180K cores which combined with coarse resolution and validated models will now enable simulations of semi-truck scale transportation accidents for the first time. |
Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System A. Humphrey, Q. Meng, M. Berzins, T. Harman. In Proceedings of the first conference of the Extreme Science and Engineering Discovery Environment (XSEDE'12), Association for Computing Machinery, 2012. DOI: 10.1145/2335755.2335791 The Uintah Computational Framework was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids on large-scale, long-running, data-intensive problems. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids, together with a novel asynchronous task-based approach with fully automated load balancing. Uintah demonstrates excellent weak and strong scalability at full machine capacity on XSEDE resources such as Ranger and Kraken, and through the use of a hybrid memory approach based on a combination of MPI and Pthreads, Uintah now runs on up to 262k cores on the DOE Jaguar system. In order to extend Uintah to heterogeneous systems, with ever-increasing CPU core counts and additional onnode GPUs, a new dynamic CPU-GPU task scheduler is designed and evaluated in this study. This new scheduler enables Uintah to fully exploit these architectures with support for asynchronous, outof- order scheduling of both CPU and GPU computational tasks. A new runtime system has also been implemented with an added multi-stage queuing architecture for efficient scheduling of CPU and GPU tasks. This new runtime system automatically handles the details of asynchronous memory copies to and from the GPU and introduces a novel method of pre-fetching and preparing GPU memory prior to GPU task execution. In this study this new design is examined in the context of a developing, hierarchical GPUbased ray tracing radiation transport model that provides Uintah with additional capabilities for heat transfer and electromagnetic wave propagation. The capabilities of this new scheduler design are tested by running at large scale on the modern heterogeneous systems, Keeneland and TitanDev, with up to 360 and 960 GPUs respectively. On these systems, we demonstrate significant speedups per GPU against a standard CPU core for our radiation problem. Keywords: Uintah, hybrid parallelism, scalability, parallel, adaptive, GPU, heterogeneous systems, Keeneland, TitanDev |
Biomedical Visual Computing: Case Studies and Challenges C.R. Johnson. In IEEE Computing in Science and Engineering, Vol. 14, No. 1, pp. 12--21. 2012. PubMed ID: 22545005 PubMed Central ID: PMC3336198 Computer simulation and visualization are having a substantial impact on biomedicine and other areas of science and engineering. Advanced simulation and data acquisition techniques allow biomedical researchers to investigate increasingly sophisticated biological function and structure. A continuing trend in all computational science and engineering applications is the increasing size of resulting datasets. This trend is also evident in data acquisition, especially in image acquisition in biology and medical image databases. For example, in a collaboration between neuroscientist Robert Marc and our research team at the University of Utah's Scientific Computing and Imaging (SCI) Institute (www.sci.utah.edu), we're creating datasets of brain electron microscopy (EM) mosaics that are 16 terabytes in size. However, while there's no foreseeable end to the increase in our ability to produce simulation data or record observational data, our ability to use this data in meaningful ways is inhibited by current data analysis capabilities, which already lag far behind. Indeed, as the NIH-NSF Visualization Research Challenges report notes, to effectively understand and make use of the vast amounts of data researchers are producing is one of the greatest scientific challenges of the 21st century. Visual data analysis involves creating images that convey salient information about underlying data and processes, enabling the detection and validation of expected results while leading to unexpected discoveries in science. This allows for the validation of new theoretical models, provides comparison between models and datasets, enables quantitative and qualitative querying, improves interpretation of data, and facilitates decision making. Scientists can use visual data analysis systems to explore \"what if\" scenarios, define hypotheses, and examine data under multiple perspectives and assumptions. In addition, they can identify connections between numerous attributes and quantitatively assess the reliability of hypotheses. In essence, visual data analysis is an integral part of scientific problem solving and discovery. As applied to biomedical systems, visualization plays a crucial role in our ability to comprehend large and complex data-data that, in two, three, or more dimensions, convey insight into many diverse biomedical applications, including understanding neural connectivity within the brain, interpreting bioelectric currents within the heart, characterizing white-matter tracts by diffusion tensor imaging, and understanding morphology differences among different genetic mice phenotypes. Keywords: kaust |
Fast, Effective BVH Updates for Animated Scenes D. Kopta, T. Ize, J. Spjut, E. Brunvand, A. Davis, A. Kensler. In Proceedings of the Symposium on Interactive 3D Graphics and Games (I3D '12), pp. 197--204. 2012. DOI: 10.1145/2159616.2159649 Bounding volume hierarchies (BVHs) are a popular acceleration structure choice for animated scenes rendered with ray tracing. This is due to the relative simplicity of refitting bounding volumes around moving geometry. However, the quality of such a refitted tree can degrade rapidly if objects in the scene deform or rearrange significantly as the animation progresses, resulting in dramatic increases in rendering times and a commensurate reduction in the frame rate. The BVH could be rebuilt on every frame, but this could take significant time. We present a method to efficiently extend refitting for animated scenes with tree rotations, a technique previously proposed for off-line improvement of BVH quality for static scenes. Tree rotations are local restructuring operations which can mitigate the effects that moving primitives have on BVH quality by rearranging nodes in the tree during each refit rather than triggering a full rebuild. The result is a fast, lightweight, incremental update algorithm that requires negligible memory, has minor update times, parallelizes easily, avoids significant degradation in tree quality or the need for rebuilding, and maintains fast rendering times. We show that our method approaches or exceeds the frame rates of other techniques and is consistently among the best options regardless of the animated scene. |
Understanding Quasi-Periodic Fieldlines and Their Topology in Toroidal Magnetic Fields A.R. Sanderson, G. Chen, X. Tricoche, E. Cohen. In Topological Methods in Data Analysis and Visualization II, Edited by R. Peikert and H. Carr and H. Hauser and R. Fuchs, Springer, pp. 125--140. 2012. DOI: 10.1007/478-3-642-23175-9 |
Adaptive High-Order Discontinuous Galerkin Solution of Elastohydrodynamic Lubrication Point Contact Problems H. Lu, M. Berzins, C.E. Goodyer, P.K. Jimack. In Advances in Engineering Software, Vol. 45, No. 1, pp. 313--324. 2012. DOI: 10.1016/j.advengsoft.2011.10.006 This paper describes an adaptive implementation of a high order Discontinuous Galerkin (DG) method for the solution of elastohydrodynamic lubrication (EHL) point contact problems. These problems arise when modelling the thin lubricating film between contacts which are under sufficiently high pressure that the elastic deformation of the contacting elements cannot be neglected. The governing equations are highly nonlinear and include a second order partial differential equation that is derived via the thin-film approximation. Furthermore, the problem features a free boundary, which models where cavitation occurs, and this is automatically captured as part of the solution process. The need for spatial adaptivity stems from the highly variable length scales that are present in typical solutions. Results are presented which demonstrate both the effectiveness and the limitations of the proposed adaptive algorithm. Keywords: Elastohydrodynamic lubrication, Discontinuous Galerkin, High polynomial degree, h-adaptivity, Nonlinear systems |
An optimization framework for inversely estimating myocardial transmembrane potentials and localizing ischemia D. Wang, R.M. Kirby, R.S. Macleod, C.R. Johnson. In Proceedings of the International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), pp. 1680--1683. 2011. DOI: 10.1109/IEMBS.2011.6090483 PubMed ID: 22254648 PubMed Central ID: PMC3336368 By combining a static bidomain heart model with a torso conduction model, we studied the inverse electrocardiographic problem of computing the transmembrane potentials (TMPs) throughout the myocardium from a body-surface potential map, and then used the recovered potentials to localize myocardial ischemia. Our main contribution is solving the inverse problem within a constrained optimization framework, which is a generalization of previous methods for calculating transmembrane potentials. The framework offers ample flexibility for users to apply various physiologically-based constraints, and is well supported by mature algorithms and solvers developed by the optimization community. By avoiding the traditional inverse ECG approach of building the lead-field matrix, the framework greatly reduces computation cost and, by setting the associated forward problem as a constraint, the framework enables one to flexibly set individualized resolutions for each physical variable, a desirable feature for balancing model accuracy, ill-conditioning and computation tractability. Although the task of computing myocardial TMPs at an arbitrary time instance remains an open problem, we showed that it is possible to obtain TMPs with moderate accuracy during the ST segment by assuming all cardiac cells are at the plateau phase. Moreover, the calculated TMPs yielded a good estimate of ischemic regions, which was of more clinical interest than the voltage values themselves. We conducted finite element simulations of a phantom experiment over a 2D torso model with synthetic ischemic data. Preliminary results indicated that our approach is feasible and suitably accurate for the common case of transmural myocardial ischemia. |
Dark Regions of No-Reflow on Late Gadolinium Enhancement Magnetic Resonance Imaging Result in Scar Formation After Atrial Fibrillation Ablation C.J. McGann, E.G. Kholmovski, J.J. Blauer, S. Vijayakumar, T.S. Haslam, J.E. Cates, E.V. DiBella, N.S. Burgon, B. Wilson, A.J. Alexander, M.W. Prastawa, M. Daccarett, G. Vergara, N.W. Akoum, D.L. Parker, R.S. MacLeod, N.F. Marrouche. In Journal of the American College of Cardiology, Vol. 58, No. 2, pp. 177--185. 2011. DOI: 10.1016/j.jacc.2011.04.008 PubMed ID: 21718914 Objectives: The aim of this study was to assess acute ablation injuries seen on late gadolinium enhancement (LGE) magnetic resonance imaging (MRI) immediately post-ablation (IPA) and the association with permanent scar 3 months post-ablation (3moPA). Background: Success rates for atrial fibrillation catheter ablation vary significantly, in part because of limited information about the location, extent, and permanence of ablation injury at the time of procedure. Although the amount of scar on LGE MRI months after ablation correlates with procedure outcomes, early imaging predictors of scar remain elusive. Methods: Thirty-seven patients presenting for atrial fibrillation ablation underwent high-resolution MRI with a 3-dimensional LGE sequence before ablation, IPA, and 3moPA using a 3-T scanner. The acute left atrial wall injuries on IPA scans were categorized as hyperenhancing (HE) or nonenhancing (NE) and compared with scar 3moPA. Results: Heterogeneous injuries with HE and NE regions were identified in all patients. Dark NE regions in the left atrial wall on LGE MRI demonstrate findings similar to the \"no-reflow\" phenomenon. Although the left atrial wall showed similar amounts of HE, NE, and normal tissue IPA (37.7 ± 13\%, 34.3 ± 14\%, and 28.0 ± 11\%, respectively; p = NS), registration of IPA injuries with 3moPA scarring demonstrated that 59.0 ± 19\% of scar resulted from NE tissue, 30.6 ± 15\% from HE tissue, and 10.4 ± 5\% from tissue identified as normal. Paired t-test comparisons were all statistically significant among NE, HE, and normal tissue types (p less than 0.001). Arrhythmia recurrence at 1-year follow-up correlated with the degree of wall enhancement 3moPA (p = 0.02). Conclusion: Radiofrequency ablation results in heterogeneous injury on LGE MRI with both HE and NE wall lesions. The NE lesions demonstrate no-reflow characteristics and reveal a better predictor of final scar at 3 months. Scar correlates with procedure outcomes, further highlighting the importance of early scar prediction. (J Am Coll Cardiol 2011;58:177–85) © 2011 by the American College of Cardiology Foundation |
Efficient Implementation of Smoothness-Increasing Accuracy-Conserving (SIAC) Filters for Discontinuous Galerkin Solutions H. Mirzaee, J.K. Ryan, R.M. Kirby. In Journal of Scientific Computing, pp. (in press). 2011. DOI: 10.1007/s10915-011-9535-x The discontinuous Galerkin (DG) methods provide a high-order extension of the finite volume method in much the same way as high-order or spectral/hp elements extend standard finite elements. However, lack of inter-element continuity is often contrary to the smoothness assumptions upon which many post-processing algorithms such as those used in visualization are based. Smoothness-increasing accuracy-conserving (SIAC) filters were proposed as a means of ameliorating the challenges introduced by the lack of regularity at element interfaces by eliminating the discontinuity between elements in a way that is consistent with the DG methodology; in particular, high-order accuracy is preserved and in many cases increased. The goal of this paper is to explicitly define the steps to efficient computation of this filtering technique as applied to both structured triangular and quadrilateral meshes. Furthermore, as the SIAC filter is a good candidate for parallelization, we provide, for the first time, results that confirm anticipated performance scaling when parallelized on a shared-memory multi-processor machine. |