Designed especially for neurobiologists, FluoRender is an interactive tool for multi-channel fluorescence microscopy data visualization and analysis.

BrainStimulator is a set of networks that are used in SCIRun to perform simulations of brain stimulation such as transcranial direct current stimulation (tDCS) and transcranial magnetic stimulation (TMS).

Developing software tools for science has always been a central vision of the SCI Institute.

Scientific Computing

Numerical simulation of real-world phenomena provides fertile ground for building interdisciplinary relationships. The SCI Institute has a long tradition of building these relationships in a win-win fashion – a win for the theoretical and algorithmic development of numerical modeling and simulation techniques and a win for the discipline-specific science of interest. High-order and adaptive methods, uncertainty quantification, complexity analysis, and parallelization are just some of the topics being investigated by SCI faculty. These areas of computing are being applied to a wide variety of engineering applications ranging from fluid mechanics and solid mechanics to bioelectricity.

Martin Berzins

Parallel Computing
GPUs

Mike Kirby

Finite Element Methods
Uncertainty Quantification
GPUs

Valerio Pascucci

Scientific Data Management

Chris Johnson

Problem Solving Environments

Ross Whitaker

GPUs

Chuck Hansen

GPUs

Amir Arzani

Scientific machine learning
Data-driven fluid flow modeling

Funded Research Projects:

Optimal Approximation Algorithms in High Dimensions

Akil Narayan
The increasing power of modern computational hardware has enabled computer-based simulation of sophisticated mathematical models that resolve important physical phenomena in great detail. With the advent of these computational abilities has come an increased demand to include more complex physical interactions in the models, and thus an increased strain on computational resources. Modern engineering design utilizes such models, and these design problems typically involve (1) numerous tunable parameters that affect reliability, cost, and failure, (2) uncertainty about external influences manifesting as randomness in the model, and (3) epistemic ignorance involving model form uncertainty. In realistic applications, the collection of these effects leads to predictions that depend on a cumulatively high-dimensional parameter. This project focuses on development and deployment of novel, near-optimal experimental design and sampling algorithms for the accurate and efficient simulation of physical models parameterized by high-dimensional inputs. The work of this project involves the application of recently developed approximation theory results in the computational arena, targeted advances that extend theoretical mathematics for computational purposes, and the development and implementation of algorithms for large-scale computations.

The technical aspects of this project are designed to provide feasible computational algorithms and concrete mathematical guarantees for tasks in high-dimensional approximation. The three major core components for the completion of this task involve the design, implementation, and analysis of algorithms that leverage optimality characteristics of (1) random and deterministic experimental and sampling design, (2) computational algorithms for identifying efficient sampling schemes, and (3) strategies and techniques for emerging approximation paradigms such as sparse approximation and dimension reduction. A crosscutting theme is application of these methods to problems of modern interest in scientific computing. This project involves fundamental contributions to the fields of applied approximation theory and computational approximation methods through the development of applications-oriented sampling designs with provable near-optimality. Theoretical investigations of this project connect classical techniques in approximation and linear algebra with emerging algorithms in data reduction and reduced order modeling. The implementation of these algorithms will significantly enhance theoretical understanding and computational feasibility for goal-oriented design, parameter study and reduction, sparse and compressive representations, model verification and calibration, and data-driven simulations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Cyberinfrastructure Center of Excellence Pilot Study

Ewa Deelman, Valerio Pascucci, Anirban Mandal, Jaroslaw Nabrzyski, Robert Ricci
University of Southern California, Los Angeles, CA, United States

NSF's major multi-user research facilities (large facilities) are sophisticated research instruments and platforms - such as large telescopes, interferometers and distributed sensor arrays - that serve diverse scientific disciplines from astronomy and physics to geoscience and biological science. Large facilities are increasingly dependent on advanced cyberinfrastructure (CI) - computing, data and software systems, networking, and associated human capital - to enable broad delivery and analysis of facility-generated data. As a result of these cyberinfrastructure tools, scientists and the public gain new insights into fundamental questions about the structure and history of the universe, the world we live in today, and how our plants and animals may change in the coming decades. The goal of this pilot project is to develop a model for a Cyberinfrastructure Center of Excellence (CI CoE) that facilitates community building and sharing and applies knowledge of best practices and innovative solutions for facility CI.

The pilot project will explore how such a center would facilitate CI improvements for existing facilities and for the design of new facilities that exploit advanced CI architecture designs and leverage established tools and solutions. The pilot project will also catalyze a key function of an eventual CI CoE - to provide a forum for exchange of experience and knowledge among CI experts. The project will also gather best practices for large facilities, with the aim of enhancing individual facility CI efforts in the broader CI context. The discussion forum and planning effort for a future CI CoE will also address training and workforce development by expanding the pool of skilled facility CI experts and forging career paths for CI professionals. The result of this work will be a strategic plan for a CI CoE that will be evaluated and refined through community interactions: workshops and direct engagement with the facilities and the broader CI community.

This project is being supported by the Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering and the Division of Emerging Frontiers in the Directorate for Biological Sciences.

Efficiency and Productivity through Artificial Intelligence

Valerio Pascucci
Efficient cyberinfrastructure (advanced computing, data, software and networking infrastructure) is a critical component of the support that NSF provides for new discoveries in science and engineering. Cyberinfrastructure is complex and traditionally requires years of human hand-tuning to fully achieve maximal performance for scientific users. We propose to introduce Artificial Intelligence (AI) as a way to automatically and quickly optimize the performance and broadest use of recent NSF-supported advanced computing resources. Through this pilot effort our ultimate aim is to enable and accelerate scientific advances in widely diverse fields such as biology, chemistry, oceanography, materials science, climate modeling, and cosmology.

As the research cyberinfrastructure grows rapidly in scale and complexity, it is essential to integrate new technologies based on Machine Learning (ML) and AI to ensure that the investments in new hardware and software components result in proportional improvements in performance and capability. This project will undertake a transformative research activity targeting: (1) scaling ML algorithms to make them easily available to the scientific community; and (2) improving cyberinfrastructure efficiency through AI-based predictive models. This technical work will be complemented and informed by a community engagement effort to jointly catalog the state of the art and identify future challenges and opportunities in enabling a new smart cyberinfrastructure.

Robust and Scalable Multi-Fidelity Algorithms for Model-Based Predictions

Akil Narayan
Modern computational models are complex in nature: accurate predictions of physics require detailed and intensive computational resources. As such, development of accurate scientific models has been the area of research emphasis in recent decades. Today’s scientific models involve large-scale simulation tools, often with many interdependent components, and sometimes requiring days to complete a single simulation. Adding to this complexity is the presence of uncertainty, which is often encoded into models via parameters or random variables. Any direct approach to analyze the impact of parametric variation on such expensive models is infeasible.
One approach to circumvent this limitation is to utilize hierarchies of models, each with differing computational costs and predictive fidelities. Research in the past few years has demonstrated that intelligent allocation of resources across this ensemble of models can produce predictions with much greater accuracy than concentrating all resources in a single model. Such multi-fidelity procedures hold the potential to optimally utilize ensembles of models to make predictions.
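
The idea of spreading a budget across models of differing cost can be illustrated with a two-model control-variate estimator. This is only a minimal sketch, not the project's algorithm: `high_fidelity` and `low_fidelity` are hypothetical stand-ins (an exponential and its second-order Taylor expansion), and the sample counts are arbitrary assumptions chosen for illustration.

```python
import math
import random


def high_fidelity(x):
    # Hypothetical "expensive" model (stand-in for a long-running simulation).
    return math.exp(x)


def low_fidelity(x):
    # Hypothetical "cheap" surrogate: second-order Taylor expansion of exp(x).
    return 1.0 + x + 0.5 * x * x


def mean(values):
    return sum(values) / len(values)


def two_fidelity_estimate(n_high, n_low, rng):
    """Control-variate estimate of E[high_fidelity(X)] for X ~ Uniform(0, 1).

    Uses n_high paired evaluations of both models plus n_low additional
    evaluations of the cheap model only.
    """
    xs_paired = [rng.random() for _ in range(n_high)]
    xs_extra = [rng.random() for _ in range(n_low)]

    hi = [high_fidelity(x) for x in xs_paired]
    lo = [low_fidelity(x) for x in xs_paired]
    lo_all = lo + [low_fidelity(x) for x in xs_extra]

    hi_bar, lo_bar = mean(hi), mean(lo)
    # Control-variate weight: sample covariance divided by sample variance.
    cov = sum((h - hi_bar) * (l - lo_bar) for h, l in zip(hi, lo))
    var = sum((l - lo_bar) ** 2 for l in lo)
    alpha = cov / var

    # Correct the small-sample high-fidelity mean using the much
    # better-resolved low-fidelity mean.
    return hi_bar + alpha * (mean(lo_all) - lo_bar)


rng = random.Random(0)
estimate = two_fidelity_estimate(n_high=20, n_low=20000, rng=rng)
exact = math.e - 1.0  # integral of exp(x) over [0, 1]
```

Here only 20 evaluations of the expensive model are used; the many cheap evaluations reduce the variance of the estimate because the two models are strongly correlated.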

The main components of this proposed project address optimal resource allocation and robust and scalable model reduction, generation, and learning via low-rank multi-fidelity and multilevel procedures. The overall goal is the construction of surrogate models with accuracy guarantees that can be used in design optimization, inference, and general uncertainty quantification scenarios. The tasks associated with this project involve fundamental mathematical and algorithmic advances in low-rank multi-fidelity methods. Error certificates to ensure accuracy will be developed when possible. Kernel learning techniques will be employed to explore problem-dependent low-rank structure and optimize allocation of resources. Algorithmic methods to handle heterogeneous models, data, and parameter spaces will be developed resulting in a comprehensive framework for utilizing low-rank multi-fidelity methods.

The multi-fidelity procedures devised in this project will also aid in developing novel strategies for model comparison, ranking, discrimination, and genesis. Model comparison and ranking will enable development of a comprehensive multi-fidelity pipeline to automatically learn and update model hierarchies and fidelities. Model generation using the simulation data from a multi-fidelity pipeline allows the automated construction of model emulators that can more easily be explored to detect and exploit low-rank structure.

This project will explore usage of low-rank multi-fidelity methods in two main application areas. The first area is in robust design under uncertainty, which requires robust, accurate, and efficient forward model evaluations. The second area of application is in statistical inference, requiring computationally expensive exploration of posterior distributions. This project will demonstrate the utility of low-rank multi-fidelity methods in acceleration of robust design and inferential tasks. Problems addressed by the work in this project include simulations in topology optimization, nonlocal/fractional differential equation models, modeling of multi-physics solar power receivers, and supersonic channel flow.

UINTAH + HEDGEHOG -- Hybrid Task Graph Execution Library Development for Generalized Work Loads

Martin Berzins
The Overall Objective is to develop a new Uintah runtime environment that demonstrates a flexible approach for accommodating different task execution and state management strategies consistent with a starting point:

1. Uintah uses an asynchronous many-task (AMT) approach that has been shown to scale both strongly and weakly to 256K cores with 16K GPUs on Titan and to 768K cores on Mira, through its asynchronous, adaptive, over-decomposition-based runtime scheduler. This scheduler works on many diverse architectures, from many DOE and NSF leadership-class machines to China's Sunway TaihuLight. In addition, this AMT approach, when combined with mesh coarsening, allows for an efficient approach to resilience.

2. HTGS/Hedgehog is a high-performance, single-node, multi-CPU/GPU task-based system developed at NIST. Internal state management and execution strategies at the level of a single node are maintained within an explicit task graph representation. HTGS/Hedgehog has produced competitive results on a single node.

3. Demonstrating that the integration of two different task execution paradigms and the sharing of both local and global state can occur with minimal changes to either library.

The objective is to integrate the HTGS/Hedgehog Task Graph library into the Uintah Runtime. This new runtime would combine the global state management and multi-node execution characteristics of Uintah with the local single-node execution facilities of HTGS/Hedgehog. This work would demonstrate how state can be managed across these two libraries. While the two libraries share many commonalities and architectural similarities, they are distinct in their underlying implementations. Understanding and developing a robust mechanism for sharing global and local state between the two libraries, along with integrating the overall resource management strategies and task execution for multiple CPU/GPU architectures, is the focus of this work.
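
For readers unfamiliar with explicit task graphs, the following is a minimal sketch of dependency-driven execution on a single node. It does not use the Uintah or HTGS/Hedgehog APIs; `run_task_graph` and the example task names are hypothetical, and the executor runs tasks in waves rather than fully asynchronously as a real AMT runtime would.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task_graph(tasks, deps, max_workers=4):
    """Run callables in dependency order, in parallel where possible.

    tasks: dict name -> callable taking a dict of dependency results
    deps:  dict name -> list of task names that must finish first
    """
    results = {}
    remaining = set(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # A task is ready once all of its dependencies have results.
            ready = [t for t in remaining
                     if all(d in results for d in deps.get(t, []))]
            if not ready:
                raise ValueError("dependency cycle or missing task")
            futures = {
                t: pool.submit(tasks[t],
                               {d: results[d] for d in deps.get(t, [])})
                for t in ready
            }
            # Wait for the whole wave before looking for newly ready tasks.
            for name, fut in futures.items():
                results[name] = fut.result()
            remaining -= set(ready)
    return results


# Hypothetical four-task graph: one producer, two independent consumers,
# and a final reduction that needs both consumers.
tasks = {
    "load": lambda r: [1, 2, 3, 4],
    "square": lambda r: [v * v for v in r["load"]],
    "double": lambda r: [2 * v for v in r["load"]],
    "sum": lambda r: sum(r["square"]) + sum(r["double"]),
}
deps = {"square": ["load"], "double": ["load"], "sum": ["square", "double"]}
results = run_task_graph(tasks, deps)
```

In this sketch the `results` dictionary plays the role of shared state: each task sees only the outputs of the tasks it declared as dependencies.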

The objectives will be carried out by first conducting feasibility studies between two different applications (a 3D structured grid application and an imaging analysis application), followed by the prototype implementation of a new Uintah Scheduler that integrates the HTGS/Hedgehog library at the nodal level. The two different applications will be used to demonstrate scalability and performance on both single-node and multi-node systems. Finally, the proof-of-concept prototype Uintah Scheduler implementation will be transformed into a production-level system in the third year of this effort.

Portable Applications Driven Approach to Scalability on Frontera and Future Exascale Systems

Martin Berzins
The present uncertainty in computer architectures requires software design that allows application codes both to scale across 20K to 100K nodes and to run portably on a range of possible nodal architectures involving a variety of processor technologies, ranging from x86, ARM, and GPUs to possibly FPGAs. At the same time, it is important to use challenging applications to validate the software solutions and to ensure that they are realistic. This project, led by Professor Martin Berzins, will use the Frontera system to help address and demonstrate portability for an important class of engineering applications using the Uintah software.

Uintah software employs an asynchronous many-task approach that has proved to be exceptionally robust at enabling complex engineering applications to run at scale on a broad range of architectures. New and different architectures require not only the ability to execute tasks asynchronously but also the ability to deal with memory hierarchies and to execute efficiently on architectures ranging from x86 to GPUs and a broad range of other possibilities. Uintah uses an approach based upon the Kokkos portability library that makes it possible to build a simple, clean loop-level interface that enables the loops themselves to execute efficiently on different architectures.

The work program will first port and evaluate existing Uintah architectures to Frontera and then consider new applications that apply the Uintah methodology to areas such as unstructured mesh calculations and particle methods applied to biomedical problems. The work program described here covers the application of these ideas to Frontera. The main effort will be through other funded projects, but any funding variability will be accommodated through an adaptive approach to the applications space.
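
The notion of a loop-level interface can be sketched as follows: the loop body is written once and a backend argument decides how it is executed. This is an illustration of the idea only, not the Kokkos or Uintah API; the `parallel_for` function, its `backend` names, and the `axpy` example are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(n, body, backend="serial", workers=4):
    """Apply body(i) for i in range(n) using the chosen backend."""
    if backend == "serial":
        for i in range(n):
            body(i)
    elif backend == "threads":
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # list() forces completion and re-raises any worker exception.
            list(pool.map(body, range(n)))
    else:
        raise ValueError(f"unknown backend: {backend}")


def axpy(a, x, y, backend):
    # The loop body is written once; only the backend argument changes.
    out = [0.0] * len(x)

    def body(i):
        # Each index writes a distinct slot, so iterations are independent.
        out[i] = a * x[i] + y[i]

    parallel_for(len(x), body, backend=backend)
    return out


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
serial_result = axpy(2.0, x, y, "serial")
threaded_result = axpy(2.0, x, y, "threads")
```

Both calls produce the same values; the application code in `axpy` does not change when the execution backend does, which is the property a portability layer aims to provide.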

Collaborative Research: Detecting and Preventing Covid-19 with Privacy-Preserving Decentralized Machine Learning

Bao Wang
We are facing scientific challenges caused by COVID-19, including detecting COVID-19 accurately and preventing its spread efficiently. Cutting-edge machine learning technologies, especially modern deep learning techniques, provide feasible avenues to resolve these challenges. Deep learning-based computational imaging algorithms facilitate accurate and rapid COVID-19 diagnosis; sequential modeling with recurrent neural networks or transformers enables accurate and real-time COVID-19 spread prediction. However, most existing black-box deep learning research on COVID-19 is the alchemy of turning unstructured data into gold and is based on systematic trial and error. The current deep learning-based COVID-19 research raises many trustworthiness issues, including unreliable diagnosis, sacrificed data privacy, and lack of interpretability. The lack of interpretable and reliable predictions makes it substantially harder for practitioners to leverage deep learning approaches to detect and prevent COVID-19. Data privacy constraints bring additional challenges. Thus, developing trustworthy machine learning algorithms while preserving data privacy is crucial to detecting and preventing COVID-19.

We are a team of researchers with different expertise and common research interests, who jointly seek to develop theoretically principled decentralized machine learning algorithms that can provide reliable predictions. Furthermore, we focus on applying these machine learning algorithms to accurately and rapidly diagnose COVID-19 patients and predict the virus spread. We propose a challenging but feasible path towards developing privacy-preserving machine learning algorithms to detect and prevent COVID-19. We will integrate our expertise synergistically to develop privacy-preserving decentralized machine learning algorithms with performance guarantees and a high-throughput and low-latency software package to accurately and rapidly detect COVID-19 and effectively prevent its spread. As such, we propose three interconnected thrusts to develop novel neural network architectures based on mathematical principles, efficient privacy-preserving decentralized optimization algorithms, algorithms for spatiotemporal data forecasting and medical image processing and analysis, and an integrated software package to assist in fighting against the coronavirus. Each thrust contains multiple theoretical explorations and numerical validation.

Intellectual Merit:
The proposal's intellectual merit includes: (i) development of robust and mathematically principled recurrent neural networks for accurate real-time spatiotemporal forecasting, (ii) development of novel efficient federated and decentralized machine learning algorithms with a performance guarantee, (iii) leveraging the theory of stochastic differential equations to develop new privacy-preserving machine learning mechanisms, (iv) construction of new recurrent neural networks, principled by epidemiology models, with accurate and interpretable predictions, and (v) development of trustworthy deep learning-based frameworks for COVID-19 diagnosis from multi-modal medical measurements.

Broader Impacts:
The broader impacts of this project are in applying the proposed algorithms and their analysis over a wide range of science and engineering disciplines, such as scientific and medical image analysis, epidemic forecasting, patient monitoring, and microscopic imaging. The project will train a diverse body of graduate and undergraduate students at Michigan State University, the University of Kentucky, and the University of Utah through collaborative education and research activities in applied mathematics, statistics, computer science, data science, physics, and social science. The project also plans research activities involving under-represented students at the three universities, which are located in three states. Besides the interdisciplinary collaboration across institutions, we also aim to establish industrial partnerships to extend the proposed project's impact. The developed software will be shared with the general public through GitHub.

Sub-Pilot-Scale Production of High-Value Products for U.S. Coals

Chris Johnson
The primary objectives of this project are to: 1) provide sub-pilot scale verification of lab-scale developments on the production of isotropic and mesophase coal-tar pitch (CTP) for carbon fiber production, using coals from five U.S. coal-producing regions (UT, WY, WV, AK, IL); 2) investigate the production of a high-value β-SiC byproduct using residual coal char from the tar production process; and 3) develop an extensive database and suite of tools for data analysis and economic modeling, to relate process conditions to product quality and to assess the economic viability of coals from different regions for producing specific high-value products.

The University of Utah will use a 0.5 ton/day rotary reactor to pyrolyze coals to produce tars suitable for upgrading to coal tar pitch. The same reactor technology will be used in a second stage to perform the tar upgrading to either mesophase or isotropic pitch, depending on the properties of the original coal. The University of Wyoming will spin the product pitch into carbon fiber, to assess fiber quality arising from different coals and from different processing conditions. The solid char byproduct from coal pyrolysis will be used by the University of Wyoming to produce β-SiC. The University of Utah will work with Marshall University to develop a novel database, coupled with detailed economic models and analysis tools, to provide a means for understanding correlations between coal properties, process conditions and product quality, to allow assessment of the potential economic viability of coals from different regions for producing specific high-value products. Access to some of these computational tools will become available to the public through a web-based community portal.

This effort is a major step towards providing a low-cost carbon fiber product from coal for potential use in automotive and other important markets, and will also lead to new economic development opportunities for communities with coal-based economies.

Experimental Characterization and Modeling of Failure in Post-Buckled Composite Stiffened Panels with a Scarf Repair

Alliance for Multiscale Modeling of Electronic Materials for an Energy Efficient Army

Mike Kirby
The objective of this Alliance is to conduct fundamental research to create MSME to support development of future electronic materials and devices for the Army. The U.S. Army Research Laboratory (ARL) envisions the MultiScale multidisciplinary Modeling of Electronic materials (MSME) Collaborative Research Alliance (CRA) which will bring together government, industrial, and academic institutions to undertake the fundamental research necessary to enable the quantitative understanding of electronic materials from the smallest to the largest relevant scales.

Augmented Design Through Analysis and Visualization Facilitating Better Designs and Enhanced Designers

In Situ Feature Extraction and Visualization from Discontinuous Galerkin Based High-Order Methods

Mike Kirby
The use of simulation science as a means of scientific inquiry is increasing at a tremendous rate. The process of mathematically modeling physical phenomena, estimating key modeling parameters, numerically approximating the solution, and computationally solving the resulting algorithm has inundated the scientific and engineering worlds, allowing for rapid advances in our understanding and utilization of the world around us. The efficacy of simulation science has been, in part, due to two critical components: (1) the identification and minimization of the error budget (e.g. modeling, discretization and uncertainty errors), and equally importantly, (2) evaluation mechanisms (such as visualization) by which the investigator assimilates the data produced through simulation. The latter allows for further refinement of the simulation science process (through model correction, increased numerical resolution, or algorithm debugging, etc.) and makes possible scientific statements about the physical phenomena being investigated.

Tremendous effort has been exerted over many decades in the pursuit of numerical methods that are both flexible and accurate, hence providing sufficient fidelity to be employed in the numerical solution of a large number of models, and sufficient analysis of accuracy to allow researchers to focus their attention on model refinement and uncertainty quantification. High-order finite element methods (also known as spectral/hp element methods), using either the continuous Galerkin or discontinuous Galerkin formulation, have reached a level of sophistication that allows them to be commonly applied to a diverse set of real-life engineering problems in computational solid mechanics, fluid dynamics, acoustics and electromagnetics. Many of the physical problems of interest are, unfortunately, not steady-state, leading to simulations that must run for a long time (days, weeks and in some cases months). Thus, in the absence of creative solutions, datasets can easily consume all available storage and networking resources. Examples of such simulations within fluid dynamics include all simulations in which the fluid is in transition or fully turbulent. With regard to ARO interests, problems in turbo-machinery and rotorcraft, where aspects of the geometry are rotating and/or sliding past one another, fall into this category. High-order finite element methods are now beginning to be used to simulate these physical systems due to their inherent ability to capture complex structures (such as vortices) with little numerical dissipation and dispersion. The transient nature of these simulations complicates the data handling (post-processing requires the time history) and renders single snapshots of the solution insufficient to understand the time-varying nature of the physics.

Objective
Our research objectives are two-fold: (1) We will generate "high-order FEM" appropriate dimensionality reduction feature extraction methods such as vortex cores which can be accomplished as part of an in situ data processing pipeline. (2) Given the exploratory nature inherent in analyzing and visualizing transient phenomena, we may specify regions of interest in an in situ fashion within a simulation field based upon the visualization objective, extract and transmit the result of working on relevant high-order FEM information to our visualization system, and then reconstruct the visualization features of interest with the cognizance of V&V.

Publications in Scientific Computing:

Large Scale Parallel Solution of Incompressible Flow Problems using Uintah and hypre
J. Schmidt, M. Berzins, J. Thornock, T. Saad, J. Sutherland. In 2013 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 458--465. 2013.

The Uintah Software framework was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids on large-scale, long-running, data-intensive problems. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids together with a novel asynchronous task-based approach with fully automated load balancing. As Uintah is often used to solve incompressible flow problems in combustion applications, it is important to have a scalable linear solver. While there are many such solvers available, the scalability of those codes varies greatly. The hypre software offers a range of solvers and preconditioners for different types of grids. The weak scalability of Uintah and hypre is addressed for particular examples of both packages when applied to a number of incompressible flow problems. After careful software engineering to reduce startup costs, much better than expected weak scalability is seen for up to 100K cores on NSF's Kraken architecture and up to 260K CPU cores on DOE's new Titan machine. The scalability is found to depend in a critical way on the choice of algorithm used by hypre for a realistic application problem.

Keywords: Uintah, hypre, parallelism, scalability, linear equations
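
As a reminder of what weak scalability measures, the sketch below computes weak-scaling efficiency as the ratio of the baseline run time to the run time at a larger core count, with the problem size per core held fixed. The helper name and the timings are hypothetical illustrations and are not measurements from the paper.

```python
def weak_scaling_efficiency(timings):
    """Return {cores: efficiency} relative to the smallest core count.

    timings: dict mapping core count -> wall-clock time, where the problem
    size per core is held fixed (weak scaling). Perfect scaling gives 1.0.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    return {cores: base_time / t for cores, t in sorted(timings.items())}


# Hypothetical timings in seconds, NOT measurements from the paper.
example = {1024: 10.0, 8192: 10.5, 65536: 12.5}
eff = weak_scaling_efficiency(example)
```

An efficiency close to 1.0 at the largest core count indicates that run time stays nearly constant as the machine and the problem grow together.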

Uncertainty Visualization in Forward and Inverse Cardiac Models
B. Burton, B. Erem, K. Potter, P. Rosen, C.R. Johnson, D. Brooks, R.S. Macleod. In Computing in Cardiology CinC, pp. 57--60. 2013.
ISSN: 2325-8861

Quantification and visualization of uncertainty in cardiac forward and inverse problems with complex geometries is subject to various challenges. Specific to visualization is the observation that occlusion and clutter obscure important regions of interest, making visual assessment difficult. In order to overcome these limitations in uncertainty visualization, we have developed and implemented a collection of novel approaches. To highlight the utility of these techniques, we evaluated the uncertainty associated with two examples of modeling myocardial activity. In one case we studied cardiac potentials during the repolarization phase as a function of variability in tissue conductivities of the ischemic heart (forward case). In a second case, we evaluated uncertainty in reconstructed activation times on the epicardium resulting from variation in the control parameter of Tikhonov regularization (inverse case). To overcome difficulties associated with uncertainty visualization, we implemented linked-view windows and interactive animation to the two respective cases. Through dimensionality reduction and superimposed mean and standard deviation measures over time, we were able to display key features in large ensembles of data and highlight regions of interest where larger uncertainties exist.
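
The inverse case above varies the control parameter of Tikhonov regularization. A minimal sketch of that idea follows, assuming the standard form min ||Ax - b||^2 + lam^2 ||x||^2 solved through the normal equations; the tiny matrix, the noisy data, and the set of `lam` values are hypothetical and are unrelated to the cardiac models in the paper.

```python
def tikhonov_solve(A, b, lam):
    """Solve min ||A x - b||^2 + lam^2 ||x||^2 via the normal equations.

    A is a list of rows and b a list; returns x as a list. Intended for
    tiny dense problems only.
    """
    m, n = len(A), len(A[0])
    # Build M = A^T A + lam^2 I and rhs = A^T b.
    M = [[sum(A[k][i] * A[k][j] for k in range(m))
          + (lam * lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= f * M[col][c]
            rhs[r] -= f * rhs[col]

    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - s) / M[i][i]
    return x


# Hypothetical ill-conditioned system with slightly noisy data.
A = [[1.0, 1.0], [1.0, 1.0001]]
b = [2.0, 2.001]
lams = [0.01, 0.1, 1.0]
ensemble = [tikhonov_solve(A, b, lam) for lam in lams]
# Per-component spread across the ensemble: one crude uncertainty measure.
spread = [max(col) - min(col) for col in zip(*ensemble)]
```

Each choice of `lam` yields a different reconstruction, so sweeping it produces an ensemble whose spread is the kind of quantity an uncertainty visualization would then display.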

Specimen-specific predictions of contact stress under physiological loading in the human hip: validation and sensitivity studies
C.R. Henak, A.K. Kapron, B.J. Ellis, S.A. Maas, A.E. Anderson, J.A. Weiss. In Biomechanics and Modeling in Mechanobiology, pp. 1-14. 2013.
DOI: 10.1007/s10237-013-0504-1

Hip osteoarthritis may be initiated and advanced by abnormal cartilage contact mechanics, and finite element (FE) modeling provides an approach with the potential to allow the study of this process. Previous FE models of the human hip have been limited by single specimen validation and the use of quasi-linear or linear elastic constitutive models of articular cartilage. The effects of the latter assumptions on model predictions are unknown, partially because data for the instantaneous behavior of healthy human hip cartilage are unavailable. The aims of this study were to develop and validate a series of specimen-specific FE models, to characterize the regional instantaneous response of healthy human hip cartilage in compression, and to assess the effects of material nonlinearity, inhomogeneity and specimen-specific material coefficients on FE predictions of cartilage contact stress and contact area. Five cadaveric specimens underwent experimental loading, cartilage material characterization and specimen-specific FE modeling. Cartilage in the FE models was represented by average neo-Hookean, average Veronda Westmann and specimen- and region-specific Veronda Westmann hyperelastic constitutive models. Experimental measurements and FE predictions compared well for all three cartilage representations, which was reflected in average RMS errors in contact stress of less than 25 %. The instantaneous material behavior of healthy human hip cartilage varied spatially, with stiffer acetabular cartilage than femoral cartilage and stiffer cartilage in lateral regions than in medial regions. The Veronda Westmann constitutive model with average material coefficients accurately predicted peak contact stress, average contact stress, contact area and contact patterns. The use of subject- and region-specific material coefficients did not increase the accuracy of FE model predictions. The neo-Hookean constitutive model underpredicted peak contact stress in areas of high stress. 
The results of this study support the use of average cartilage material coefficients in predictions of cartilage contact stress and contact area in the normal hip. The regional characterization of cartilage material behavior provides the necessary inputs for future computational studies, to investigate other mechanical parameters that may be correlated with OA and cartilage damage in the human hip. In the future, the results of this study can be applied to subject-specific models to better understand how abnormal hip contact stress and contact area contribute to OA.

Relationship of the intercondylar roof and the tibial footprint of the ACL: implications for ACL reconstruction
P.T. Scheffel, H.B. Henninger, R.T. Burks. In American Journal of Sports Medicine, Vol. 41, No. 2, pp. 396--401. 2013.
DOI: 10.1177/0363546512467955

Background: Debate exists on the proper relation of the anterior cruciate ligament (ACL) footprint with the intercondylar notch in anatomic ACL reconstructions. Patient-specific graft placement based on the inclination of the intercondylar roof has been proposed. The relationship between the intercondylar roof and native ACL footprint on the tibia has not previously been quantified.

Hypothesis: No statistical relationship exists between the intercondylar roof angle and the location of the native footprint of the ACL on the tibia.

Study Design: Case series; Level of evidence, 4.

Methods: Knees from 138 patients with both lateral radiographs and MRI, without a history of ligamentous injury or fracture, were reviewed to measure the intercondylar roof angle of the femur. Roof angles were measured on lateral radiographs. The MRI data of the same knees were analyzed to measure the position of the central tibial footprint of the ACL (cACL). The roof angle and tibial footprint were evaluated to determine if statistical relationships existed.

Results: Patients had a mean ± SD age of 40 ± 16 years. Average roof angle was 34.7° ± 5.2° (range, 23°-48°; 95% CI, 33.9°-35.5°), and it differed by sex but not by side (right/left). The cACL was 44.1% ± 3.4% (range, 36.1%-51.9%; 95% CI, 43.2%-45.0%) of the anteroposterior length of the tibia. There was only a weak correlation between the intercondylar roof angle and the cACL (R = 0.106). No significant differences arose between subpopulations of sex or side.

Conclusion: The tibial footprint of the ACL is located in a position on the tibia that is consistent and does not vary according to intercondylar roof angle. The cACL is consistently located between 43.2% and 45.0% of the anteroposterior length of the tibia. Intercondylar roof–based guidance may not predictably place a tibial tunnel in the native ACL footprint. Use of a generic ACL footprint to place a tibial tunnel during ACL reconstruction may be reliable in up to 95% of patients.

Evaluation of a post-processing approach for multiscale analysis of biphasic mechanics of chondrocytes
S.C. Sibole, S.A. Maas, J.P. Halloran, J.A. Weiss, A. Erdemir. In Computer Methods in Biomechanical and Biomedical Engineering, Vol. 16, No. 10, pp. 1112--1126. 2013.
DOI: 10.1080/10255842.2013.809711
PubMed ID: 23809004

Understanding the mechanical behaviour of chondrocytes as a result of cartilage tissue mechanics has significant implications both for evaluating mechanobiological function and for elaborating on damage mechanisms. A common procedure for prediction of chondrocyte mechanics (and of cell mechanics in general) relies on a computational post-processing approach where tissue-level deformations drive cell-level models. Potential loss of information in this numerical coupling approach may cause erroneous cellular-scale results, particularly during multiphysics analysis of cartilage. The goal of this study was to evaluate the capacity of first- and second-order data passing to predict chondrocyte mechanics by analysing cartilage deformations obtained for loading scenarios of varying complexity. A tissue-scale model with a sub-region incorporating representation of chondron size and distribution served as control. The post-processing approach first required solution of a homogeneous tissue-level model, results of which were used to drive a separate cell-level model (same characteristics as the sub-region of control model). The first-order data passing appeared to be adequate for simplified loading of the cartilage and for a subset of cell deformation metrics, for example, change in aspect ratio. The second-order data passing scheme was more accurate, particularly when asymmetric permeability of the tissue boundaries was considered. Yet, the method exhibited limitations for predictions of instantaneous metrics related to the fluid phase, for example, mass exchange rate. Nonetheless, employing higher order data exchange schemes may be necessary to understand the biphasic mechanics of cells under lifelike tissue loading states for the whole time history of the simulation.

Evaluation of Current Algorithms for Segmentation of Scar Tissue from Late Gadolinium Enhancement Cardiovascular Magnetic Resonance of the Left Atrium: An Open-Access Grand Challenge
R. Karim, R.J. Housden, M. Balasubramaniam, Z. Chen, D. Perry, A. Uddin, Y. Al-Beyatti, E. Palkhi, P. Acheampong, S. Obom, A. Hennemuth, Y. Lu, W. Bai, W. Shi, Y. Gao, H.-O. Peitgen, P. Radau, R. Razavi, A. Tannenbaum, D. Rueckert, J. Cates, T. Schaeffter, D. Peters, R.S. MacLeod, K. Rhode. In Journal of Cardiovascular Magnetic Resonance, Vol. 15, No. 105, 2013.
DOI: 10.1186/1532-429X-15-105

Background: Late Gadolinium enhancement (LGE) cardiovascular magnetic resonance (CMR) imaging can be used to visualise regions of fibrosis and scarring in the left atrium (LA) myocardium. This can be important for treatment stratification of patients with atrial fibrillation (AF) and for assessment of treatment after radio frequency catheter ablation (RFCA). In this paper we present a standardised evaluation benchmarking framework for algorithms segmenting fibrosis and scar from LGE CMR images. The algorithms reported are the response to an open challenge that was put to the medical imaging community through an ISBI (IEEE International Symposium on Biomedical Imaging) workshop.

Methods: The image database consisted of 60 multicenter, multivendor LGE CMR image datasets from patients with AF, with 30 images taken before and 30 after RFCA for the treatment of AF. A reference standard for scar and fibrosis was established by merging manual segmentations from three observers. Furthermore, scar was also quantified using 2, 3 and 4 standard deviations (SD) and full-width-at-half-maximum (FWHM) methods. Seven institutions responded to the challenge: Imperial College (IC), Mevis Fraunhofer (MV), Sunnybrook Health Sciences (SY), Harvard/Boston University (HB), Yale School of Medicine (YL), King’s College London (KCL) and Utah CARMA (UTA, UTB). There were 8 different algorithms evaluated in this study.

Results: Some algorithms were able to perform significantly better than SD and FWHM methods in both pre- and post-ablation imaging. Segmentation in pre-ablation images was challenging and good correlation with the reference standard was found in post-ablation images. Overlap scores (out of 100) with the reference standard were as follows: Pre: IC = 37, MV = 22, SY = 17, YL = 48, KCL = 30, UTA = 42, UTB = 45; Post: IC = 76, MV = 85, SY = 73, HB = 76, YL = 84, KCL = 78, UTA = 78, UTB = 72.

Conclusions: The study concludes that currently no algorithm is deemed clearly better than others. There is scope for further algorithmic developments in LA fibrosis and scar quantification from LGE CMR images. Benchmarking of future scar segmentation algorithms is thus important. The proposed benchmarking framework is made available as open-source and new participants can evaluate their algorithms via a web-based interface.

Keywords: Late gadolinium enhancement, Cardiovascular magnetic resonance, Atrial fibrillation, Segmentation, Algorithm benchmarking
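
The fixed-threshold baselines and the overlap scoring described in this abstract can be illustrated with a minimal sketch. It assumes that an n-SD method labels a voxel as scar when its intensity exceeds the mean of a reference region by n standard deviations, and that overlap is measured with a Dice coefficient scaled to 0-100; the abstract does not state the exact metric, and the function names and toy intensities below are hypothetical.

```python
from statistics import mean, pstdev


def sd_threshold_mask(intensities, reference, n_sd):
    """Label a voxel as scar when it exceeds mean + n_sd * SD of a reference region."""
    cutoff = mean(reference) + n_sd * pstdev(reference)
    return [value > cutoff for value in intensities]


def dice_overlap(mask_a, mask_b):
    """Dice coefficient scaled to 0-100; two empty masks count as perfect agreement."""
    size_a = sum(mask_a)
    size_b = sum(mask_b)
    if size_a + size_b == 0:
        return 100.0
    both = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 100.0 * 2 * both / (size_a + size_b)


# Hypothetical toy data, not taken from the challenge images.
healthy_reference = [10, 12, 11, 9, 10, 12]
voxels = [10, 11, 30, 28, 12, 13]
manual_mask = [False, False, True, True, False, True]

auto_mask = sd_threshold_mask(voxels, healthy_reference, n_sd=3)
score = dice_overlap(auto_mask, manual_mask)
print(auto_mask, score)
```

In this toy case the threshold misses the faintly enhanced last voxel that the manual mask includes, which lowers the overlap score.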

Investigating Applications Portability with the Uintah DAG-based Runtime System on PetaScale Supercomputers
Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. In Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 96:1--96:12. 2013.
ISBN: 978-1-4503-2378-9
DOI: 10.1145/2503210.2503250

Present trends in high performance computing present formidable challenges for applications code using multicore nodes possibly with accelerators and/or co-processors and reduced memory while still attaining scalability. Software frameworks that execute machine-independent applications code using a runtime system that shields users from architectural complexities offer a possible solution. The Uintah framework, for example, solves a broad class of large-scale problems on structured adaptive grids using fluid-flow solvers coupled with particle-based solids methods. Uintah executes directed acyclic graphs of computational tasks with a scalable asynchronous and dynamic runtime system for CPU cores and/or accelerators/co-processors on a node. Uintah's clear separation between application and runtime code has led to scalability increases of 1000x without significant changes to application code. This methodology is tested on three leading Top500 machines (OLCF Titan, TACC Stampede and ALCF Mira) using three diverse and challenging applications problems. This investigation of scalability with regard to the different processors and communications performance leads to the overall conclusion that the adaptive DAG-based approach provides a very powerful abstraction for solving challenging multi-scale multi-physics engineering problems on some of the largest and most powerful computers available today.

Keywords: Blue Gene/Q, GPU, Xeon Phi, adaptive, application, co-processor, heterogeneous systems, hybrid parallelism, parallel, scalability, software, uintah, NETL
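
To make the idea of executing a directed acyclic graph of tasks concrete, the following is a minimal single-process sketch using Python's standard `graphlib` module (Python 3.9 or later). It is not Uintah's scheduler or API: the task names are hypothetical, and a real runtime would dispatch each batch of ready tasks asynchronously to cores or accelerators rather than run them in a loop.

```python
from graphlib import TopologicalSorter


def run_task_graph(dependencies, actions):
    """Run each task once all of its prerequisites have finished.

    dependencies maps a task name to the set of tasks it waits on.
    actions maps a task name to a zero-argument callable.
    Returns the order in which tasks were executed.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    order = []
    while sorter.is_active():
        # Tasks returned together do not depend on one another, so a real
        # runtime could hand them to different cores or accelerators.
        for task in sorted(sorter.get_ready()):
            actions[task]()
            order.append(task)
            sorter.done(task)
    return order


# Hypothetical task names for one timestep of a coupled simulation.
log = []
dependencies = {
    "advect": set(),
    "interpolate": set(),
    "exchange": {"advect"},
    "update": {"interpolate", "exchange"},
}
actions = {name: (lambda n=name: log.append(n)) for name in dependencies}

order = run_task_graph(dependencies, actions)
print(order)
```

Because the application only declares tasks and their dependencies, the scheduling loop can be replaced without touching the task code, which is the separation between application and runtime that the abstract emphasizes.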

Preliminary Experiences with the Uintah Framework on Intel Xeon Phi and Stampede
Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. In Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery (XSEDE 2013), San Diego, California, pp. 48:1--48:8. 2013.
DOI: 10.1145/2484762.2484779

In this work, we describe our preliminary experiences on the Stampede system in the context of the Uintah Computational Framework. Uintah was developed to provide an environment for solving a broad class of fluid-structure interaction problems on structured adaptive grids. Uintah uses a combination of fluid-flow solvers and particle-based methods, together with a novel asynchronous task-based approach and fully automated load balancing. While we have designed scalable Uintah runtime systems for large CPU core counts, the emergence of heterogeneous systems presents considerable challenges in terms of effectively utilizing additional on-node accelerators and co-processors, deep memory hierarchies, as well as managing multiple levels of parallelism. Our recent work has addressed the emergence of heterogeneous CPU/GPU systems with the design of a Unified heterogeneous runtime system, enabling Uintah to fully exploit these architectures with support for asynchronous, out-of-order scheduling of both CPU and GPU computational tasks. Using this design, Uintah has run at full scale on the Keeneland System and TitanDev. With the release of the Intel Xeon Phi co-processor and the recent availability of the Stampede system, we show that Uintah may be modified to utilize such a co-processor based system. We also explore the different usage models provided by the Xeon Phi with the aim of understanding portability of a general purpose framework like Uintah to this architecture. These usage models range from the pragma based offload model to the more complex symmetric model, utilizing all co-processor and host CPU cores simultaneously. We provide preliminary results of the various usage models for a challenging adaptive mesh refinement problem, as well as a detailed account of our experience adapting Uintah to run on the Stampede system. 
Our conclusion is that while the Stampede system is easy to use, obtaining high performance from the Xeon Phi co-processors requires a substantial but different investment to that needed for GPU-based systems.

Keywords: MIC, Xeon Phi, adaptive, co-processor, heterogeneous systems, hybrid parallelism, parallel, scalability, stampede, uintah, c-safe

Multiscale Modeling of Accidental Explosions and Detonations
J. Beckvermit, J. Peterson, T. Harman, S. Bardenhagen, C. Wight, Q. Meng, M. Berzins. In Computing in Science and Engineering, Vol. 15, No. 4, pp. 76--86. 2013.
DOI: 10.1109/MCSE.2013.89

Accidental explosions are exceptionally dangerous and costly, both in lives and money. Regarding world-wide conflict with small arms and light weapons, the Small Arms Survey has recorded over 297 accidental explosions in munitions depots across the world that have resulted in thousands of deaths and billions of dollars in damage in the past decade alone [45]. As the recent fertilizer plant explosion that killed 15 people in West, Texas demonstrates, accidental explosions are not limited to military operations. Transportation accidents also pose risks, as illustrated by the occasional train derailment/explosion in the nightly news, or the semi-truck explosion detailed in the following section. Unlike other industrial accident scenarios, explosions can easily affect the general public, a dramatic example being the PEPCON disaster in 1988, where windows were shattered, doors blown off their hinges, and flying glass and debris caused injuries up to 10 miles away.

While the relative rarity of accidental explosions speaks well of our understanding to date, their violence rightly gives us pause. A better understanding of these materials is clearly still needed, but a significant barrier is the complexity of these materials and the various length scales involved. In typical military applications, explosives are known to be ignited by the coalescence of hot spots which occur on micrometer scales. Whether this reaction remains a deflagration (burning) or builds to a detonation depends both on the stimulus and the boundary conditions or level of confinement. Boundary conditions are typically on the scale of engineered parts, approximately meters. Additional dangers are present at the scale of trucks and factories. The interaction of various entities, such as barrels of fertilizer or crates of detonators, admits the possibility of a sympathetic detonation, i.e. the unintended detonation of one entity by the explosion of another, generally caused by an explosive shock wave or blast fragments.

While experimental work has been and will continue to be critical to developing our fundamental understanding of explosive initiation, deflagration and detonation, there is no practical way to comprehensively assess safety on the scale of trucks and factories experimentally. The scenarios are too diverse and the costs too great. Numerical simulation provides a complementary tool that, with the steadily increasing computational power of the past decades, makes simulations at this scale begin to look plausible. Simulations at both the micrometer scale, the "mesoscale", and at the scale of engineered parts, the "macro-scale", have been contributing increasingly to our understanding of these materials. Still, simulations on this scale require both massively parallel computational infrastructure and selective sampling of mesoscale response, i.e. advanced computational tools and modeling. The computational framework Uintah [1] has been developed for exactly this purpose.

Keywords: uintah, c-safe, accidents, explosions, military computing, risk analysis

Rethinking Abstractions for Big Data: Why, Where, How, and What
M. Hall, R.M. Kirby, F. Li, M.D. Meyer, V. Pascucci, J.M. Phillips, R. Ricci, J. Van der Merwe, S. Venkatasubramanian. In Cornell University Library, 2013.

Big data refers to large and complex data sets that, under existing approaches, exceed the capacity and capability of current compute platforms, systems software, analytical tools and human understanding [7]. Numerous lessons on the scalability of big data can already be found in asymptotic analysis of algorithms and from the high-performance computing (HPC) and applications communities. However, scale is only one aspect of current big data trends; fundamentally, current and emerging problems in big data are a result of unprecedented complexity: in the structure of the data and how to analyze it, in dealing with unreliability and redundancy, in addressing the human factors of comprehending complex data sets, in formulating meaningful analyses, and in managing the dense, power-hungry data centers that house big data.

The computer science solution to complexity is finding the right abstractions, those that hide as much triviality as possible while revealing the essence of the problem that is being addressed. The "big data challenge" has disrupted computer science by stressing to the very limits the familiar abstractions which define the relevant subfields in data analysis, data management and the underlying parallel systems. Efficient processing of big data has shifted systems towards increasingly heterogeneous and specialized units, with resilience and energy becoming important considerations. The design and analysis of algorithms must now incorporate emerging costs in communicating data driven by IO costs, distributed data, and the growing energy cost of these operations. Data analysis representations as structural patterns and visualizations surpass human visual bandwidth, structures studied at small scale are rare at large scale, and large-scale high-dimensional phenomena cannot be reproduced at small scale.

As a result, not enough of these challenges are revealed by isolating abstractions in a traditional software stack or standard algorithmic and analytical techniques, and attempts to address complexity either oversimplify or require low-level management of details. The authors believe that the abstractions for big data need to be rethought, and this reorganization needs to evolve and be sustained through continued cross-disciplinary collaboration.

In what follows, we first consider the question of why big data and why now. We then describe the where (big data systems), the how (big data algorithms), and the what (big data analytics) challenges that we believe are central and must be addressed as the research community develops these new abstractions. We equate the biggest challenges that span these areas of big data with big mythological creatures, namely cyclops, that should be conquered.

Statistical Shape Modeling of Cam Femoroacetabular Impingement
M.D. Harris, M. Datar, R.T. Whitaker, E.R. Jurrus, C.L. Peters, A.E. Anderson. In Journal of Orthopaedic Research, Vol. 31, No. 10, pp. 1620--1626. 2013.
DOI: 10.1002/jor.22389

Statistical shape modeling (SSM) was used to quantify 3D variation and morphologic differences between femurs with and without cam femoroacetabular impingement (FAI). 3D surfaces were generated from CT scans of femurs from 41 controls and 30 cam FAI patients. SSM correspondence particles were optimally positioned on each surface using a gradient descent energy function. Mean shapes for groups were defined. Morphological differences between group mean shapes and between the control mean and individual patients were calculated. Principal component analysis described anatomical variation. Among all femurs, the first six modes (or principal components) captured significant variations, which comprised 84% of cumulative variation. The first two modes, which described trochanteric height and femoral neck width, were significantly different between groups. The mean cam femur shape protruded above the control mean by a maximum of 3.3 mm with sustained protrusions of 2.5–3.0 mm along the anterolateral head-neck junction/distal anterior neck. SSM described variations in femoral morphology that corresponded well with areas prone to damage. Shape variation described by the first two modes may facilitate objective characterization of cam FAI deformities; variation beyond may be inherent population variance. SSM could characterize disease severity and guide surgical resection of bone.
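
The statement that the first six modes comprised 84% of cumulative variation follows from the eigenvalues of the principal component analysis. The sketch below shows that arithmetic only; the eigenvalues are hypothetical numbers chosen so that six modes reach 84%, not values from the study, and `modes_for_variance` is an illustrative name.

```python
from itertools import accumulate


def modes_for_variance(eigenvalues, target_fraction):
    """Return (number of leading modes, fraction explained) reaching the target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for count, running in enumerate(accumulate(ordered), start=1):
        if running / total >= target_fraction:
            return count, running / total
    return len(ordered), 1.0


# Hypothetical PCA eigenvalues (variance per mode), summing to 100.
eigenvalues = [40, 20, 10, 6, 5, 3, 3, 3, 3, 3, 2, 2]

n_modes, explained = modes_for_variance(eigenvalues, 0.84)
print(n_modes, explained)
```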

A High-Performance Multi-Element Processing Framework on GPUs
SCI Technical Report, L.K. Ha, J. King, Z. Fu, R.M. Kirby. No. UUSCI-2013-005, SCI Institute, University of Utah, 2013.

Many computational engineering problems, ranging from finite element methods to image processing, involve the batch processing of a large number of data items. While multi-element processing has the potential to harness the computational power of parallel systems, current techniques often concentrate on maximizing elemental performance. Frameworks that take this greedy optimization approach often fail to extract the maximum processing power of the system for multi-element processing problems. By utilizing the knowledge that the same operation will be accomplished on a large number of items, we can organize the computation to maximize the computational throughput available in parallel streaming hardware. In this paper, we analyze the weaknesses of existing methods and propose efficient parallel programming patterns, implemented in a high-performance multi-element processing framework, to harness the processing power of GPUs. Our approach is capable of leveling out the performance curve even in the range of small element sizes.

Lateral ventricle morphology analysis via mean latitude axis
B. Paniagua, A. Lyall, J.-B. Berger, C. Vachet, R.M. Hamer, S. Woolson, W. Lin, J. Gilmore, M. Styner. In Proceedings of SPIE 8672, Biomedical Applications in Molecular, Structural, and Functional Imaging, 86720M, 2013.
DOI: 10.1117/12.2006846
PubMed ID: 23606800
PubMed Central ID: PMC3630372

Statistical shape analysis has emerged as an insightful method for evaluating brain structures in neuroimaging studies; however, most shape frameworks are surface based and thus directly depend on the quality of surface alignment. In contrast, medial descriptions employ thickness information as an alignment-independent shape metric. We propose a joint framework that computes local medial thickness information via a mean latitude axis from the well-known spherical harmonic (SPHARM-PDM) shape framework. In this work, we applied SPHARM-derived medial representations to the morphological analysis of lateral ventricles in neonates. Mild ventriculomegaly (MVM) subjects are compared to healthy controls to highlight the potential of the methodology. Lateral ventricles were obtained from MRI scans of neonates (9-144 days of age) from 30 MVM subjects as well as age- and sex-matched normal controls (60 total). SPHARM-PDM shape analysis was extended to compute a mean latitude axis directly from the spherical parameterization. Local thickness and area were straightforwardly determined. MVM and healthy controls were compared using local MANOVA and compared with the traditional SPHARM-PDM analysis. Both surface and mean latitude axis findings successfully differentiate MVM and healthy lateral ventricle morphology. Lateral ventricles in MVM neonates show enlarged shapes in tail and head. The mean latitude axis is able to find significant differences all along the lateral ventricle shape, demonstrating that local thickness analysis provides significant insight over traditional SPHARM-PDM. This study is the first to precisely quantify 3D lateral ventricle morphology in MVM neonates using shape analysis.

Modeling 4D changes in pathological anatomy using domain adaptation: analysis of TBI imaging using a tumor database
Bo Wang, M. Prastawa, A. Saha, S.P. Awate, A. Irimia, M.C. Chambers, P.M. Vespa, J.D. Van Horn, V. Pascucci, G. Gerig. In Proceedings of the 2013 MICCAI-MBIA Workshop, Lecture Notes in Computer Science (LNCS), Vol. 8159, Note: Awarded Best Paper!, pp. 31--39. 2013.
DOI: 10.1007/978-3-319-02126-3_4

Analysis of 4D medical images presenting pathology (i.e., lesions) is significantly challenging due to the presence of complex changes over time. Image analysis methods for 4D images with lesions need to account for changes in brain structures due to deformation, as well as the formation and deletion of new structures (e.g., edema, bleeding) due to the physiological processes associated with damage, intervention, and recovery. We propose a novel framework that models 4D changes in pathological anatomy across time, and provides explicit mapping from a healthy template to subjects with pathology. Moreover, our framework uses transfer learning to leverage rich information from a known source domain, where we have a collection of completely segmented images, to yield effective appearance models for the input target domain. The automatic 4D segmentation method uses a novel domain adaptation technique for generative kernel density models to transfer information between different domains, resulting in a fully automatic method that requires no user interaction. We demonstrate the effectiveness of our novel approach with the analysis of 4D images of traumatic brain injury (TBI), using a synthetic tumor database as the source domain.

Investigating Applications Portability with the Uintah DAG-based Runtime System on PetaScale Supercomputers
SCI Technical Report, Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. No. UUSCI-2013-003, SCI Institute, University of Utah, 2013.

Present trends in high performance computing present formidable challenges for applications code using multicore nodes possibly with accelerators and/or co-processors and reduced memory while still attaining scalability. Software frameworks that execute machine-independent applications code using a runtime system that shields users from architectural complexities offer a possible solution. The Uintah framework, for example, solves a broad class of large-scale problems on structured adaptive grids using fluid-flow solvers coupled with particle-based solids methods. Uintah executes directed acyclic graphs of computational tasks with a scalable asynchronous and dynamic runtime system for CPU cores and/or accelerators/co-processors on a node. Uintah's clear separation between application and runtime code has led to scalability increases of 1000x without significant changes to application code. This methodology is tested on three leading Top500 machines (OLCF Titan, TACC Stampede and ALCF Mira) using three diverse and challenging applications problems. This investigation of scalability with regard to the different processors and communications performance leads to the overall conclusion that the adaptive DAG-based approach provides a very powerful abstraction for solving challenging multi-scale multi-physics engineering problems on some of the largest and most powerful computers available today.

Keywords: Uintah, hybrid parallelism, scalability, parallel, adaptive, MIC, Xeon Phi, heterogeneous systems, Stampede, co-processor

A new discrete element analysis method for predicting hip joint contact stresses
C.L. Abraham, S.A. Maas, J.A. Weiss, B.J. Ellis, C.L. Peters, A.E. Anderson. In Journal of Biomechanics, Vol. 46, No. 6, pp. 1121--1127. 2013.
DOI: 10.1016/j.jbiomech.2013.01.012

Quantifying cartilage contact stress is paramount to understanding hip osteoarthritis. Discrete element analysis (DEA) is a computationally efficient method to estimate cartilage contact stresses. Previous applications of DEA have underestimated cartilage stresses and yielded unrealistic contact patterns because they assumed constant cartilage thickness and/or concentric joint geometry. The study objectives were to: (1) develop a DEA model of the hip joint with subject-specific bone and cartilage geometry, (2) validate the DEA model by comparing DEA predictions to those of a validated finite element analysis (FEA) model, and (3) verify both the DEA and FEA models with a linear-elastic boundary value problem. Springs representing cartilage in the DEA model were given lengths equivalent to the sum of acetabular and femoral cartilage thickness and gap distance in the FEA model. Material properties and boundary/loading conditions were equivalent. Walking, descending, and ascending stairs were simulated. Solution times for DEA and FEA models were ∼7 s and ∼65 min, respectively. Irregular, complex contact patterns predicted by DEA were in excellent agreement with FEA. DEA contact areas were 7.5%, 9.7% and 3.7% less than FEA for walking, descending stairs, and ascending stairs, respectively. DEA models predicted higher peak contact stresses (9.8–13.6 MPa) and average contact stresses (3.0–3.7 MPa) than FEA (6.2–9.8 and 2.0–2.5 MPa, respectively). DEA overestimated stresses due to the absence of the Poisson's effect and a direct contact interface between cartilage layers. Nevertheless, DEA predicted realistic contact patterns when subject-specific bone geometry and cartilage thickness were used. This DEA method may have application as an alternative to FEA for pre-operative planning of joint-preserving surgery such as acetabular reorientation during peri-acetabular osteotomy.
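
The spring representation of cartilage described in this abstract can be sketched as follows. The sketch assumes a spring law often used in discrete element analysis, in which each compressed spring carries a stress equal to a constrained modulus times penetration divided by spring length; the abstract does not give the exact formula, and the material values, spring data and function names below are hypothetical.

```python
def constrained_modulus(youngs_modulus_mpa, poisson_ratio):
    """Effective modulus of a laterally confined elastic layer (assumed spring law)."""
    e, nu = youngs_modulus_mpa, poisson_ratio
    return (1 - nu) * e / ((1 + nu) * (1 - 2 * nu))


def spring_stress(penetration_mm, spring_length_mm, modulus_mpa):
    """Compressive stress in one spring; springs carry no load in tension."""
    if penetration_mm <= 0:
        return 0.0
    return modulus_mpa * penetration_mm / spring_length_mm


def summarize_contact(springs, modulus_mpa):
    """springs holds (penetration_mm, spring_length_mm, area_mm2) tuples.

    Returns (contact area, peak stress, area-weighted average stress),
    counting only springs that are in compression.
    """
    loaded = [(spring_stress(p, h, modulus_mpa), a) for p, h, a in springs if p > 0]
    area = sum(a for _, a in loaded)
    peak = max((s for s, _ in loaded), default=0.0)
    average = sum(s * a for s, a in loaded) / area if area else 0.0
    return area, peak, average


# Hypothetical values: E = 12 MPa, Poisson's ratio = 0.45, three springs.
modulus = constrained_modulus(12.0, 0.45)
springs = [(0.2, 3.0, 1.0), (0.1, 3.0, 1.0), (-0.05, 3.0, 1.0)]

area, peak, average = summarize_contact(springs, modulus)
print(area, round(peak, 3), round(average, 3))
```

Each spring is evaluated independently and no system of equations over a continuum mesh is assembled in this sketch, which is one reason a method of this kind can be much cheaper than finite element analysis.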

Three-dimensional Quantification of Femoral Head Shape in Controls and Patients with Cam-type Femoroacetabular Impingement
M.D. Harris, S.P. Reese, C.L. Peters, J.A. Weiss, A.E. Anderson. In Annals of Biomedical Engineering, Vol. 41, No. 6, pp. 1162--1171. 2013.
DOI: 10.1007/s10439-013-0762-1

An objective measurement technique to quantify 3D femoral head shape was developed and applied to normal subjects and patients with cam-type femoroacetabular impingement (FAI). 3D reconstructions were made from high-resolution CT images of 15 cam and 15 control femurs. Femoral heads were fit to ideal geometries consisting of rotational conchoids and spheres. Geometric similarity between native femoral heads and ideal shapes was quantified. The maximum distance native femoral heads protruded above ideal shapes and the protrusion area were measured. Conchoids provided a significantly better fit to native femoral head geometry than spheres for both groups. Cam-type FAI femurs had significantly greater maximum deviations (4.99 ± 0.39 mm and 4.08 ± 0.37 mm) than controls (2.41 ± 0.31 mm and 1.75 ± 0.30 mm) when fit to spheres or conchoids, respectively. The area of native femoral heads protruding above ideal shapes was significantly larger in controls when a lower threshold of 0.1 mm (for spheres) and 0.01 mm (for conchoids) was used to define a protrusion. The 3D measurement technique described herein could supplement measurements of radiographs in the diagnosis of cam-type FAI. Deviations up to 2.5 mm from ideal shapes can be expected in normal femurs while deviations of 4–5 mm are characteristic of cam-type FAI.

Synergistic Challenges in Data-Intensive Science and Exascale Computing
J. Chen, A. Choudhary, S. Feldman, B. Hendrickson, C.R. Johnson, R. Mount, V. Sarkar, V. White, D. Williams. Note: Summary Report of the Advanced Scientific Computing Advisory Committee (ASCAC) Subcommittee, March, 2013.

The ASCAC Subcommittee on Synergistic Challenges in Data-Intensive Science and Exascale Computing has reviewed current practice and future plans in multiple science domains in the context of the challenges facing both Big Data and Exascale Computing. The review drew from public presentations, workshop reports and expert testimony. Data-intensive research activities are increasing in all domains of science, and exascale computing is a key enabler of these activities. We briefly summarize below the key findings and recommendations from this report from the perspective of identifying investments that are most likely to positively impact both data-intensive science goals and exascale computing goals.

Preliminary Experiences with the Uintah Framework on Intel Xeon Phi and Stampede
SCI Technical Report, Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. No. UUSCI-2013-002, SCI Institute, University of Utah, 2013.

In this work, we describe our preliminary experiences on the Stampede system in the context of the Uintah Computational Framework. Uintah was developed to provide an environment for solving a broad class of fluid-structure interaction problems on structured adaptive grids. Uintah uses a combination of fluid-flow solvers and particle-based methods, together with a novel asynchronous task-based approach and fully automated load balancing. While we have designed scalable Uintah runtime systems for large CPU core counts, the emergence of heterogeneous systems presents considerable challenges in terms of effectively utilizing additional on-node accelerators and co-processors and deep memory hierarchies, as well as managing multiple levels of parallelism. Our recent work has addressed the emergence of heterogeneous CPU/GPU systems with the design of a Unified heterogeneous runtime system, enabling Uintah to fully exploit these architectures with support for asynchronous, out-of-order scheduling of both CPU and GPU computational tasks. Using this design, Uintah has run at full scale on the Keeneland System and TitanDev. With the release of the Intel Xeon Phi co-processor and the recent availability of the Stampede system, we show that Uintah may be modified to utilize such a co-processor-based system. We also explore the different usage models provided by the Xeon Phi with the aim of understanding the portability of a general-purpose framework like Uintah to this architecture. These usage models range from the pragma-based offload model to the more complex symmetric model, utilizing all co-processor and host CPU cores simultaneously. We provide preliminary results of the various usage models for a challenging adaptive mesh refinement problem, as well as a detailed account of our experience adapting Uintah to run on the Stampede system.
Our conclusion is that while the Stampede system is easy to use, obtaining high performance from the Xeon Phi co-processors requires a substantial but different investment from that needed for GPU-based systems.

Keywords: Uintah, hybrid parallelism, scalability, parallel, adaptive, MIC, Xeon Phi, heterogeneous systems, Stampede, co-processor

Applying high-performance computing to petascale explosive simulations
J.R. Peterson, C.A. Wight, M. Berzins. In Procedia Computer Science, 2013.

Hazardous scenarios involving explosives are difficult to study experimentally, and simulation is often the only viable approach to studying highly reactive phenomena. Explosive simulations are computationally expensive, requiring supercomputing resources for continued scientific discovery in the field. Here, an idealized mesoscale simulation of explosive grains under mechanical insult by a high-speed projectile, with reaction represented by a novel kinetic model, is designed to test the scalability of the Uintah software on petascale supercomputers. Good scalability is found up to 49K processors. A timing breakdown of computational tasks is determined, with relocation of Lagrangian particles and interpolation of those particles to the grid identified as the most expensive operations and ideal targets for optimization. Potential optimization strategies are identified. Realistic model simulations, rather than toy model simulations, are found to better represent the scalability of a science code on a supercomputer. Estimates of the total supercomputer hours necessary to complete the kinetic model validation study are reported.

Keywords: Energetic Material Hazards, Uintah, MPM, ICE, MPMICE, Scalable Parallelism, C-SAFE
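Scalability results such as the one reported in the abstract above are commonly summarized as parallel efficiency. The sketch below shows one way that figure might be computed from wall-clock timings, assuming a strong-scaling study (fixed problem size); the abstract does not state which kind of scaling was measured, and the function name `strong_scaling_efficiency` is hypothetical. No timings from the paper are used.

```python
def strong_scaling_efficiency(timings):
    """Parallel efficiency relative to the smallest core count.

    timings: dict mapping core count -> wall-clock seconds for the same
    (fixed-size) problem. Efficiency 1.0 means ideal speedup.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    return {
        cores: (base_time * base_cores) / (seconds * cores)
        for cores, seconds in sorted(timings.items())
    }


# Made-up timings, for illustration only (not measurements from the paper).
example = strong_scaling_efficiency({1: 100.0, 2: 50.0, 4: 30.0})
```

An efficiency that stays close to 1.0 as the core count grows is what is usually meant by good scalability.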

