P. Subedi, P. E. Davis, M. Parashar. RISE: Reducing I/O Contention in Staging-based Extreme-Scale In-situ Workflows, In 2021 IEEE International Conference on Cluster Computing (CLUSTER), pp. 146--156. 2021.
While in-situ workflow formulations have addressed some of the data-related challenges associated with extreme-scale scientific workflows, these workflows involve complex interactions and different modes of data exchange. In the context of increasing system complexity, such workflows present significant resource management challenges, requiring complex cost-performance tradeoffs. This paper presents RISE, an intelligent staging-based data management middleware, which builds on the DataSpaces framework and performs intelligent scheduling of data management operations to reduce I/O contention. In RISE, data are always written immediately to local buffers to reduce the effect of the transfer impact upon application performance. RISE identifies applications’ data access patterns and moves data towards data consumers only when the network is expected to be idle, reducing the impact of asynchronous …
E. Suchyta, S. Klasky, N. Podhorszki, M. Wolf, A. Adesoji, C.S. Chang, J. Choi, P. E. Davis, J. Dominski, S. Ethier, I. Foster, K. Germaschewski, B. Geveci, C. Harris, K. A. Huck, Q. Liu, J. Logan, K. Mehta, G. Merlo, S. V. Moore, T. Munson, M. Parashar, D. Pugmire, M. S. Shephard, C. W. Smith, P. Subedi, L. Wan, R. Wang, S. Zhang. The Exascale Framework for High Fidelity coupled Simulations (EFFIS): Enabling whole device modeling in fusion science, In The International Journal of High Performance Computing Applications, SAGE Publications, pp. 10943420211019119. 2021.
We present the Exascale Framework for High Fidelity coupled Simulations (EFFIS), a workflow and code coupling framework developed as part of the Whole Device Modeling Application (WDMApp) in the Exascale Computing Project. EFFIS consists of a library, command line utilities, and a collection of run-time daemons. Together, these software products enable users to easily compose and execute workflows that include: strong or weak coupling, in situ (or offline) analysis/visualization/monitoring, command-and-control actions, remote dashboard integration, and more. We describe WDMApp physics coupling cases and computer science requirements that motivate the design of the EFFIS framework. Furthermore, we explain the essential enabling technology that EFFIS leverages: ADIOS for performant data movement, PerfStubs/TAU for performance monitoring, and an advanced COUPLER for transforming coupling data from its native format to the representation needed by another application. Finally, we demonstrate EFFIS using coupled multi-simulation WDMApp workflows and exemplify how the framework supports the project’s needs. We show that EFFIS and its associated services for data movement, visualization, and performance collection do not introduce appreciable overhead to the WDMApp workflow and that the resource-dominant application’s idle time while waiting for data is minimal.
T. Sun, D. Li, B. Wang. Decentralized Federated Averaging, arXiv preprint arXiv:2104.11375, 2021.
Federated averaging (FedAvg) is a communication efficient algorithm for the distributed training with an enormous number of clients. In FedAvg, clients keep their data locally for privacy protection; a central parameter server is used to communicate between clients. This central server distributes the parameters to each client and collects the updated parameters from clients. FedAvg is mostly studied in centralized fashions, which requires massive communication between server and clients in each communication. Moreover, attacking the central server can break the whole system's privacy. In this paper, we study the decentralized FedAvg with momentum (DFedAvgM), which is implemented on clients that are connected by an undirected graph. In DFedAvgM, all clients perform stochastic gradient descent with momentum and communicate with their neighbors only. To further reduce the communication cost, we also consider the quantized DFedAvgM. We prove convergence of the (quantized) DFedAvgM under trivial assumptions; the convergence rate can be improved when the loss function satisfies the PŁ property. Finally, we numerically verify the efficacy of DFedAvgM.
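The round structure described in this abstract (local momentum steps, then averaging with graph neighbors only) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy quadratic losses, the exact (non-stochastic) gradients, the ring topology with uniform 1/3 weights, and all function names and hyperparameters are assumptions chosen to keep the example small and deterministic.

```python
def local_momentum_steps(x, target, lr, momentum, num_steps):
    """Run heavy-ball gradient steps on the toy loss 0.5 * (x - target)**2."""
    prev = x
    for _ in range(num_steps):
        grad = x - target  # exact gradient; a real client would use a minibatch estimate
        x, prev = x - lr * grad + momentum * (x - prev), x
    return x


def ring_average(values):
    """Mix each client with its two ring neighbors using uniform weights of 1/3."""
    n = len(values)
    return [(values[i - 1] + values[i] + values[(i + 1) % n]) / 3.0 for i in range(n)]


def dfedavgm(targets, rounds=50, lr=0.1, momentum=0.5, local_steps=5):
    """Alternate local momentum training and neighbor-only averaging (no central server)."""
    params = [0.0] * len(targets)
    for _ in range(rounds):
        params = [
            local_momentum_steps(x, t, lr, momentum, local_steps)
            for x, t in zip(params, targets)
        ]
        params = ring_average(params)
    return params


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 6.0]  # hypothetical per-client optima
    final = dfedavgm(targets)
    print(sum(final) / len(final), sum(targets) / len(targets))
```

In this toy setting the network-wide average of the client parameters approaches the minimizer of the summed loss (the mean of the targets), while individual clients keep a small disagreement because each one is pulled toward its own local optimum between mixing steps.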
T. Sun, D. Li, B. Wang. Stability and Generalization of the Decentralized Stochastic Gradient Descent, arXiv preprint arXiv:2102.01302, 2021.
The stability and generalization of stochastic gradient-based methods provide valuable insights into understanding the algorithmic performance of machine learning models. As the main workhorse for deep learning, stochastic gradient descent has been studied extensively. Nevertheless, the community has paid little attention to its decentralized variants. In this paper, we provide a novel formulation of the decentralized stochastic gradient descent. Leveraging this formulation together with (non) convex optimization theory, we establish the first stability and generalization guarantees for the decentralized stochastic gradient descent. Our theoretical results are built on top of a few common and mild assumptions and reveal, for the first time, that decentralization deteriorates the stability of SGD. We verify our theoretical findings by using a variety of decentralized settings and benchmark machine learning models.
A Gaussian Process Model for Unsupervised Analysis of High Dimensional Shape Data, In Machine Learning in Medical Imaging, Springer International Publishing, pp. 356--365. 2021.
Applications of medical image analysis are often faced with the challenge of modelling high-dimensional data with relatively few samples. In many settings, normal or healthy samples are prevalent while pathological samples are rarer, highly diverse, and/or difficult to model. In such cases, a robust model of the normal population in the high-dimensional space can be useful for characterizing pathologies. In this context, there is utility in hybrid models, such as probabilistic PCA, which learns a low-dimensional model, commensurate with the available data, and combines it with a generic, isotropic noise model for the remaining dimensions. However, the isotropic noise model ignores the inherent correlations that are evident in so many high-dimensional data sets associated with images and shapes in medicine. This paper describes a method for estimating a Gaussian model for collections of images or shapes that exhibit underlying correlations, e.g., in the form of smoothness. The proposed method incorporates a Gaussian-process noise model within a generative formulation. For optimization, we derive a novel expectation maximization (EM) algorithm. We demonstrate the efficacy of the method on synthetic examples and on anatomical shape data.
Uncertainty Quantification of the Effects of Segmentation Variability in ECGI, In Functional Imaging and Modeling of the Heart, Springer International Publishing, pp. 515--522. 2021.
Despite advances in many of the techniques used in Electrocardiographic Imaging (ECGI), uncertainty remains insufficiently quantified for many aspects of the pipeline. The effect of geometric uncertainty, particularly due to segmentation variability, may be the least explored to date. We use statistical shape modeling and uncertainty quantification (UQ) to compute the effect of segmentation variability on ECGI solutions. The shape model was made with Shapeworks from nine segmentations of the same patient and incorporated into an ECGI pipeline. We computed uncertainty of the pericardial potentials and local activation times (LATs) using polynomial chaos expansion (PCE) implemented in UncertainSCI. Uncertainty in pericardial potentials from segmentation variation mirrored areas of high variability in the shape model, near the base of the heart and the right ventricular outflow tract, and ECGI was less sensitive to uncertainty in the posterior region of the heart. Subsequently, LAT calculations could vary dramatically due to segmentation variability, with a standard deviation as high as 126 ms, yet mainly in regions with low conduction velocity. Our shape modeling and UQ pipeline presented possible uncertainty in ECGI due to segmentation variability and can be used by researchers to reduce said uncertainty or mitigate its effects. The demonstrated use of statistical shape modeling and UQ can also be extended to other types of modeling pipelines.
J. Tate, S. Rampersad, C. Charlebois, Z. Liu, J. Bergquist, D. White, L. Rupp, D. Brooks, A. Narayan, R. MacLeod. Uncertainty Quantification in Brain Stimulation using UncertainSCI, In Brain Stimulation: Basic, Translational, and Clinical Research in Neuromodulation, Vol. 14, No. 6, Elsevier, pp. 1659-1660. 2021.
Predicting the effects of brain stimulation with computer models presents many challenges, including estimating the possible error from the propagation of uncertain input parameters through the model. Quantification and control of these errors through uncertainty quantification (UQ) provide statistics on the likely impact of parameter variation on solution accuracy, including total variance and sensitivity associated with each parameter. While the need for and importance of UQ in clinical modeling are generally accepted, tools for implementing UQ techniques remain limited or inaccessible for many researchers.
M. Thorpe, B. Wang. Robust Certification for Laplace Learning on Geometric Graphs, arXiv preprint arXiv:2104.10837, 2021.
Graph Laplacian (GL)-based semi-supervised learning is one of the most used approaches for classifying nodes in a graph. Understanding and certifying the adversarial robustness of machine learning (ML) algorithms has attracted large amounts of attention from different research communities due to its crucial importance in many security-critical applied domains. There is great interest in the theoretical certification of adversarial robustness for popular ML algorithms. In this paper, we provide the first adversarial robust certification for the GL classifier. More precisely, we quantitatively bound the difference in the classification accuracy of the GL classifier before and after an adversarial attack. Numerically, we validate our theoretical certification results and show that leveraging existing adversarial defenses for the k-nearest neighbor classifier can remarkably improve the robustness of the GL classifier.
J. P. Torres, Z. Lin, M. Watkins, P. F. Salcedo, R. P. Baskin, S. Elhabian, H. Safavi-Hemami, D. Taylor, J. Tun, G. P. Concepcion, N. Saguil, A. A. Yanagihara, Y. Fang, J. R. McArthur, H. Tae, R. K. Finol-Urdaneta, B. D. Özpolat, B. M. Olivera, E. W. Schmidt. Small-molecule mimicry hunting strategy in the imperial cone snail, Conus imperialis, In Science Advances, Vol. 7, No. 11, American Association for the Advancement of Science, 2021.
Venomous animals hunt using bioactive peptides, but relatively little is known about venom small molecules and the resulting complex hunting behaviors. Here, we explored the specialized metabolites from the venom of the worm-hunting cone snail, Conus imperialis. Using the model polychaete worm Platynereis dumerilii, we demonstrate that C. imperialis venom contains small molecules that mimic natural polychaete mating pheromones, evoking the mating phenotype in worms. The specialized metabolites from different cone snails are species-specific and structurally diverse, suggesting that the cones may adopt many different prey-hunting strategies enabled by small molecules. Predators sometimes attract prey using the prey’s own pheromones, in a strategy known as aggressive mimicry. Instead, C. imperialis uses metabolically stable mimics of those pheromones, indicating that, in biological mimicry, even the molecules themselves may be disguised, providing a twist on fake news in chemical ecology.
N. Truong, C. Yuksel, C. Watcharopas, J. A. Levine, R. M. Kirby. Particle Merging-and-Splitting, In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2021.
Robustly handling collisions between individual particles in a large particle-based simulation has been a challenging problem. We introduce particle merging-and-splitting, a simple scheme for robustly handling collisions between particles that prevents inter-penetrations of separate objects without introducing numerical instabilities. This scheme merges colliding particles at the beginning of the time-step and then splits them at the end of the time-step. Thus, collisions last for the duration of a time-step, allowing neighboring particles of the colliding particles to influence each other. We show that our merging-and-splitting method is effective in robustly handling collisions and avoiding penetrations in particle-based simulations. We also show how our merging-and-splitting approach can be used for coupling different simulation systems using different and otherwise incompatible integrators. We present simulation tests …
W. Usher, X. Huang, S. Petruzza, S. Kumar, S. R. Slattery, S. T. Reeve, F. Wang, C. R. Johnson, V. Pascucci. Adaptive Spatially Aware I/O for Multiresolution Particle Data Layouts, In IPDPS, 2021.
V. Vedam-Mai, K. Deisseroth, J. Giordano, G. Lazaro-Munoz, W. Chiong, N. Suthana, J. Langevin, J. Gill, W. Goodman, N. R. Provenza, C. H. Halpern, R. S. Shivacharan, T. N. Cunningham, S. A. Sheth, N. Pouratian, K. W. Scangos, H. S. Mayberg, A. Horn, K. A. Johnson, C. R. Butson, R. Gilron, C. de Hemptinne, R. Wilt, M. Yaroshinsky, S. Little, P. Starr, G. Worrell, P. Shirvalkar, E. Chang, J. Volkmann, M. Muthuraman, S. Groppa, A. A. Kühn, L. Li, M. Johnson, K. J. Otto, R. Raike, S. Goetz, C. Wu, P. Silburn, B. Cheeran, Y. J. Pathak, M. Malekmohammadi, A. Gunduz, J. K. Wong, S. Cernera, A. W. Shukla, A. Ramirez-Zamora, W. Deeb, A. Patterson, K. D. Foote, M. S. Okun. Proceedings of the Eighth Annual Deep Brain Stimulation Think Tank: Advances in Optogenetics, Ethical Issues Affecting DBS Research, Neuromodulatory Approaches for Depression, Adaptive Neurostimulation, and Emerging DBS Technologies, In Frontiers in Human Neuroscience, Vol. 15, pp. 169. 2021.
We estimate that 208,000 deep brain stimulation (DBS) devices have been implanted to address neurological and neuropsychiatric disorders worldwide. DBS Think Tank presenters pooled data and determined that DBS expanded in its scope and has been applied to multiple brain disorders in an effort to modulate neural circuitry. The DBS Think Tank was founded in 2012 providing a space where clinicians, engineers, researchers from industry and academia discuss current and emerging DBS technologies and logistical and ethical issues facing the field. The emphasis is on cutting edge research and collaboration aimed to advance the DBS field. The Eighth Annual DBS Think Tank was held virtually on September 1 and 2, 2020 (Zoom Video Communications) due to restrictions related to the COVID-19 pandemic. The meeting focused on advances in: (1) optogenetics as a tool for comprehending neurobiology of diseases and on optogenetically-inspired DBS, (2) cutting edge of emerging DBS technologies, (3) ethical issues affecting DBS research and access to care, (4) neuromodulatory approaches for depression, (5) advancing novel hardware, software and imaging methodologies, (6) use of neurophysiological signals in adaptive neurostimulation, and (7) use of more advanced technologies to improve DBS clinical outcomes. There were 178 attendees who participated in a DBS Think Tank survey, which revealed the expansion of DBS into several indications such as obesity, post-traumatic stress disorder, addiction and Alzheimer’s disease. This proceedings summarizes the advances discussed at the Eighth Annual DBS Think Tank.
A. Venkat, A. Gyulassy, G. Kosiba, A. Maiti, H. Reinstein, R. Gee, P.-T. Bremer, V. Pascucci. Towards replacing physical testing of granular materials with a Topology-based Model, arXiv preprint arXiv:2109.08777, 2021.
In the study of packed granular materials, the performance of a sample (e.g., the detonation of a high-energy explosive) often correlates to measurements of a fluid flowing through it. The "effective surface area," the surface area accessible to the airflow, is typically measured using a permeametry apparatus that relates the flow conductance to the permeable surface area via the Carman-Kozeny equation. This equation allows calculating the flow rate of a fluid flowing through the granules packed in the sample for a given pressure drop. However, Carman-Kozeny makes inherent assumptions about tunnel shapes and flow paths that may not accurately hold in situations where the particles possess a wide distribution in shapes, sizes, and aspect ratios, as is true with many powdered systems of technological and commercial interest. To address this challenge, we replicate these measurements virtually on micro-CT images of the powdered material, introducing a new Pore Network Model based on the skeleton of the Morse-Smale complex. Pores are identified as basins of the complex, their incidence encodes adjacency, and the conductivity of the capillary between them is computed from the cross-section at their interface. We build and solve a resistive network to compute an approximate laminar fluid flow through the pore structure. We provide two means of estimating flow-permeable surface area: (i) by direct computation of conductivity, and (ii) by identifying dead-ends in the flow coupled with isosurface extraction and the application of the Carman-Kozeny equation, with the aim of establishing consistency over a range of particle shapes, sizes, porosity levels, and void distribution patterns.
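The resistive-network step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: pores are nodes, capillaries are edges with given conductances, pressure is fixed at an inlet and an outlet, and interior pressures are found by Gauss-Seidel sweeps of the flow-balance (Kirchhoff) condition. The function names, the small diamond-shaped network, and the conductance values are hypothetical; in the paper the conductances come from pore cross-sections in micro-CT data, and every interior pore is assumed here to have at least one capillary.

```python
def solve_pressures(conductance, fixed, num_nodes, sweeps=200):
    """Solve for pore pressures so that net flow into every interior pore is zero.

    conductance: dict mapping an undirected edge (i, j) to its conductance
    fixed:       dict mapping boundary node index to its prescribed pressure
    """
    neighbors = {i: [] for i in range(num_nodes)}
    for (i, j), g in conductance.items():
        neighbors[i].append((j, g))
        neighbors[j].append((i, g))

    pressure = [fixed.get(i, 0.0) for i in range(num_nodes)]
    for _ in range(sweeps):
        for i in range(num_nodes):
            if i in fixed:
                continue  # boundary pressures stay at their prescribed values
            total_g = sum(g for _, g in neighbors[i])
            # Flow balance: pressure is the conductance-weighted mean of the neighbors
            pressure[i] = sum(g * pressure[j] for j, g in neighbors[i]) / total_g
    return pressure, neighbors


def inlet_flow(conductance, fixed, num_nodes, inlet):
    """Total flow leaving the inlet pore for the solved pressure field."""
    pressure, neighbors = solve_pressures(conductance, fixed, num_nodes)
    return sum(g * (pressure[inlet] - pressure[j]) for j, g in neighbors[inlet])


if __name__ == "__main__":
    # Hypothetical diamond network: two parallel paths from pore 0 to pore 3
    edges = {(0, 1): 2.0, (1, 3): 2.0, (0, 2): 1.0, (2, 3): 1.0}
    boundary = {0: 1.0, 3: 0.0}  # unit pressure drop from inlet to outlet
    print(inlet_flow(edges, boundary, num_nodes=4, inlet=0))
```

With a unit pressure drop, the returned flow equals the effective conductance of the network, which is the quantity a permeametry-style measurement would relate to permeable surface area.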
B. Wang, D. Zou, Q. Gu, S. J. Osher. Laplacian smoothing stochastic gradient Markov chain Monte Carlo, In SIAM Journal on Scientific Computing, Vol. 43, No. 1, SIAM, pp. A26-A53. 2021.
As an important Markov chain Monte Carlo (MCMC) method, the stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling. However, SGLD typically suffers from a slow convergence rate due to its large variance caused by the stochastic gradient. In order to alleviate these drawbacks, we leverage the recently developed Laplacian smoothing technique and propose a Laplacian smoothing stochastic gradient Langevin dynamics (LS-SGLD) algorithm. We prove that for sampling from both log-concave and non-log-concave densities, LS-SGLD achieves strictly smaller discretization error in 2-Wasserstein distance, although its mixing rate can be slightly slower. Experiments on both synthetic and real datasets verify our theoretical results and demonstrate the superior performance of LS-SGLD on different machine learning tasks including posterior …
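For orientation, the kind of update this abstract refers to can be written as a preconditioned Langevin step. The abstract itself does not spell out the formula, so the form below is an assumption based on how Laplacian smoothing is usually defined: $L$ is the one-dimensional discrete Laplacian with periodic boundary conditions, $\sigma \ge 0$ is the smoothing strength, $\eta$ the step size, $\beta$ the inverse temperature, $\tilde{g}$ a stochastic gradient, and $\epsilon_k$ standard Gaussian noise.

```latex
\begin{align}
A_\sigma &= I - \sigma L, \\
\theta_{k+1} &= \theta_k - \eta\, A_\sigma^{-1}\, \tilde{g}(\theta_k)
  + \sqrt{2\eta/\beta}\; A_\sigma^{-1/2}\, \epsilon_k,
  \qquad \epsilon_k \sim \mathcal{N}(0, I).
\end{align}
```

Setting $\sigma = 0$ gives $A_\sigma = I$ and recovers the plain SGLD step; pairing $A_\sigma^{-1}$ on the drift with $A_\sigma^{-1/2}$ on the noise is what keeps the target density stationary for a constant preconditioner.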
Z. Wang, W. Xing, R. Kirby, S. Zhe. Multi-Fidelity High-Order Gaussian Processes for Physical Simulation, In International Conference on Artificial Intelligence and Statistics, PMLR, pp. 847-855. 2021.
The key task of physical simulation is to solve partial differential equations (PDEs) on discretized domains, which is known to be costly. In particular, high-fidelity solutions are much more expensive than low-fidelity ones. To reduce the cost, we consider novel Gaussian process (GP) models that leverage simulation examples of different fidelities to predict high-dimensional PDE solution outputs. Existing GP methods are either not scalable to high-dimensional outputs or lack effective strategies to integrate multi-fidelity examples. To address these issues, we propose Multi-Fidelity High-Order Gaussian Process (MFHoGP) that can capture complex correlations both between the outputs and between the fidelities to enhance solution estimation, and scale to large numbers of outputs. Based on a novel nonlinear coregionalization model, MFHoGP propagates bases throughout fidelities to fuse information, and places a deep matrix GP prior over the basis weights to capture the (nonlinear) relationships across the fidelities. To improve inference efficiency and quality, we use bases decomposition to largely reduce the model parameters, and layer-wise matrix Gaussian posteriors to capture the posterior dependency and to simplify the computation. Our stochastic variational learning algorithm successfully handles millions of outputs without extra sparse approximations. We show the advantages of our method in several typical applications.
The main objective for understanding fluorescence microscopy data is to investigate and evaluate the fluorescent signal intensity distributions as well as their spatial relationships across multiple channels. The quantitative analysis of 3D fluorescence microscopy data needs interactive tools for researchers to select and focus on relevant biological structures. We developed an interactive tool based on volume visualization techniques and GPU computing for streamlining rapid data analysis. Our main contribution is the implementation of common data quantification functions on streamed volumes, providing interactive analyses on large data without lengthy preprocessing. Data segmentation and quantification are coupled with brushing and executed at an interactive speed. A large volume is partitioned into data bricks, and only user-selected structures are analyzed to constrain the computational load. We designed a framework to assemble a sequence of GPU programs to handle brick borders and stitch analysis results. Our tool was developed in collaboration with domain experts and has been used to identify cell types. We demonstrate a workflow to analyze cells in vestibular epithelia of transgenic mice.
In-situ and in-transit processing alleviate the gap between the computing and I/O capabilities by scheduling data analytics close to the data source. Hybrid in-situ processing splits data analytics into two stages: the data processing that runs in-situ aims to extract regions of interest, which are then transferred to staging services for further in-transit analytics. To facilitate this type of hybrid in-situ processing, the data staging service needs to support complex intermediate data representations generated/consumed by the in-situ tasks. Unstructured (or irregular) mesh is one such derived data representation that is typically used and bridges simulation data and analytics. However, how staging services can efficiently support unstructured mesh transfer and processing remains to be explored. This paper investigates design options for transferring and processing unstructured mesh data using staging services. Using polygonal mesh data as an example, we show that staging-based unstructured mesh processing can effectively support hybrid in-situ workflows and can significantly decrease data movement overheads.
In-situ processing alleviates the gap between computation and I/O capabilities by performing data analysis close to the data source. With simulation data varying in size and content during workflow execution, it becomes necessary for in-situ processing to support resource elasticity, i.e., the ability to change resource configurations such as the number of computing nodes/processes during workflow execution. An elastic job may dynamically adjust resource configurations; it may use a few resources at the beginning and more resources towards the end of the job when interesting data appears. However, it is hard to predict a priori how many computing nodes/processes need to be added/removed during the workflow execution to adapt to changing workflow needs. How to efficiently guide elasticity operations, such as growing or shrinking the number of processes used for in-situ analysis during workflow execution, is an open-ended research question. In this paper, we present an adaptive elasticity policy that adopts workflow runtime information collected online to predict how to trigger the addition and removal of processes in order to minimize in-situ processing overheads. We integrate the presented elasticity policy into a staging-based elastic workflow and evaluate its efficiency in multiple elasticity scenarios. The results indicate that an adaptive elasticity policy can save overhead in finding a proper resource configuration, when compared with a static policy that uses a fixed number of processes for each rescaling operation. Finally, we discuss multiple existing research opportunities of elastic in-situ processing from different aspects.
In-situ processing addresses the gap between speeds of computing and I/O capabilities by processing data close to the data source, i.e., on the same system as the data source (e.g., a simulation). However, the effective implementation of in-situ processing workflows requires the optimization of several design parameters, such as where on the system workflow data analysis/visualization (ana/vis) is placed and how execution, interaction, and data exchanges between ana/vis are coordinated. For example, in the case of hybrid in-situ processing, interacting ana/vis may be tightly or loosely coupled depending on their placement, and this can lead to very different performance and scalability. A key challenge is deciding the most appropriate ana/vis placement, which depends on dynamic application, workflow, and system characteristics that might change at runtime. In this paper, we present a framework to support online adaptive data analysis placement during the execution of an in-situ workflow. Specifically, the paper presents a model and architecture, and explores several data analysis placement strategies. Evaluation results show that dynamically choosing appropriate data analysis placement strategies can balance the benefits and overhead of different data analysis placement patterns to reduce in-situ processing time.
W. W. Xing, A. A. Shah, P. Wang, S. Zhe, Q. Fu, R. M. Kirby. Residual Gaussian process: A tractable nonparametric Bayesian emulator for multi-fidelity simulations, In Applied Mathematical Modelling, Vol. 97, Elsevier, pp. 36-56. 2021.
Challenges in multi-fidelity modelling relate to accuracy, uncertainty estimation and high-dimensionality. A novel additive structure is introduced in which the highest fidelity solution is written as a sum of the lowest fidelity solution and residuals between the solutions at successive fidelity levels, with Gaussian process priors placed over the low fidelity solution and each of the residuals. The resulting model is equipped with a closed-form solution for the predictive posterior, making it applicable to advanced, high-dimensional tasks that require uncertainty estimation. Its advantages are demonstrated on univariate benchmarks and on three challenging multivariate problems. It is shown how active learning can be used to enhance the model, especially with a limited computational budget. Furthermore, error bounds are derived for the mean prediction in the univariate case.
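The additive structure described in this abstract can be written out as a short worked equation. The notation is an assumption introduced here for illustration: $f_i$ is the solution at fidelity level $i$ (with $i = 1$ the lowest and $i = F$ the highest), $r_i$ the residual between successive levels, and $k_i$ the kernels; zero prior means and mutually independent priors are also assumed.

```latex
\begin{align}
r_i(x) &= f_i(x) - f_{i-1}(x), \qquad i = 2, \dots, F, \\
f_F(x) &= f_1(x) + \sum_{i=2}^{F} r_i(x), \\
f_1 &\sim \mathcal{GP}(0, k_1), \qquad r_i \sim \mathcal{GP}(0, k_i).
\end{align}
```

Under the independence assumption, each term can be conditioned on its own training data as an ordinary GP regression, so the predictive posterior of $f_F$ is Gaussian with mean equal to the sum of the individual posterior means and variance equal to the sum of the individual posterior variances, which is one way to see why a closed form is available.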