Cells create physical connections with the extracellular environment through adhesions. Nascent adhesions form at the leading edge of migrating cells and either undergo cycles of disassembly and reassembly, or elongate and stabilize at the end of actin fibers. How adhesions assemble has been addressed in several studies, but the exact role of actin fibers in the elongation and stabilization of nascent adhesions remains largely elusive. To address this question, here we extended our computational model of adhesion assembly by incorporating an actin fiber that locally promotes integrin activation. The model revealed that an actin fiber promotes adhesion stabilization and elongation. Actomyosin contractility from the fiber also promotes adhesion stabilization and elongation, by strengthening integrin-ligand interactions, but only up to a force threshold. Above this force threshold, most integrin-ligand bonds fail, and the adhesion disassembles. In the absence of contraction, actin fibers still support adhesion stabilization. Collectively, our results provide a picture in which myosin activity is dispensable for adhesion stabilization and elongation under an actin fiber, offering a framework for interpreting several previous experimental observations.
K.R. Carney, A.M. Khan, S. Stam, S.C. Samson, N. Mittal, S. Han, T.C. Bidone, M. Mendoza. Nascent adhesions shorten the period of lamellipodium protrusion through the Brownian ratchet mechanism, In Mol Biol Cell, 2023.
Directional cell migration is driven by the conversion of oscillating edge motion into lasting periods of leading edge protrusion. Actin polymerization against the membrane and adhesions control edge motion, but the exact mechanisms that determine protrusion period remain elusive. We addressed this by developing a computational model in which polymerization of actin filaments against a deformable membrane and variable adhesion dynamics support edge motion. Consistent with previous reports, our model showed that actin polymerization and adhesion lifetime power protrusion velocity. However, increasing adhesion lifetime decreased the protrusion period. Measurements of adhesion lifetime and edge motion in migrating cells confirmed that adhesion lifetime is associated with and promotes protrusion velocity, but decreased duration. Our model showed that adhesions’ control of protrusion persistence originates from the Brownian ratchet mechanism for actin filament polymerization. With longer adhesion lifetime or increased adhesion density, the proportion of actin filaments tethered to the substrate increased, maintaining filaments against the cell membrane. The reduced filament-membrane distance generated pushing force for high edge velocity, but limited further polymerization needed for protrusion duration. We propose a mechanism for cell edge protrusion in which adhesion strength regulates actin filament polymerization to control the periods of leading edge protrusion.
B. Charoenwong, R.M. Kirby, J. Reiter. Computer Science Abstractions To Help Reason About Decentralized Stablecoin Design, In IEEE Access, IEEE, 2023.
Computer science as a discipline is known for its penchant for using abstractions as a tool for reasoning. It is no surprise that computer science might have something valuable to lend to the world of decentralized stablecoin design, as it is in fact a “computing” problem. In this paper, we examine the possibility of a decentralized and capital-efficient stablecoin using smart contracts that algorithmically trade to maintain stability and study the potential new functionality that smart contracts enable. By exploiting traditional abstractions from computer science, we show that a capital-efficient algorithmic stablecoin cannot be provably stable. Additionally, we provide a formal exposition of the workings of Central Bank Digital Currencies, connecting this to the space of possible stablecoin designs. We then discuss several outstanding conjectures from both academics and practitioners and finally highlight the regulatory similarities between money-market funds and working stablecoins. Our work builds upon the current and growing interplay between the realms of engineering and financial services, and it also demonstrates how ways of thinking as a computer scientist can aid practitioners. We believe this research is vital for understanding and developing the future of financial technology.
H. Dai, M. Penwarden, R.M. Kirby, S. Joshi. Neural Operator Learning for Ultrasound Tomography Inversion, Subtitled arXiv:2304.03297v1, 2023.
Neural operator learning as a means of mapping between complex function spaces has garnered significant attention in the field of computational science and engineering (CS&E). In this paper, we apply neural operator learning to the time-of-flight ultrasound computed tomography (USCT) problem. We learn the mapping between time-of-flight (TOF) data and the heterogeneous sound speed field using a full-wave solver to generate the training data. This novel application of operator learning circumvents the need to solve the computationally intensive iterative inverse problem. The operator learns the non-linear mapping offline and predicts the heterogeneous sound field with a single forward pass through the model. This is the first time operator learning has been used for ultrasound tomography and is the first step in potential real-time predictions of soft tissue distribution for tumor identification in breast imaging.
Modeling the Shape of the Brain Connectome via Deep Neural Networks, In Information Processing in Medical Imaging, Springer Nature Switzerland, pp. 291--302. 2023.
The goal of diffusion-weighted magnetic resonance imaging (DWI) is to infer the structural connectivity of an individual subject's brain in vivo. To statistically study the variability and differences between normal and abnormal brain connectomes, a mathematical model of the neural connections is required. In this paper, we represent the brain connectome as a Riemannian manifold, which allows us to model neural connections as geodesics. This leads to the challenging problem of estimating a Riemannian metric that is compatible with the DWI data, i.e., a metric such that the geodesic curves represent individual fiber tracts of the connectomics. We reduce this problem to that of solving a highly nonlinear set of partial differential equations (PDEs) and study the applicability of convolutional encoder-decoder neural networks (CEDNNs) for solving this geometrically motivated PDE. Our method achieves excellent performance in the alignment of geodesics with white matter pathways and tackles a long-standing issue in previous geodesic tractography methods: the inability to recover crossing fibers with high fidelity. Code is available at https://github.com/aarentai/Metric-Cnn-3D-IPMI.
D. Dai, Y. Epshteyn, A. Narayan. Energy Stable and Structure-Preserving Schemes for the Stochastic Galerkin Shallow Water Equations, Subtitled arXiv:2310.06229, 2023.
The shallow water flow model is widely used to describe water flows in rivers, lakes, and coastal areas. Accounting for uncertainty in the corresponding transport-dominated non-linear PDE models presents theoretical and numerical challenges that motivate the central advances of this paper. Starting with a spatially one-dimensional hyperbolicity-preserving, positivity-preserving stochastic Galerkin formulation of the parametric/uncertain shallow water equations, we derive an entropy-entropy flux pair for the system. We exploit this entropy-entropy flux pair to construct structure-preserving second-order energy conservative, and first- and second-order energy stable finite volume schemes for the stochastic Galerkin shallow water system. The performance of the methods is illustrated on several numerical experiments.
Objective: This study aims to characterize dose variations from the original plan for a cohort of patients with head-and-neck cancer (HNC) using high-quality CT on rails (CTOR) datasets and evaluate a predictive model for identifying patients needing replanning.
Materials and Methods: In total, 74 patients with HNC treated on our CTOR-equipped machine were evaluated in this retrospective study. Patients were treated at our facility using in-room, CTOR image guidance—acquiring CTOR kV fan beam CT images on a weekly to near-daily basis. For each patient, a particular day’s delivered treatment dose was calculated by applying the approved, planned beam set to the post image-guided alignment CT image of the day. Total accumulated delivered dose distributions were calculated and compared with the planned dose distribution, and differences were characterized by comparison of dose and biological response statistics.
Results: The majority of patients in the study saw excellent agreement between planned and delivered dose distribution in targets—the mean deviations of dose received by 95% and 98% of the planning target volumes of the cohort are −0.7% and −1.3%, respectively. In critical organs, we saw a +6.5% mean deviation of mean dose in the parotid glands, −2.3% mean deviation of maximum dose in the brainstem, and +0.7% mean deviation of maximum dose in the spinal cord. Of 74 patients, 10 experienced nontrivial variation of delivered parotid dose, which resulted in a normal tissue complication probability (NTCP) increase compared with the anticipated NTCP in the original plan, ranging from 11% to 44%.
Conclusion: We determined that a midcourse evaluation of dose deviation was not effective in predicting the need for replanning for our patient cohorts. The observed nontrivial dose difference to parotid gland delivered dose suggests that even when rigorous, high-quality image guidance is performed, clinically concerning variations to predicted dose delivery can still occur.
Y. Ding, J. Wilburn, H. Shrestha, A. Ndlovu, K. Gadhave, C. Nobre, A. Lex, L. Harrison. reVISit: Supporting Scalable Evaluation of Interactive Visualizations, Subtitled OSF Preprints, 2023.
reVISit is an open-source software toolkit and framework for creating, deploying, and monitoring empirical visualization studies. Running a quality empirical study in visualization can be demanding and resource-intensive, requiring substantial time, cost, and technical expertise from the research team. These challenges are amplified as research norms trend towards more complex and rigorous study methodologies, alongside a growing need to evaluate more complex interactive visualizations. reVISit aims to ameliorate these challenges by introducing a domain-specific language for study set-up, and a series of software components, such as UI elements, behavior provenance, and an experiment monitoring and management interface. Together with interactive or static stimuli provided by the experimenter, these are compiled to a ready-to-deploy web-based experiment. We demonstrate reVISit's functionality by re-implementing two studies – a graphical perception task and a more complex, interactive study. reVISit is an open-source community project, available at https://revisit.dev/
S. Dubey, T. Kataria, B. Knudsen, S.Y. Elhabian. Structural Cycle GAN for Virtual Immunohistochemistry Staining of Gland Markers in the Colon, Subtitled arXiv:2308.13182, 2023.
With the advent of digital scanners and deep learning, diagnostic operations may move from a microscope to a desktop. Hematoxylin and Eosin (H&E) staining is one of the most frequently used stains for disease analysis, diagnosis, and grading, but pathologists do need different immunohistochemical (IHC) stains to analyze specific structures or cells. Obtaining all of these stains (H&E and different IHCs) on a single specimen is a tedious and time-consuming task. Consequently, virtual staining has emerged as an essential research direction. Here, we propose a novel generative model, Structural Cycle-GAN (SC-GAN), for synthesizing IHC stains from H&E images, and vice versa. Our method expressly incorporates structural information in the form of edges (in addition to color data) and employs attention modules exclusively in the decoder of the proposed generator model. This integration enhances feature localization and preserves contextual information during the generation process. In addition, a structural loss is incorporated to ensure accurate structure alignment between the generated and input markers. To demonstrate the efficacy of the proposed model, experiments are conducted with two IHC markers emphasizing distinct structures of glands in the colon: the nucleus of epithelial cells (CDX2) and the cytoplasm (CK818). Quantitative metrics such as FID and SSIM are frequently used for the analysis of generative models, but they do not correlate explicitly with higher-quality virtual staining results. Therefore, we propose two new quantitative metrics that correlate directly with the virtual staining specificity of IHC markers.
In structural biology, validation and verification of new atomic models are crucial and necessary steps which limit the production of reliable molecular models for publications and databases. An atomic model is the result of meticulous modeling and matching and is evaluated using a variety of metrics that provide clues to improve and refine the model so it fits our understanding of molecules and physical constraints. In cryo electron microscopy (cryo-EM) the validation is also part of an iterative modeling process in which there is a need to judge the quality of the model during the creation phase. A shortcoming is that the process and results of the validation are rarely communicated using visual metaphors. This work presents a visual framework for molecular validation. The framework was developed in close collaboration with domain experts in a participatory design process. Its core is a novel visual representation based on 2D heatmaps that shows all available validation metrics in a linear fashion, presenting a global overview of the atomic model and providing domain experts with interactive analysis tools. Additional information stemming from the underlying data, such as a variety of local quality measures, is used to guide the user's attention toward regions of higher relevance. Linked with the heatmap is a three-dimensional molecular visualization providing the spatial context of the structures and chosen metrics. Additional views of statistical properties of the structure are included in the visual framework. We demonstrate the utility of the framework and its visual guidance with examples from cryo-EM.
S. Fang, S. Zhe, H.M. Lin, A.A. Azad, H. Fettke, E.M. Kwan, L. Horvath, B. Mak, T. Zheng, P. Du, S. Jia, R.M. Kirby, M. Kohli. Multi-Omic Integration of Blood-Based Tumor-Associated Genomic and Lipidomic Profiles Using Machine Learning Models in Metastatic Prostate Cancer, In Clinical Cancer Informatics, 2023.
To determine prognostic and predictive clinical outcomes in metastatic hormone-sensitive prostate cancer (mHSPC) and metastatic castrate-resistant prostate cancer (mCRPC) on the basis of a combination of plasma-derived genomic alterations and lipid features in a longitudinal cohort of patients with advanced prostate cancer.
A multifeature classifier was constructed to predict clinical outcomes using plasma-based genomic alterations detected in 120 genes and 772 lipidomic species as informative features in a cohort of 71 patients with mHSPC and 144 patients with mCRPC. Outcomes of interest were collected over 11 years of follow-up. These included in mHSPC state early failure of androgen-deprivation therapy (ADT) and exceptional responders to ADT; early death (poor prognosis) and long-term survivors in mCRPC state. The approach was to build binary classification models that identified discriminative candidates with optimal weights to predict outcomes. To achieve this, we built multi-omic feature-based classifiers using traditional machine learning (ML) methods, including logistic regression with sparse regularization, multi-kernel Gaussian process regression, and support vector machines.
The levels of specific ceramides (d18:1/14:0 and d18:1/17:0), and the presence of CHEK2 mutations, AR amplification, and RB1 deletion were identified as the most crucial factors associated with clinical outcomes. Using ML models, the optimal multi-omics feature combination determined resulted in AUC scores of 0.751 for predicting mHSPC survival and 0.638 for predicting ADT failure; and in mCRPC state, 0.687 for prognostication and 0.727 for exceptional survival. The models were observed to be superior to using a limited candidate number of features for developing multi-omic prognostic and predictive signatures.
Using a ML approach that incorporates multiple omic features improves the prediction accuracy for metastatic prostate cancer outcomes significantly. Validation of these models will be needed in independent data sets in future.
S. Fang, X. Yu, S. Li, Z. Wang, R. Kirby, S. Zhe. Streaming Factor Trajectory Learning for Temporal Tensor Decomposition, Subtitled arxiv.org/abs/2310.17021, 2023.
Practical tensor data often comes with time information. Most existing temporal decomposition approaches estimate a set of fixed factors for the objects in each tensor mode, and hence cannot capture the temporal evolution of the objects' representation. More importantly, we lack an effective approach to capture such evolution from streaming data, which is common in real-world applications. To address these issues, we propose Streaming Factor Trajectory Learning (SFTL) for temporal tensor decomposition. We use Gaussian processes (GPs) to model the trajectory of factors so as to flexibly estimate their temporal evolution. To address the computational challenges in handling streaming data, we convert the GPs into a state-space prior by constructing an equivalent stochastic differential equation (SDE). We develop an efficient online filtering algorithm to estimate a decoupled running posterior of the involved factor states upon receiving new data. The decoupled estimation enables us to conduct standard Rauch-Tung-Striebel smoothing to compute the full posterior of all the trajectories in parallel, without the need for revisiting any previous data. We have shown the advantage of SFTL in both synthetic tasks and real-world applications.
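The GP-to-state-space conversion mentioned in this abstract can be illustrated with a small Python sketch. It is not the authors' SFTL algorithm: it assumes the simplest case, a scalar trajectory with a Matérn-1/2 (Ornstein-Uhlenbeck) kernel, and the function names, hyperparameters, and synthetic data are hypothetical. The sketch runs a Kalman filter over the data in a single pass, applies Rauch-Tung-Striebel smoothing, and compares the result with ordinary batch GP regression.

```python
import numpy as np

def ou_kalman_rts(t, y, ell, sigma2, noise_var):
    """Filtered and smoothed posterior means for an OU (Matern-1/2) GP.

    State-space form: x_{k+1} = a_k x_k + q_k with a_k = exp(-dt_k / ell)
    and Var(q_k) = sigma2 * (1 - a_k**2); observations y_k = x_k + noise.
    """
    n = len(t)
    m_pred, P_pred = np.zeros(n), np.zeros(n)
    m_filt, P_filt = np.zeros(n), np.zeros(n)
    for k in range(n):  # forward pass: each observation is visited once
        if k == 0:
            m_pred[k], P_pred[k] = 0.0, sigma2  # stationary prior
        else:
            a = np.exp(-(t[k] - t[k - 1]) / ell)
            m_pred[k] = a * m_filt[k - 1]
            P_pred[k] = a**2 * P_filt[k - 1] + sigma2 * (1.0 - a**2)
        gain = P_pred[k] / (P_pred[k] + noise_var)
        m_filt[k] = m_pred[k] + gain * (y[k] - m_pred[k])
        P_filt[k] = (1.0 - gain) * P_pred[k]
    m_smooth = m_filt.copy()
    for k in range(n - 2, -1, -1):  # backward RTS pass uses stored moments only
        a = np.exp(-(t[k + 1] - t[k]) / ell)
        G = P_filt[k] * a / P_pred[k + 1]
        m_smooth[k] = m_filt[k] + G * (m_smooth[k + 1] - m_pred[k + 1])
    return m_filt, m_smooth

def gp_posterior_mean(t, y, ell, sigma2, noise_var):
    """Batch GP regression mean at the training inputs, for comparison."""
    K = sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / ell)
    return K @ np.linalg.solve(K + noise_var * np.eye(len(t)), y)

# Hypothetical synthetic data: a noisy sine sampled at irregular times.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 5.0, size=40))
y = np.sin(2.0 * t) + 0.1 * rng.standard_normal(40)

m_filt, m_smooth = ou_kalman_rts(t, y, ell=1.0, sigma2=1.0, noise_var=0.01)
m_gp = gp_posterior_mean(t, y, ell=1.0, sigma2=1.0, noise_var=0.01)
print(np.max(np.abs(m_smooth - m_gp)))
```

In this toy setting the smoothed means agree with the batch GP posterior up to floating-point error, while the filter costs linear rather than cubic time in the number of observations.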
S. Fang, M. Cooley, D. Long, S. Li, R. Kirby, S. Zhe. Solving High Frequency and Multi-Scale PDEs with Gaussian Processes, Subtitled arXiv:2311.04465, 2023.
Machine learning based solvers have garnered much attention in physical simulation and scientific computing, a prominent example being physics-informed neural networks (PINNs). However, PINNs often struggle to solve high-frequency and multi-scale PDEs, which can be due to the spectral bias during neural network training. To address this problem, we resort to the Gaussian process (GP) framework. To flexibly capture the dominant frequencies, we model the power spectrum of the PDE solution with a Student t mixture or Gaussian mixture. We then apply the inverse Fourier transform to obtain the covariance function (according to the Wiener-Khinchin theorem). The covariance derived from the Gaussian mixture spectrum corresponds to the known spectral mixture kernel. We are the first to discover its rationale and effectiveness for PDE solving. Next, we estimate the mixture weights in the log domain, which we show is equivalent to placing a Jeffreys prior. It automatically induces sparsity, prunes excessive frequencies, and adjusts the remaining toward the ground truth. Third, to enable efficient and scalable computation on massive collocation points, which are critical to capture high frequencies, we place the collocation points on a grid, and multiply our covariance function at each input dimension. We use the GP conditional mean to predict the solution and its derivatives so as to fit the boundary condition and the equation itself. As a result, we can derive a Kronecker product structure in the covariance matrix. We use Kronecker product properties and multilinear algebra to greatly promote computational efficiency and scalability, without any low-rank approximations. We show the advantage of our method in systematic experiments.
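As a rough illustration of the Wiener-Khinchin step described in this abstract, the following Python sketch (not the authors' code) evaluates the one-dimensional spectral mixture kernel in closed form and checks it against a numerical inverse Fourier transform of a symmetrized Gaussian-mixture power spectrum. The function names and the mixture weights, frequencies, and bandwidths are hypothetical values chosen only for the demonstration.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, stds):
    """Closed form: sum_q w_q * exp(-2 pi^2 tau^2 s_q^2) * cos(2 pi mu_q tau)."""
    tau = np.asarray(tau, dtype=float)[:, None]
    envelope = np.exp(-2.0 * np.pi**2 * tau**2 * stds**2)
    return np.sum(weights * envelope * np.cos(2.0 * np.pi * means * tau), axis=1)

def kernel_by_inverse_fourier(tau, weights, means, stds, s_max=60.0, n=20001):
    """Numerically integrate S(s) * cos(2 pi s tau) over a frequency grid."""
    s = np.linspace(-s_max, s_max, n)
    ds = s[1] - s[0]

    def gauss(center, std):
        return np.exp(-0.5 * ((s - center) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

    # Symmetrize each component so the spectrum is even and the kernel is real.
    spectrum = sum(
        0.5 * w * (gauss(m, sd) + gauss(-m, sd))
        for w, m, sd in zip(weights, means, stds)
    )
    tau = np.asarray(tau, dtype=float)[:, None]
    # Plain Riemann sum; the spectrum is negligible at the grid boundaries.
    return np.sum(spectrum * np.cos(2.0 * np.pi * s * tau), axis=1) * ds

# Hypothetical mixture: one low-frequency and one high-frequency component.
weights = np.array([1.0, 0.5])
means = np.array([2.0, 10.0])   # dominant frequencies
stds = np.array([0.5, 1.0])     # spectral bandwidths

tau = np.linspace(0.0, 1.0, 50)
k_closed = spectral_mixture_kernel(tau, weights, means, stds)
k_numeric = kernel_by_inverse_fourier(tau, weights, means, stds)
print(np.max(np.abs(k_closed - k_numeric)))
```

At zero lag the kernel equals the sum of the mixture weights, which is why shrinking a weight toward zero effectively prunes that frequency from the covariance.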
M. Hall, G. Gopalakrishnan, E. Eide, J. Cohoon, J. Phillips, M. Zhang, S. Elhabian, A. Bhaskara, H. Dam, A. Yadrov, T. Kataria. An NSF REU Site Based on Trust and Reproducibility of Intelligent Computation: Experience Report, In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 2023.
This paper presents an overview of an NSF Research Experiences for Undergraduates (REU) Site on Trust and Reproducibility of Intelligent Computation, delivered by faculty and graduate students in the Kahlert School of Computing at the University of Utah. The chosen themes bring together several concerns for the future in producing computational results that can be trusted: secure, reproducible, based on sound algorithmic foundations, and developed in the context of ethical considerations. The research areas represented by student projects include machine learning, high-performance computing, algorithms and applications, computer security, data science, and human-centered computing. In the first four weeks of the program, the entire student cohort spent their mornings in lessons from experts in these crosscutting topics, and used one-of-a-kind research platforms operated by the University of Utah, namely NSF-funded CloudLab and POWDER facilities; reading assignments, quizzes, and hands-on exercises reinforced the lessons. In the subsequent five weeks, lectures were less frequent, as students branched into small groups to develop their research projects. The final week focused on a poster presentation and final report. Through describing our experiences, this program can serve as a model for preparing a future workforce to integrate machine learning into trustworthy and reproducible applications.
R. Han, A. Narayan, Y. Xu. An approximate control variates approach to multifidelity distribution estimation, Subtitled arXiv:2303.06422v1, 2023.
Forward simulation-based uncertainty quantification that studies the output distribution of quantities of interest (QoI) is a crucial component for computationally robust statistics and engineering. There is a large body of literature devoted to accurately assessing statistics of QoI, and in particular, multilevel or multifidelity approaches are known to be effective, leveraging cost-accuracy tradeoffs between a given ensemble of models. However, effective algorithms that can estimate the full distribution of outputs are still under active development. In this paper, we introduce a general multifidelity framework for estimating the cumulative distribution functions (CDFs) of vector-valued QoI associated with a high-fidelity model under a budget constraint. Given a family of appropriate control variates obtained from lower fidelity surrogates, our framework involves identifying the most cost-effective model subset and then using it to build an approximate control variates estimator for the target CDF. We instantiate the framework by constructing a family of control variates using intermediate linear approximators and rigorously analyze the corresponding algorithm. Our analysis reveals that the resulting CDF estimator is uniformly consistent and budget-asymptotically optimal, with only mild moment and regularity assumptions. The approach provides a robust multifidelity CDF estimator that is adaptive to the available budget, does not require a priori knowledge of cross-model statistics or model hierarchy, and is applicable to general output dimensions. We demonstrate the efficiency and robustness of the approach using several test examples.
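The control-variates idea behind this abstract can be sketched in a few lines of Python. This is a generic single-surrogate approximate control variates estimator for one CDF value, not the paper's algorithm: it omits model selection and the linear approximators, and the function name, toy models, and sample sizes are all hypothetical.

```python
import numpy as np

def acv_cdf_estimate(y_hi, x_lo_paired, x_lo_extra, t):
    """Estimate P(Y <= t) for a high-fidelity output Y.

    y_hi        : few expensive high-fidelity samples
    x_lo_paired : low-fidelity outputs at the same inputs as y_hi
    x_lo_extra  : many additional cheap low-fidelity outputs
    """
    ind_hi = (np.asarray(y_hi) <= t).astype(float)
    ind_lo = (np.asarray(x_lo_paired) <= t).astype(float)
    var_lo = ind_lo.var(ddof=1)
    # Control-variate weight estimated from the paired samples.
    beta = 0.0 if var_lo == 0.0 else np.cov(ind_hi, ind_lo)[0, 1] / var_lo
    # The low-fidelity CDF is unknown, so approximate it with all cheap samples.
    lo_all = np.concatenate([x_lo_paired, x_lo_extra])
    mean_lo_all = np.mean(lo_all <= t)
    return ind_hi.mean() - beta * (ind_lo.mean() - mean_lo_all)

# Toy models (assumed): Y is standard normal, the surrogate adds noise to Y.
rng = np.random.default_rng(1)
n_hi, n_extra = 500, 20000
z = rng.standard_normal(n_hi + n_extra)
x_lo = z + 0.3 * rng.standard_normal(n_hi + n_extra)

y_hi = z[:n_hi]
x_paired, x_extra = x_lo[:n_hi], x_lo[n_hi:]

estimate = acv_cdf_estimate(y_hi, x_paired, x_extra, t=0.0)
plain = np.mean(y_hi <= 0.0)  # ordinary Monte Carlo with the same budget of Y
print(estimate, plain)
```

The correction term vanishes when no extra low-fidelity samples are available, so the estimator falls back to plain Monte Carlo; its benefit grows with the correlation between the two indicator variables and with the number of cheap samples.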
Metabolic networks are interconnected and influence diverse cellular processes. The protein-metabolite interactions that mediate these networks are frequently low affinity and challenging to systematically discover. We developed mass spectrometry integrated with equilibrium dialysis for the discovery of allostery systematically (MIDAS) to identify such interactions. Analysis of 33 enzymes from human carbohydrate metabolism identified 830 protein-metabolite interactions, including known regulators, substrates, and products as well as previously unreported interactions. We functionally validated a subset of interactions, including the isoform-specific inhibition of lactate dehydrogenase by long-chain acyl–coenzyme A. Cell treatment with fatty acids caused a loss of pyruvate-lactate interconversion dependent on lactate dehydrogenase isoform expression. These protein-metabolite interactions may contribute to the dynamic, tissue-specific metabolic flexibility that enables growth and survival in an ever-changing nutrient environment.
Understanding how metabolic state influences cellular processes requires systematic analysis of low-affinity interactions of metabolites with proteins. Hicks et al. describe a method called MIDAS (mass spectrometry integrated with equilibrium dialysis for the discovery of allostery systematically), which allowed them to probe such interactions for 33 enzymes of human carbohydrate metabolism and more than 400 metabolites. The authors detected many known and many new interactions, including regulation of lactate dehydrogenase by ATP and long-chain acyl coenzyme A, which may help to explain known physiological relations between fat and carbohydrate metabolism in different tissues. —LBR
A mass spectrometry and dialysis method detects metabolite-protein interactions that help to control physiology.
Scientific simulations and observations using particles have been creating large datasets that require effective and efficient data reduction to store, transfer, and analyze. However, current approaches either compress only small data well while being inefficient for large data, or handle large data but with insufficient compression. Toward effective and scalable compression/decompression of particle positions, we introduce new kinds of particle hierarchies and corresponding traversal orders that quickly reduce reconstruction error while being fast and low in memory footprint. Our solution to compression of large-scale particle data is a flexible block-based hierarchy that supports progressive, random-access, and error-driven decoding, where error estimation heuristics can be supplied by the user. For low-level node encoding, we introduce new schemes that effectively compress both uniform and densely structured particle distributions.
J. K. Holmen, V. G. Vergara Larrea, E. W. Draeger, E. T. Phipps, P. J. Smith, M. Berzins, S. T. Smith, J. N. Thornock, S. Parete-Koon. Strengthening the US Department of Energy's Recruitment Pipeline: The DOE/NNSA Predictive Science Academic Alliance Program (PSAAP) Experience, In Practice and Experience in Advanced Research Computing, ACM, pp. 137--144. 2023.
The US Department of Energy (DOE) oversees a system of 17 national laboratories responsible for developing unique scientific capabilities beyond the scope of academic and industrial institutions. These labs strive to keep America at the forefront of discovery and are home to some of the Nation’s best minds and the world’s best scientific and research facilities. Collaborations between national laboratories and academic institutions are critical to develop and recruit talent for the DOE workforce. Academia’s cooperative education model poses challenges for DOE recruitment pipelines centered around traditional internships. This paper discusses a promising DOE recruitment pipeline, the National Nuclear Security Administration’s (NNSA) Predictive Science Academic Alliance Program (PSAAP) initiative. As a part of this, experiences capturing the successes and challenges faced by the University of Utah’s Carbon Capture Multidisciplinary Simulation Center (CCMSC) through their participation in the PSAAP-II initiative are shared. These experiences demonstrate the success of Utah’s PSAAP center as a recruitment pipeline with approximately 43% of CCMSC students going to a national laboratory after graduation. Potential opportunities to strengthen the DOE’s recruitment pipeline are also discussed.
In this study, a systematic review and meta-analysis were conducted to identify, categorize, and investigate the effectiveness of passive cooling strategies (PCSs) for residential buildings. Forty-two studies published between 2000 and 2021 were reviewed; they examined the effects of PCSs on indoor temperature decrease, cooling load reduction, energy savings, and thermal comfort hour extension. In total, 30 passive strategies were identified and classified into three categories: design approach, building envelope, and passive cooling system. The review found that using various passive strategies can achieve, on average, (i) an indoor temperature decrease of 2.2 °C, (ii) a cooling load reduction of 31%, (iii) energy savings of 29%, and (iv) a thermal comfort hour extension of 23%. Moreover, the five most effective passive strategies were identified as well as the differences between hot and dry climates and hot and humid climates.
K. Iyer, S. Elhabian. Mesh2SSM: From Surface Meshes to Statistical Shape Models of Anatomy, Subtitled arXiv:2305.07805, 2023.
Statistical shape modeling is the computational process of discovering significant shape parameters from segmented anatomies captured by medical images (such as MRI and CT scans), which can fully describe subject-specific anatomy in the context of a population. The presence of substantial non-linear variability in human anatomy often makes the traditional shape modeling process challenging. Deep learning techniques can learn complex non-linear representations of shapes and generate statistical shape models that are more faithful to the underlying population-level variability. However, existing deep learning models still have limitations and require established/optimized shape models for training. We propose Mesh2SSM, a new approach that leverages unsupervised, permutation-invariant representation learning to estimate how to deform a template point cloud to subject-specific meshes, forming a correspondence-based shape model. Mesh2SSM can also learn a population-specific template, reducing any bias due to template selection. The proposed method operates directly on meshes and is computationally efficient, making it an attractive alternative to traditional and deep learning-based SSM approaches.