This seminar introduces a set of innovative techniques for reducing the time required to complete distributed scientific workflows, with the goal of achieving response times in milliseconds. Current near-real-time experiments fall short of the stringent DOE requirements for Integrated Research Infrastructure (IRI) outlined in recent reports. The proposed solution leverages data management strategies that exploit high-performance computing (HPC) facilities and advanced storage technologies while addressing challenges in cost and efficiency.
Key components include an object-based data management approach, enabling users to define semantically meaningful data objects and facilitating efficient data operations across wide-area networks. The distributed object store could orchestrate in-memory objects, reducing data management and transfer times while enabling automatic parallelization of analyses on HPC resources. The impact of this work extends to diverse scientific domains, including mineral resource surveys, light source management, and material synthesis control, enabling seamless distributed data processing and reducing dependency on custom computing resources.

Posted by: Kate Craven