 |
Collaborative Monitoring and Analysis for Simulation Scientists. R. Tchoua, S. Klasky, N. Podhorszki, B. Grimm, A. Khan, E. Santos, C.T. Silva, P. Mouallem, M. Vouk. In Proceedings of The 2010 International Symposium on Collaborative Technologies and Systems (CTS 2010), pp. (accepted). 2010.
Collaboratively monitoring and analyzing large-scale simulations from petascale computers is an important area of research and development within the scientific community. This paper addresses these issues when teams of colleagues from different research areas work together to help understand the complex data generated from these simulations. In particular, we address the issues when geographically diverse teams of disparate researchers work together to understand the complex science being simulated on high performance computers. Most application scientists want to focus on the science and spend a minimum amount of time learning new tools or adopting new techniques to monitor and analyze their simulation data. The challenge of eSimMon, our web-based system, is to decrease or eliminate some of the hurdles on the scientists’ path to scientific discovery, and allow these collaborations to flourish.
Full Publication |
|
 |
A First Study on Strategies for Generating Workflow Snippets. T. Ellkvist, L. Stromback, L. Lins, J. Freire. In Proceedings of the ACM SIGMOD International Workshop on Keyword Search on Structured Data (KEYS), pp. 15--20. 2009. ISBN: 978-1-60558-570-3
Workflows are increasingly being used to specify computational tasks, from simulations and data analysis to the creation of Web mashups. Recently, a number of public workflow repositories have become available, for example, myExperiment for scientific workflows, and Yahoo! Pipes. Workflow collections are also commonplace in many scientific projects. Having such collections opens up new opportunities for knowledge sharing and re-use. But for this to become a reality, mechanisms are needed that help users explore these collections and locate useful workflows. Although there has been work on querying workflows, not much attention has been given to presenting query results. In this paper, we take a first look at the requirements for workflow snippets and study alternative techniques for deriving concise, yet informative snippets.
Full Publication |
|
 |
Using Mediation to Achieve Provenance Interoperability. T. Ellkvist, D. Koop, J. Freire, C.T. Silva, L. Stromback. In Proceedings of the IEEE International Workshop on Scientific Workflows, 2009, pp. 291--298. 2009. ISBN: 978-0-7695-3708-5
Provenance is essential in scientific experiments. It contains information that is key to preserving data, to determining its quality and authorship, and to reproducing as well as validating the results. In complex experiments and analyses, where multiple tools are used to derive data products, provenance captured by these tools must be combined in order to determine the complete lineage of the derived products. In this paper, we describe a mediator-based architecture for integrating provenance information from multiple sources. This architecture contains two key components: a global mediated schema that is general and capable of representing provenance information expressed in different models; and a new system-independent query API that is general and able to express complex queries over provenance information from different sources. We also present a case study where we show how this model was applied to integrate provenance from three provenance-enabled systems and discuss the issues involved in this integration process.
Full Publication |
|
|
Provenance Management: Challenges and Opportunities. J. Freire. In Datenbanksysteme in Business, Technologie und Web (BTW), pp. 4. 2009.
Computing has been an enormous accelerator to science and industry alike, and it has led to an information explosion in many different fields. The unprecedented volume of data acquired from sensors, derived by simulations and data analysis processes, accumulated in warehouses, and often shared on the Web, has given rise to a new field of research: provenance management. Provenance (also referred to as audit trail, lineage, and pedigree) captures information about the steps used to generate a given data product. Such information provides important documentation that is key to preserving data, to determining the data’s quality and authorship, and to understanding, reproducing, and validating results. Provenance solutions are needed in many different domains and applications, from environmental science and physics simulations, to business processes and data integration in warehouses. In this talk, we survey recent research results and outline challenges involved in building provenance management systems. We also discuss emerging applications that are enabled by provenance and outline open problems and new directions for database-related research.
|
|
 |
Using Workflow Medleys to Streamline Exploratory Tasks. E. Santos, D. Koop, H.T. Vo, E. Anderson, J. Freire, C.T. Silva. In 21st International Conference on Scientific and Statistical Database Management (SSDBM), pp. 292--301. 2009.
To analyze and understand the growing wealth of scientific data, complex workflows need to be assembled, often requiring the combination of loosely coupled resources, specialized libraries, distributed computing infrastructure, and Web services. However, constructing these workflows is a non-trivial task, especially for users who do not have programming expertise. This problem is compounded for exploratory tasks, where the workflows need to be iteratively refined. In this paper, we introduce workflow medleys, a new approach for manipulating collections of workflows. We propose a workflow manipulation language that includes operations that are common in exploratory tasks and present a visual interface designed for this language. We briefly discuss how medleys have been applied in two (real) applications.
Full Publication |
|
|
User-Driven Application Development. E. Santos, L. Lins, J. Ahrens, J. Freire, C.T. Silva. In IEEE Transactions on Visualization and Computer Graphics, Proceedings of the 2009 IEEE Visualization Conference. Sept/Oct, 2009.
Visualization is essential for understanding the increasing volumes of digital data. However, the process required to create insightful visualizations is involved and time consuming. Although several visualization tools are available, including tools with sophisticated visual interfaces, they are out of reach for users who have little or no knowledge of visualization techniques and/or who do not have programming expertise. In this paper, we propose VISMASHUP, a new framework for streamlining the creation of customized visualization applications. Because these applications can be customized for very specific tasks, they can hide much of the complexity in a visualization specification and make it easier for users to explore visualizations by manipulating a small set of parameters. We describe the framework and how it supports the various tasks a designer needs to carry out to develop an application, from mining and exploring a set of visualization specifications (pipelines), to the creation of simplified views of the pipelines, and the automatic generation of the application and its interface. We also describe the implementation of the system and demonstrate its use in two real application scenarios. |
|
|
|