About

A Meta-Workflow Cyber-infrastructure System Designed for Environmental Observatories.

Peter Bajcsy, Rob Kooper, Luigi Marini, Barbara Minsker and Jim Myers

Technical Report NCSA-ISDA05-001 (ISDA01-2005), December 30, 2005

The purpose of this white paper is to outline computer science issues related to the Task entitled "Create end-to-end meta-workflow demonstration", which is one part of the NCSA Environmental Cyber-Infrastructure Development (ECID) effort. Our goal is to research and develop meta-workflow architectures that support a set of environmental science and hydrology demonstrations in the short term and a spectrum of application communities in the long term. From the NCSA institutional viewpoint, this white paper documents our design phase and provides an overview of meta-workflow definitions, previous work on workflows, a set of requirements, the proposed meta-workflow architecture, and the current features of the prototype meta-workflow implementation called CyberIntegrator.

From the computer science viewpoint, the paper presents the problem of designing a highly interactive scientific meta-workflow system that aims at building complex problem-solving environments from heterogeneous tools. Driven by systems-science use cases and complex informatics problems, we identify the dimensions along which current workflow technologies must grow to become a robust cyber-infrastructure capable of scaling to meet national needs. An obvious need is the ability to join workflows built from modules of the multiple open-source and commercial workflow systems in use in various sub-disciplines. Less obvious but also critically important are the abilities to describe and share workflow fragments, to execute portions of workflows on different appropriate hosts, and to provide security, provenance, and fault-tolerance features for software execution. We introduce the term meta-workflow to refer to workflow systems designed to meet these end-to-end needs. We then discuss the architecture and implementation of a meta-workflow prototype called CyberIntegrator, developed at NCSA.

Our current meta-workflow architecture enables users (1) to browse registries of data, tools, and computational resources, (2) to create meta-workflows by example or for batch processing, (3) to re-use and re-purpose meta-workflows, (4) to execute meta-workflows locally or remotely, and (5) to incorporate heterogeneous tools and link them transparently. The contributions of our work are (a) defining the meta-workflow concept with a focus on science requirements and (b) architecting the technology and prototyping the CyberIntegrator software to support environmental observatories and other applications.
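
To make the listed capabilities concrete, the following is a minimal illustrative sketch in Python of a tool registry and a re-usable chain of steps. It is not the CyberIntegrator API: the names Tool, Registry, and MetaWorkflow, the host label, and the two example tools are all hypothetical and were chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    """A tool wrapped behind a uniform call interface (hypothetical)."""
    name: str
    run: Callable[[Any], Any]
    host: str = "local"  # intended execution host; only a label in this sketch


@dataclass
class Registry:
    """A browsable catalogue of tools, keyed by name (hypothetical)."""
    tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def browse(self) -> List[str]:
        # Return the registered tool names in alphabetical order.
        return sorted(self.tools)


@dataclass
class MetaWorkflow:
    """An ordered chain of tool names that can be replayed on new data."""
    steps: List[str] = field(default_factory=list)

    def execute(self, registry: Registry, data: Any) -> Any:
        # Feed the output of each step into the next one.
        for step in self.steps:
            data = registry.tools[step].run(data)
        return data


# Two stand-in tools; real tools could come from different workflow systems.
registry = Registry()
registry.register(Tool("drop_missing", lambda xs: [x for x in xs if x is not None]))
registry.register(Tool("mean", lambda xs: sum(xs) / len(xs), host="remote-cluster"))

workflow = MetaWorkflow(steps=["drop_missing", "mean"])
result = workflow.execute(registry, [2.0, None, 4.0])
print(registry.browse(), result)
```

In this sketch the same workflow object can be executed again on a different input, which loosely mirrors re-use and re-purposing; remote execution, security, provenance, and fault tolerance are deliberately left out.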