Environmental Cyber Infrastructure Demonstration Project: A Meta-Workflow Cyber-infrastructure System Designed for Environmental Observatories

National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign (UIUC)
1205 W. Clark St, Room 1008, MC-257, Urbana, IL 61801

Introduction

This work addresses the design of a highly interactive scientific meta-workflow system for building complex problem-solving environments from heterogeneous tools. Driven by systems-science use cases and complex informatics problems, we identify the dimensions along which current workflow technologies must grow to become a robust cyber-infrastructure capable of scaling to meet national needs. An obvious need is the ability to join workflows built from modules of the multiple open-source and commercial workflow systems used in various sub-disciplines. Less obvious but equally important are the abilities to describe and share workflow fragments, to execute portions of workflows on different appropriate hosts, and to provide security, provenance and fault-tolerance features for software execution. We introduce the term meta-workflow to refer to workflow systems designed to meet these end-to-end needs.

An implementation of a meta-workflow prototype called CyberIntegrator, developed at NCSA, can be accessed from this web page. Our current meta-workflow architecture enables users
(1) to browse registries of data, tools and computational resources,
(2) to create meta-workflows by example or for batch processing,
(3) to re-use and re-purpose meta-workflows,
(4) to execute meta-workflows locally or remotely, and
(5) to incorporate heterogeneous tools and link them transparently.
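The idea behind items (1), (4) and (5) can be illustrated with a minimal sketch of a registry that maps one tool name to several interchangeable executors. The class ToolRegistry, its methods and the stand-in functions are hypothetical names introduced here for illustration only; they are not the CyberIntegrator API, and the real prototype may organize its registries quite differently.

```python
class ToolRegistry:
    """Hypothetical registry: one tool name, several executor implementations."""

    def __init__(self):
        self._impls = {}  # tool name -> {executor name -> callable}

    def register(self, tool, executor, func):
        self._impls.setdefault(tool, {})[executor] = func

    def executors(self, tool):
        # Executors that can run the given tool, in alphabetical order.
        return sorted(self._impls.get(tool, {}))

    def run(self, tool, executor, *args, **kwargs):
        try:
            func = self._impls[tool][executor]
        except KeyError:
            raise LookupError(
                f"no executor {executor!r} registered for tool {tool!r}"
            ) from None
        return func(*args, **kwargs)


# Stand-in implementations (assumptions): real executors would call out to
# Im2Learn or ArcGIS instead of returning a label.
registry = ToolRegistry()
registry.register("Calculate Slope", "Im2Learn", lambda dem: ("Im2Learn", dem))
registry.register("Calculate Slope", "ArcGIS", lambda dem: ("ArcGIS", dem))

print(registry.executors("Calculate Slope"))
print(registry.run("Calculate Slope", "ArcGIS", "dem"))
```

Keeping the tool name separate from the executor is what would let a user pick the same step in the tool browser and then choose where it runs.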

The editor of CyberIntegrator is shown in Figure 1. The current meta-workflow editor includes three browsers of information registries (data on the left, tools in the middle and executors on the right), execution controls (below the browsers) and a system information panel (bottom).


Figure 1: An editor of the current CyberIntegrator prototype.


An example workflow is illustrated in Figures 2-7. A user first selects the tool "Load Image". The loaded image can be visualized by selecting the input image in the left browser (next to the image data structure) and the "Show PseudoColor Image" tool in the middle browser. The visualization is shown in Figure 2. The loaded image is a digital elevation map (DEM) of Illinois; its location is illustrated in Figure 3 by overlaying the Illinois county boundaries on the DEM image.


Figure 2: A DEM image loaded and visualized using the "Show PseudoColor Image" tool.




Figure 3: An overlay of the loaded DEM image and the Illinois county boundaries.


Next, the slope of the DEM image can be computed using either the Im2Learn or the ArcGIS executor. The user selects the input image (left browser), then the tool "Calculate Slope" (middle browser), and finally chooses one of the executors listed in the right browser. A visualization of the slope image is shown in Figure 4.


Figure 4: A slope image computed from the DEM image and visualized using the "Show PseudoColor Image" tool.
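For readers unfamiliar with the slope step, the following is a minimal sketch of one common way to derive a slope grid from a DEM: estimate the elevation gradient with finite differences and convert its magnitude to an angle. The function name slope_degrees and the cell_size parameter are assumptions made for this example; the text does not state which algorithm the Im2Learn or ArcGIS executors use, and they may differ from this sketch.

```python
import math


def slope_degrees(dem, cell_size=1.0):
    """Return slope angles in degrees for a rectangular DEM (list of rows).

    Central differences are used in the interior and one-sided differences
    on the border. This is an illustrative choice, not the method of any
    particular executor.
    """
    rows, cols = len(dem), len(dem[0])

    def diff(value_at, i, n):
        # Finite difference along one axis, clamped at the grid border.
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        if hi == lo:  # axis has a single cell, so no gradient is defined
            return 0.0
        return (value_at(hi) - value_at(lo)) / ((hi - lo) * cell_size)

    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dz_dx = diff(lambda j: dem[r][j], c, cols)
            dz_dy = diff(lambda i: dem[i][c], r, rows)
            row.append(math.degrees(math.atan(math.hypot(dz_dx, dz_dy))))
        result.append(row)
    return result


# A made-up tilted plane that rises one unit of elevation per cell in x.
plane = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
print(slope_degrees(plane))
```

Elevation and cell size must be expressed in the same unit for the angle to be meaningful.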


The entire workflow process can be visualized by clicking on the bottom tab "Graph". The graph of the workflow is shown in Figure 5.


Figure 5: A graph of the example workflow.


During workflow execution, provenance data are collected and stored in a meta-data repository. One can view the provenance information by clicking on the tab "Provenance". A conceptual organization of the workflow provenance information is shown in Figure 6. CyberIntegrator can also save settings and outputs so that meta-workflows can be reproduced at a later time.


Figure 6: A conceptual organization of the workflow provenance information.
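To make the notion of workflow provenance more concrete, here is a minimal sketch of a log that records each executed step and can reconstruct both the workflow graph and the lineage of a data item. The names Step, ProvenanceLog, the data identifiers and the file name are hypothetical; the actual repository schema used by CyberIntegrator is the one summarized in Figure 6 and is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    tool: str
    executor: str
    inputs: list
    output: str
    settings: dict = field(default_factory=dict)


class ProvenanceLog:
    """Hypothetical in-memory stand-in for a provenance repository."""

    def __init__(self):
        self.steps = []

    def record(self, tool, executor, inputs, output, **settings):
        self.steps.append(Step(tool, executor, list(inputs), output, settings))

    def edges(self):
        # (producer tool, consumer tool) pairs linked by a shared data item.
        producer = {s.output: s.tool for s in self.steps}
        return [(producer[i], s.tool)
                for s in self.steps for i in s.inputs if i in producer]

    def lineage(self, data_id):
        # Tools that contributed to data_id, listed in execution order.
        by_output = {s.output: s for s in self.steps}
        needed, pending = set(), [data_id]
        while pending:
            item = pending.pop()
            if item in by_output and item not in needed:
                needed.add(item)
                pending.extend(by_output[item].inputs)
        return [s.tool for s in self.steps if s.output in needed]


# The example workflow from the text; executor assignments, data ids and the
# file name are illustrative assumptions.
log = ProvenanceLog()
log.record("Load Image", "Im2Learn", [], "dem", path="illinois_dem.tif")
log.record("Calculate Slope", "ArcGIS", ["dem"], "slope")
log.record("Show PseudoColor Image", "Im2Learn", ["slope"], "slope_view")

print(log.edges())
print(log.lineage("slope"))
```

Storing the settings alongside each step is what would allow a saved meta-workflow to be replayed later with the same parameters.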


The provenance meta-data are analyzed by a knowledge provenance system called CI-KNOW. The CI-KNOW system provides real-time recommendations of tools and data sets, which can be requested from CyberIntegrator by clicking on the button "Recommend" in the middle browser. For example, when a recommendation is requested after loading an image, the pop-up message shown in Figure 7 lists the tools that were used in the past after loading an image.


Figure 7: A recommendation made by the CI-KNOW component after loading an image based on the provenance meta-data information.
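A recommendation of this kind can be approximated, in its simplest form, by counting which tools followed a given tool in past workflows. The sketch below does exactly that; it is a frequency-count stand-in written for illustration, not a description of how CI-KNOW works, and the function name recommend_next and the sample histories are made up.

```python
from collections import Counter


def recommend_next(histories, last_tool, top_n=3):
    """Rank tools by how often they directly followed last_tool.

    histories is a list of past workflows, each a list of tool names in
    execution order (a hypothetical, simplified view of provenance data).
    """
    followers = Counter()
    for tools in histories:
        for current, following in zip(tools, tools[1:]):
            if current == last_tool:
                followers[following] += 1
    return followers.most_common(top_n)


# Made-up past workflows built from the tool names used in the example.
histories = [
    ["Load Image", "Show PseudoColor Image"],
    ["Load Image", "Calculate Slope", "Show PseudoColor Image"],
    ["Load Image", "Calculate Slope"],
]
print(recommend_next(histories, "Load Image"))
```

A production recommender would likely weigh additional context, such as the data types involved or who ran the workflow, rather than raw counts alone.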