Scientific Workflows and Business Workflow Standards
in e-Science (SWBES)
(Submission deadline extended to
Scientific experiments today often require large-scale computing and data resources to work together. Workflow management systems are emerging as a key element in helping scientists prototype and execute experimental processes and accelerate scientific discovery. Concerted research is being carried out in several projects along the complete e-Science technology chain, from applications to networking, focusing on new methodologies and reusable components. A number of research groups worldwide are developing increasingly advanced workflow features to help scientists build complex compute- and data-intensive application workflows that use geographically distributed data and computing resources. Despite many successes, the gap between workflow application developers and workflow system developers remains wide, and many potential application developers still find it difficult to fully utilize the features that workflow systems offer. The SWBES08 workshop focuses on practical aspects of using workflow techniques to bridge the gap between e-Science applications on the one hand and the (Grid) middleware and low-level infrastructure on the other. The workshop aims to provide a forum for researchers and developers in the field of e-Science to exchange their latest experience and research ideas on scientific workflow management and e-Science. Live demos of workflow systems and workflow applications are highly recommended.
Authors are invited to submit original manuscripts that demonstrate current research in all areas of scientific workflow management in e-Science. The workshop solicits novel papers on using business workflow standards to tackle scientific workflow issues, including but not limited to:
Authors should electronically submit a full (6-page) paper to the workshop upload facility. The papers will be carefully evaluated based on originality, significance, technical soundness, and clarity of expression. Accepted papers should be presented at the workshop. All accepted papers will be published by the IEEE Computer Society Press,
Program
Session 1- (
Abstract: Scientific workflows have become an archetype for scientists to model and run in silico experiments. These workflows primarily engage in computation and data transformation tasks to perform scientific analysis. There is, however, a whole class of workflows that are used to manage scientific data as they arrive from external sensors and are prepared to become science-ready and available for use. While not directly part of the scientific analysis, these workflows, operating behind the scenes on behalf of the “data valets”, play an important role in the end-to-end management of scientific data products. They share several traits with traditional scientific workflows: both are data intensive and use web resources. However, they also differ in significant respects, for example, in the degree of reliability required and the type of provenance collected. In this talk I will compare and contrast these two classes of workflows (Science Application workflows and Data Preparation workflows) and use these to drive observations, along with shared and unique requirements of workflow systems for eScience in the Cloud.
Abstract: Scientists in many fields are developing large-scale workflow applications consisting of hundreds of thousands of tasks and requiring thousands of hours of aggregate computation time. Acquiring the computational resources to execute these workflows poses many challenges for application developers. Although the grid provides ready access to large pools of computational resources, the traditional approach to accessing these resources suffers from many overheads that lead to poor performance when used for workflow execution. We describe how resource provisioning techniques such as advance reservations, multi-level scheduling, and cloud computing can be used to reduce scheduling overheads and improve application performance. We explain the advantages and disadvantages of these techniques in terms of cost, performance and usability.
Abstract: Scientific communities are increasingly exposing information and tools as online services in an effort to abstract complex scientific processes and large data sets. Clients are able to access services without knowledge of their internal workings, simplifying the process of replicating scientific research. Taking a service-oriented approach to science (
Abstract: Secure provenance techniques are essential in generating trustworthy provenance records, where one is interested in protecting their integrity, confidentiality, and availability. In this work, we propose an architecture to protect authorship and temporal information in grid-enabled provenance systems. It can be used in the resolution of conflicting intellectual property claims, and in the reliable chronological reconstruction of scientific experiments. We observe that some techniques from public key infrastructures can be readily applied for this purpose. We discuss the issues involved in implementing such an architecture and describe some experiments carried out with the proposed techniques.
Session 2- (
Abstract: The myExperiment social web site and virtual research environment currently supports a community of some 1200 registered users, many sharing the in silico scientific workflows of Taverna. The last year has seen significant growth in both the user and developer communities, with new interfaces being developed over myExperiment's RESTful
Abstract: This paper discusses the application of existing workflow management systems to a real-world science application (LQCD). Typical workflows and the execution environment used in production are described. Requirements for the LQCD production system are discussed. The workflow management systems Askalon and Swift were tested by implementing the LQCD workflows and evaluated against the requirements. We report our findings and future work.
Abstract: In this paper, we present novel web services offered by the StrainInfo.net bioportal. This portal integrates information in the domain of microbiology and offers a uniform web interface to the different data providers. By providing web services, the integration results of StrainInfo.net become available for automated processing. Several classes of web services are implemented and some examples are discussed in more detail. Combined with third-party services, the StrainInfo.net services can be integrated into workflows. We describe two workflows: a basic workflow for constructing a phylogenetic tree based on 16S rRNA gene sequences retrieved from the species of a given genus, and a more advanced workflow that collects data on several biomarkers, calculates the corresponding distance matrices, and visualizes the intra- and inter-species variation among the different biomarkers using the TaxonGap tool. As a result, the tedious manual work of collecting and analyzing data and of visualizing the analysis results has become fully automated.
Abstract: The growth of data used by data-intensive computations, e.g. Geographical Information Systems (
a design interface, a task scheduler, and a runtime support system. The design interface has two options: a GUI-based workflow designer and an
Session 3 (
Abstract: Domain scientists synthesize different data and computing resources to solve their scientific problems. Making use of distributed execution within scientific workflows is a growing and promising way to achieve better execution performance and efficiency. This paper presents a high-level distributed execution framework, designed around the distributed execution requirements identified within the Kepler community. It also discusses mechanisms to make the presented framework easy to use, comprehensive, adaptable, extensible and efficient.
Abstract: To effectively support real-time monitoring and performance analysis of scientific workflow execution, varying levels of event data must be captured and made available to interested parties. This paper discusses the creation of an ontology-aware workflow monitoring system for use in the Trident system, which uses a distributed publish/subscribe event model. The implementation of the publish/subscribe system is discussed and performance results are presented.
Abstract: This paper explores the use of cloud computing for scientific workflows, focusing on a widely used astronomy application, Montage. The approach is to evaluate, from the point of view of a scientific workflow, the tradeoffs between running in a local environment, if one is available, and running in a virtual environment via remote, wide-area network resource access. Our results show that for Montage, a workflow with short job runtimes, the virtual environment can provide good compute time performance but it can suffer from resource scheduling delays and wide-area communications.
Dr. Adam Belloum
email: adam@science.uva.nl
www.staff.science.uva.nl/~adam
Informatics Institute,
1098SJ,
Prof. Carole Goble
email: carole.goble@manchester.ac.uk
http://www.cs.man.ac.uk/~carole/
Tel: +44
Fax: +44
Dr. Zhiming Zhao
email: zhiming@science.uva.nl
Tel: +31 20 5257599
Fax: +31 20 5257490
www: staff.science.uva.nl/~zhiming
Informatics Institute,
1098SJ,