Middleware 10  
8th International Workshop on Middleware for Grids, Clouds and e-Science - MGC 2010

In conjunction with the ACM/IFIP/USENIX 11th International Middleware Conference 2010
Bangalore, India, November 29 - December 3, 2010

Tuesday - November 30th, 2010

Time            Title

08:00 - 09:00   Registration / check-in

09:00 - 10:30   Session chair: Edmundo Madeira (UNICAMP)
                - CRM-OO-VM: A Checkpointing-enabled Java VM for Efficient and Reliable e-Science Applications in Grids (Tiago Garrochinho and Luis Veiga)
                - Multi-domain Grid/Cloud Computing Through a Hierarchical Component-Based Middleware (Elton Mathias and Françoise Baude)
                - A Scheduling Model for Workflows on Grids and Clouds (Henrique Klôh, Bruno Schulze, Raquel Pinto and Antônio Mury)

10:30 - 11:00   Break

11:00 - 12:30   Session chair: Jose Neuman (UFC)
                - Adaptive Threshold-Based Approach for Energy-Efficient Consolidation of Virtual Machines in Cloud Data Centers (Anton Beloglazov and Rajkumar Buyya)
                - Resource Allocation across Multiple Cloud Data Centres (Barnaby Malet and Peter Pietzuch)
                - Optimizing the Pre-processing of Scientific Visualization Techniques using QEF (Douglas Oliveira, Fabio Porto, Gilson Giraldi, Bruno Schulze and Raquel Pinto)

12:30 - 14:00   Lunch

14:00 - 15:30   Session chair: Luis Veiga
                - A middleware for parallel processing of large graphs (Tiago Macambira and Dorgival Guedes)
                - Mooshabaya - Mashup generator for XBaya (Buddhika De Alwis, Supun Malinga, Kathiravelu Pradeeban, Denis Weerasiri, Srinath Perera and Vishaka Nanayakkara)
                - The Dark Energy Survey Data Management System as a Data Intensive Science Gateway (Kailash Kotwani, James Myers, Joseph Mohr, Greg Daues, Bill Baker, Dora Cai, Michelle Gower, Tony Darnell, Shantanu Desai, Bob Armstrong, Terrence McLaren, Don Petravick, Ankit Chandra and Joel Plutchak)

15:30 - 16:00   Break

                - BISSA: Empowering Web-Gadget Communication with Tuple Spaces (Udayanga S. Wickramasinghe, Charith D. Wickramarachchi, Pradeep R. Fernando, Dulanjanie Sumanasena, Srinath Perera and Sanjiva Weerawarana)


CRM-OO-VM: A Checkpointing-enabled Java VM for Efficient and Reliable e-Science Applications in Grids

Tiago Garrochinho and Luis Veiga

  Abstract: Object-oriented programming languages are nowadays the dominant paradigm of application development (mostly Java and the .NET languages). Increasingly, Java applications have long (or very long) execution times and manipulate large amounts of data, gaining relevance in fields related to e-Science (with Grid and Cloud computing). Significant examples include chemistry, computational biology and bio-informatics, with many available Java-based APIs (e.g., Neobio). Often, when the execution of one of these applications is terminated abruptly due to a failure (whether caused by a hardware or software fault, lack of available resources, ...), all of the work already carried out is simply lost, and when the application is later re-executed, it has to restart from scratch, wasting resources and time, while remaining prone to yet another failure that would further delay its completion, with no deadline guarantees. A possible solution to these problems is checkpointing and migration mechanisms, which make applications more robust and flexible by allowing them to move to other nodes without intervention from the programmer. This article provides such a solution for Java applications with long execution times, by incorporating these mechanisms in a Java VM (JikesRVM).
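The failure scenario described above can be illustrated, outside the VM, with a minimal application-level sketch in Python. The `checkpoint`/`restore` helpers and the loop below are illustrative assumptions for this program page, not part of CRM-OO-VM, which operates transparently inside JikesRVM:

```python
import os
import pickle
import tempfile

def checkpoint(state, path):
    """Atomically persist application state so a crashed run can resume."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic rename: never leaves a half-written checkpoint

def restore(path, default):
    """Resume from the last checkpoint, or start fresh if none exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return default

# A long-running loop that survives abrupt termination:
path = os.path.join(tempfile.gettempdir(), "mgc_demo.ckpt")
state = restore(path, {"i": 0, "total": 0})
for i in range(state["i"], 10):
    state = {"i": i + 1, "total": state["total"] + i}
    checkpoint(state, path)  # after a crash, re-execution resumes at state["i"]
os.remove(path)
print(state["total"])  # sum of 0..9 = 45
```

The point of the sketch is the same as the paper's: work completed before a failure is preserved, so a re-execution does not restart from scratch.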
Multi-domain Grid/Cloud Computing Through a Hierarchical Component-Based Middleware
Elton Mathias and Françoise Baude
  Abstract: Current solutions for hybrid Grid/Cloud computing have been developed to hide the heterogeneity, dynamism and distributed nature of resources. These solutions are, however, insufficient to support distributed applications with non-trivial communication patterns among processes, or applications structured to reflect the organization of the resources they are deployed onto. In this paper, we present a generic, adaptable and extensible component-based middleware that seamlessly enables a transition of non-trivial applications from traditional Grids to hybrid Grid-Cloud platforms. This middleware goes beyond the resolution of well-known technical challenges of multi-domain computing, as it offers mechanisms to exploit the hierarchical, heterogeneous and dynamic nature of the platforms. We validate its capabilities and versatility through two use cases: an Internet-wide federation of Distributed Service Buses, and a runtime supporting domain-decomposition HPC in heterogeneous computing environments using MPI-like programming. Performance results show the efficiency and usefulness of our middleware, contributing to research efforts geared towards flexible, on-demand IT solutions.
A Scheduling Model for Workflows on Grids and Clouds
Henrique Klôh, Bruno Schulze, Raquel Pinto and Antônio Mury
  Abstract: This paper presents a set of comparisons of the performance of a bi-criteria scheduling algorithm for workflows with Quality of Service (QoS) support. This work serves as the basis for implementing a bi-criteria hybrid scheduling algorithm for workflows with QoS support, aiming to optimize the criteria chosen by the users, based on the priority ordering and relaxation they specify. Analysis of the comparisons and the obtained results indicates a performance improvement when the model proposed in this paper is adopted.
Adaptive Threshold-Based Approach for Energy-Efficient Consolidation of Virtual Machines in Cloud Data Centers
Anton Beloglazov and Rajkumar Buyya
  Abstract: The rapid growth in demand for computational power driven by modern service applications, combined with the shift to the Cloud computing model, has led to the establishment of large-scale virtualized data centers. Such data centers consume enormous amounts of electrical energy, resulting in high operating costs and carbon dioxide emissions. Dynamic consolidation of virtual machines (VMs) and switching idle nodes off allow Cloud providers to optimize resource usage and reduce energy consumption. However, the obligation to provide high quality of service to customers makes it necessary to deal with the energy-performance trade-off. We propose a novel technique for dynamic consolidation of VMs based on adaptive utilization thresholds, which ensures a high level of adherence to Service Level Agreements (SLAs). We validate the high efficiency of the proposed technique across different kinds of workloads using traces from more than a thousand PlanetLab servers.
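As an illustration only (the rule, constants and host traces below are invented for this sketch and are not the authors' actual policy), an adaptive upper utilization threshold can be lowered when recent CPU utilization is volatile, so that unstable hosts trigger VM migration earlier:

```python
import statistics

def adaptive_upper_threshold(history, base=0.9, scale=2.5):
    """Illustrative adaptation rule: lower the migration threshold on
    volatile hosts, by a multiple of the median absolute deviation (MAD)."""
    med = statistics.median(history)
    mad = statistics.median(abs(u - med) for u in history)
    return max(0.5, base - scale * mad)

def needs_migration(history):
    """A host is overloaded when its current CPU utilization exceeds
    its own adaptive threshold."""
    return history[-1] > adaptive_upper_threshold(history)

stable   = [0.60, 0.62, 0.61, 0.63, 0.62, 0.85]  # steady load, high threshold
volatile = [0.20, 0.95, 0.30, 0.90, 0.25, 0.85]  # bursty load, low threshold
print(needs_migration(stable), needs_migration(volatile))  # → False True
```

Both hosts end at 85% utilization, but only the volatile one triggers migration, capturing the intuition that unstable workloads need more headroom to avoid SLA violations.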
Resource Allocation across Multiple Cloud Data Centres
Barnaby Malet and Peter Pietzuch
  Abstract: Web applications with rich AJAX-driven user interfaces make asynchronous server-side calls to switch application state. To provide the best user experience, the response time of these calls must be as low as possible. Since response time is bounded by network delay, it can be minimised by placing application components close to the network location of the majority of anticipated users. However, with a limited budget for hosting applications, developers need to select data centre locations strategically. In practice, the best choice is difficult to make manually due to dynamic client workloads and effects such as flash crowds. In this paper, we propose a cloud management middleware that automatically adjusts the placement of web application components across multiple cloud data centres. Based on observations and predictions of client request rates, it migrates application components between data centres. Our evaluation with two data centres and globally distributed clients on PlanetLab shows that our approach can decrease median client response times by 21% for a realistic multi-tier web application.
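The placement trade-off can be illustrated with a toy model (the data centre names, latencies and request rates below are made up for this sketch, not taken from the paper): for each component, pick the data centre that minimises the request-rate-weighted network delay to client regions, and note how a flash crowd shifts the optimum.

```python
def best_placement(latency, request_rate):
    """Choose the data centre with the lowest request-weighted total latency.
    latency[dc][region] is the delay (ms) from data centre dc to a client region."""
    def weighted_delay(dc):
        return sum(latency[dc][region] * rate
                   for region, rate in request_rate.items())
    return min(latency, key=weighted_delay)

latency = {
    "us-east": {"NA": 20, "EU": 90, "ASIA": 180},
    "eu-west": {"NA": 90, "EU": 15, "ASIA": 160},
}
# Mostly North American clients: us-east wins.
print(best_placement(latency, {"NA": 100, "EU": 40, "ASIA": 10}))   # us-east
# A European flash crowd shifts the optimum to eu-west.
print(best_placement(latency, {"NA": 100, "EU": 400, "ASIA": 10}))  # eu-west
```

The middleware described in the paper makes this kind of decision continuously, from observed and predicted request rates, and migrates components accordingly.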
Optimizing the Pre-processing of Scientific Visualization Techniques using QEF
Douglas Oliveira, Fabio Porto, Gilson Giraldi, Bruno Schulze and Raquel Pinto
  Abstract: Scientific Visualization is a computer-based field concerned with techniques that allow scientists to create graphical representations from datasets generated by computational simulations or acquisition instruments. To address the computational cost of visualization tasks, especially for large datasets, researchers have explored grid environments as a platform for their parallel evaluation. It is, however, not trivial to adapt each different visualization technique to run in grid environments. A desirable alternative would separate the specificities of data and process distribution in grids from the visualization computation logic. In this work we claim that QEF (a query evaluation framework) supports scientific visualization computation with the above-mentioned characteristics. Visualization techniques are modeled as operators in an algebra and integrated with a set of control operators that manage data distribution, leading to a parallel QEP (query execution plan). We show the benefits of parallelization for two such techniques: particle tracing and volume rendering. For these techniques, our experiments demonstrate many positive aspects of the presented solution, as well as opportunities for future work.
A middleware for parallel processing of large graphs
Tiago Macambira and Dorgival Guedes
  Abstract: With the increasing "data deluge" scientists face today, the analysis and processing of large datasets of structured data is a daunting task. Among such data, large graphs are gaining particular importance with the growing interest in social networks and other complex networks. Given the dimensions involved, parallel processing is essential. However, users are generally not experts in writing parallel code to handle such structures. In this work we present Rendero, a middleware that makes it possible to easily describe graph algorithms in a form adequate for parallel execution. The system is based on the Bulk-Synchronous Parallel programming model and offers a vertex-based abstraction. Our current implementation shows good speed-up and scale-up results for large graphs ranging from tens of thousands to, in some cases, millions of vertices and edges.
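The vertex-based Bulk-Synchronous abstraction mentioned in the abstract can be sketched generically in Python (this is not Rendero's API; all names here are illustrative): every vertex runs the same function once per superstep, exchanging messages with its neighbours, until no messages remain in flight.

```python
def bsp_run(graph, init, compute):
    """Generic vertex-centric BSP loop with synchronised supersteps."""
    state = {v: init(v) for v in graph}
    # Superstep 0: every vertex announces its initial value to its neighbours.
    inbox = {v: [] for v in graph}
    for v in graph:
        for n in graph[v]:
            inbox[n].append(state[v])
    while any(inbox.values()):        # halt when no messages are in flight
        outbox = {v: [] for v in graph}
        for v in graph:
            new = compute(state[v], inbox[v])
            if new != state[v]:       # only changed vertices send messages
                state[v] = new
                for n in graph[v]:
                    outbox[n].append(new)
        inbox = outbox                # barrier between supersteps
    return state

# Connected components by minimum-label propagation:
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels = bsp_run(graph, init=lambda v: v,
                 compute=lambda label, msgs: min([label] + msgs))
print(labels)  # {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
```

The user writes only the per-vertex `init` and `compute` functions; the loop's superstep barrier and message delivery correspond to what a BSP middleware distributes across machines.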
Mooshabaya - Mashup generator for XBaya
Buddhika De Alwis, Supun Malinga, Kathiravelu Pradeeban, Denis Weerasiri, Srinath Perera and Vishaka Nanayakkara
  Abstract: Visual composition of workflows enables end users to depict a workflow visually as a graph of activities in a process. Tools that support visual composition translate those visual models into traditional workflow languages such as BPEL and execute them, thus freeing end users of the need to know workflow languages. Mashups, on the other hand, provide a lightweight mechanism for ordinary-user-centric service composition and creation, and are hence considered to play an active role in the Web 2.0 paradigm. In this paper, we extend a visual workflow composition tool to support mashups, providing a comprehensive tooling platform for mashup development backed by workflow-style modeling capabilities, while expanding the reach of the workflow domain into Web 2.0 resources through the potential of mashups. Furthermore, our work opens up a new possibility of converging the mashup and workflow domains, capturing the beneficial aspects of each.
The Dark Energy Survey Data Management System as a Data Intensive Science Gateway
Kailash Kotwani, James Myers, Joseph Mohr, Greg Daues, Bill Baker, Dora Cai, Michelle Gower, Tony Darnell, Shantanu Desai, Bob Armstrong, Terrence McLaren, Don Petravick, Ankit Chandra and Joel Plutchak
  Abstract: The Dark Energy Survey (DES) collaboration is a multi-national science effort to understand cosmic acceleration and the nature of the 'dark energy' responsible for this phenomenon. The Dark Energy Survey Data Management (DESDM) system is a new observational astronomy processing pipeline and data management system that will be used to: process raw images obtained from a survey with the new DES field camera (DECam) covering 5000 square degrees of the southern sky; archive intermediate and final co-added images; extract catalogs of celestial objects from every image; and deliver data products to the astronomy community through portals and services. DESDM has been designed as a data-intensive Science Gateway coupling shared computational resources (e.g., TeraGrid) with project-owned databases and file systems for storage distributed across three continents. Over the next six years, the DESDM system will perform over 10 million CPU-hours (SUs) of image processing and serve over 4 Petabytes of images and 14 billion cataloged objects to the international DES collaboration. When delivered for operations in 2011, it will be one of, if not the, most scalable and powerful systems in existence for processing telescope images, creating co-added deep images, and generating detailed star and galaxy catalogs. The project's software components consist of a processing framework, an ensemble of astronomy codes, an integrated archive, a data-access framework and a portal infrastructure. This paper provides an overview of the DESDM scope and highlights the architectural features developed, and planned, to support Gateway-style management of petascale, data-intensive, continuous processing and on-demand user queries for analysis.
BISSA: Empowering Web-Gadget Communication with Tuple Spaces
Udayanga S. Wickramasinghe, Charith D. Wickramarachchi, Pradeep R. Fernando, Dulanjanie Sumanasena, Srinath Perera and Sanjiva Weerawarana
  Abstract: Modern web pages are not just static plain HTML files. With the invention of browser scripting languages such as JavaScript, web pages have come alive. Still, there is no mechanism for unified communication between browser applications that use scripting languages. We propose BISSA, a communication model that provides a unified and time-decoupled communication platform for browser applications, based on tuple spaces and a whiteboard architecture. Our proposed solution consists of an in-browser tuple space implementation and a highly scalable peer-to-peer global tuple space implementation, which can act standalone as well as collaborate with each other seamlessly. In this paper we present our implementation methodology and show how browser applications can use this infrastructure as an inter-gadget communication solution as well as a storage platform for application-generated data.
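The tuple-space operations underlying such a design can be sketched with a minimal in-memory space (a toy illustration, not BISSA's actual API): gadgets communicate by writing tuples with `out` and retrieving them by pattern with `rd`/`in_`, decoupled in time, so a gadget loaded later can still read what an earlier one published.

```python
class TupleSpace:
    """Minimal in-memory tuple space: out/rd/in_, with None as a wildcard."""

    def __init__(self):
        self.tuples = []

    def out(self, t):
        """Publish a tuple into the space."""
        self.tuples.append(tuple(t))

    def _match(self, t, pattern):
        return len(t) == len(pattern) and all(
            p is None or p == v for p, v in zip(pattern, t))

    def rd(self, pattern):
        """Non-destructive read of the first matching tuple (or None)."""
        return next((t for t in self.tuples if self._match(t, pattern)), None)

    def in_(self, pattern):
        """Destructive read: remove and return the first match (or None)."""
        t = self.rd(pattern)
        if t is not None:
            self.tuples.remove(t)
        return t

# One gadget publishes; another, loaded later, still sees the tuple:
space = TupleSpace()
space.out(("chart-gadget", "selection", "2010-11-30"))
print(space.rd(("chart-gadget", "selection", None)))  # non-destructive read
print(space.in_(("chart-gadget", None, None)))        # take: removes the tuple
print(space.rd(("chart-gadget", None, None)))         # None: space is empty
```

In BISSA's setting the same operations would be backed either by the in-browser space or by the peer-to-peer global space, which is what decouples gadgets in both space and time.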