Monday, April 29, 2013

Product Creation: Bosco as a single interface to access both HPC Supercomputers and HTC clusters

What is the role of a product manager in open source teams? The lean startup principles apply, but instead of paying users we have scientists competing to get results first.
Risk increases if one does not discover what future users really want
We found out that a team at the University of California San Diego (UCSD) used Bosco to connect the Gordon supercomputer to the OSG grid in order to rapidly process raw data from almost one billion particle collisions, as part of a project to help define the future research agenda for the Large Hadron Collider (LHC). The press release is titled "SDSC’s Gordon Supercomputer Assists in Crunching Large Hadron Collider Data."
According to Lothar Bauerdick, OSG Executive Director:
“Giving us access to the Gordon supercomputer effectively doubled the data processing compute power available to us.”
Dan Fraser and I called Frank Wuerthwein, who is the soul of this project and a top researcher in the field of dark matter.

We discovered that Bosco, off the shelf, needed some customization. Why? Because Bosco was never designed for the double purpose of accessing both HPC (High Performance Computing) and HTC (High Throughput Computing) resources. Derek Weitzel from Bosco helped the team in San Diego. How Frank W. and the team made Gordon work may be the subject of an amazing article. They did it, and without that engineering creativity no product manager can pull out a solution. What the product manager can do is find out what the scientists really want. We have to go to and fro between the developers and the users until we know what to deliver with minimum risk.

Risk is lower if we frequently ask Joe users what they really want
Gordon is part of XSEDE (Extreme Science and Engineering Discovery Environment), which consists mainly of supercomputers. The name is impressive, but intimidating for a talented user with a laptop, normally a Mac, trying to reach Gordon and the like.


I wrote to the team:
Can we work closely with the UCSD team to make a Bosco-based two-way interface to submit jobs from OSG to designated XSEDE resource(s)? This is a special use case that perhaps can be applied in other situations.
Dan Fraser has a special talent for finding, in a pool of talented engineers, the ones with the synergy to form a functional team ready to deliver. He recruited Mats Rynge, who actually wrote the final abstract of the poster we will present at XSEDE13 with this idea.

"Moving job submission and management close to the user and to systems they are already familiar with, will make using XSEDE resources accessible to users who might not have much UNIX and HPC experience. The model also makes it easier for users to access local campus clusters one day and then an XSEDE resource the next one. Bosco can also provide an interface to XSEDE for gateways and other portal systems.
In this Poster we demonstrate how users can easily download, install, and use the Bosco capability for managing distributed computing jobs from their desktops."

The poster was accepted. Here are the reviewer comments.

----------------------- REVIEW ---------------------
PAPER: 243
TITLE: Bosco - A Simple Interface for managing jobs on both XSEDE and Campus computing resources
AUTHORS: Derek Weitzel, Daniel Fraser, Miha Ahronovitz and Mats Rynge
----------- REVIEW 1 -----------
Poster will present Bosco, a tool for bridging the gap between desktop users and HPC resources.  Such tools are key to making cyberinfrastructure more accessible to a broader audience, and the abstract indicates that this application is fully featured and ready for XSEDE users to install and use.
----------- REVIEW  2 -----------
This abstract describes an interface to submit tasks to cluster computing, regardless if it is a HPC in your institution or XSEDE resources. Apparently, Bosco provides the easy to use interface to submit tasks to heterogeneous resources and even facilitates to switch a task from one HPC to another. This topic is indeed relevant to the XSEDE 13 because creates a transparency to use HPC resource for new scientist that needs to interact with XSEDE resources.
----------- REVIEW 3  -----------
A tool that allows users easy access to multiple types of resources  - including different OS versions - is always of interest to this community.

A quote from the interview with Hans Meuer, General Chair of the International Supercomputing Conference ISC'13, also shows the hope of making access to the TOP500 supercomputers easier for ordinary human beings:
Miha: Have you seen this University of California San Diego (UCSD) press release? They used Bosco to link the HPC Gordon supercomputer to OSG (Open Science Grid), an HTC resource. The results improved in a spectacular manner.
Hans: And I would love to cover this topic at the ISC Big Data'13 conference in Heidelberg, September 25 and 26, 2013. Sverre Jarp from CERN is the conference chair. We have just started the preparation of this event.

