An interview with Miron Livny: Bosco, HTCondor and more

Miron Livny is a professor of Computer Science. He leads the Center for High Throughput Computing (CHTC) at the University of Wisconsin-Madison and serves as the technical director of the Open Science Grid (OSG).

As described on the HTCondor website:
...most scientists are concerned with how many floating point operations per month or per year they can extract from their computing environment rather than the number of such operations the environment can provide them per second or minute. Floating point operations per second (FLOPS) has been the yardstick used by most High Performance Computing (HPC) efforts to evaluate their systems. Little attention has been devoted by the computing community to environments that can deliver large amounts of processing capacity over long periods of time. We refer to such environments as High Throughput Computing (HTC) environments.
In essence, Miron's research is driven by the following challenge:
How can we accommodate an unbounded need for computing and an unbounded amount of data with an unbounded amount of resources?
This is a daring thought to have, and it inspired a fascinating conversation triggered by the upcoming release of the Bosco tool by the Open Science Grid (OSG).

Professor Miron Livny
Miha: You first brought up HTC in a 1997 interview with HPCwire. When did the idea start?

Miron: My PhD at the Weizmann Institute of Science in Israel (1978) dealt with load balancing in distributed systems, a new field at the time. I looked into an undesired state of such a system: one of the many customers (users) is waiting for a resource, while one or more resources capable and willing to serve that customer are idle, yet they cannot serve this user. Distributed computing, scheduling, and matchmaking absorbed me. I have now been doing the same thing for over thirty years.

Miha:  And then?

Miron: Then in 1983 I came to the University of Wisconsin. While at the Weizmann Institute, all my research was simulation work. I really started developing the technology when I came to the US. Here we developed the concept of separate, individual resource ownership. This was key. In my doctoral thesis, the system owned everything.

Miha: At Supercomputing events, the popular perception is that TOP500 computers designed to maximize FLOPS are the most desirable to have. What is the best platform to maximize HTC?

Miron: In principle, I can run a lot of independent tasks on an IBM Blue Gene, for example. It may not be as easy, because Blue Gene does not have a full communication stack and people are running MPI. In fact, we work with IBM to add HTC capabilities to their machines. But is this platform the most cost-effective way of doing it?

A successful company such as Google realized that following the distributed computing principles that underpin HTC is the way to go. They decided to design a system where each component is assumed to be unreliable, and it is a miracle if it actually does what it is intended to do. From the beginning they envisaged a system based on unreliable components: work is handed out to every node, and the results eventually come back even if some of the nodes fail. In HPC, because of the MPI model, everything has to work; otherwise the whole thing falls apart.

Miha: Did Google adopt the HTCondor open source code?

Miron: While they refer to our system in their publications as an example of the resource manager they developed - no, they did not use our software. One of the reasons for this, I believe, is that HTCondor does more than Google needs in their environment. They wanted something minimal, attuned to their needs and not carrying the extra luggage.

Miha: Unofficial studies estimate that last year Google Compute Engine had 800,000 cores. Today this number may be at least several million cores. It appears the decision to embrace these HTC principles paid off for them.

Miron: Google wrote everything in-house. I visited them a few times to share my experiences, and some of my former students joined them. If you look at chapter 13 of the Grid book edited by Ian Foster et al., Rajesh Raman - who joined Google - and I described the principles of HTC, not the HTCondor technology.

We are computer scientists. We create computer science principles; we don't just develop a software package. Anyone who wants to use those principles is encouraged to create software tailored to their needs, if doing so is cost-effective for them. It goes without saying that for Google it was!

Miha:  Do you have multiple adaptations of HTCondor for different users?

Miron: We are funded by the National Science Foundation (NSF) and the Department of Energy (DOE) to support scientific HTC. It is therefore natural that our work is driven by the needs of the main HTC science communities and motivated by emerging science disciplines that embrace HTC.

Miha: What about industry?

Miron: While we have a lot of users in industry, we are not developing HTCondor for their needs. HTCondor enables science, and one major discovery can have a long-term impact on the future of HTC.

Miha: One of your collaborators, Cycle Computing, has a powerful Amazon Web Services cluster, priced on demand per hour, that uses HTCondor. Are there other examples?

Miron: Red Hat offers support services for MRG, which uses HTCondor. For example, all of DreamWorks Animation's rendering farms are managed by HTCondor; it is a Red Hat account. Fedora and Debian also offer HTCondor.

An intuitive diagram of Bosco
Miha: Let's talk Bosco. Who had the idea to develop it separately from HTCondor?

Miron: The idea developed naturally. You just have to look at the HTCondor architecture under the hood. The philosophy was always "Submit locally and run globally" - namely, submit on your workstation and run on any workstation in the organization. Now, why don't we go to the researchers and say: "OK, we'll give you just this one piece of software, called Bosco, that runs on your side. You will learn and install Bosco in a few hours, and it will enable you to submit locally and run on machines all over the world!"

You can view Bosco as "My Personal High Throughput Manager".

Miha: This is the empathy Bosco seeks. Make the researchers happy. Offer them a simple interface to create a galaxy of HTC resources.

Miron: It is not just a job submission interface. It is a high-end HTC manager.

I can share with you that I have a graduate student who is building a MATLAB-based piece of software that, we believe, will in its final incarnation manage more than 1 million jobs. But the user of this tool - as yet unnamed - does not think in terms of jobs; he thinks about the points in the calculation of a multi-dimensional function. Each point is calculated by a separate HTC job, originated via MATLAB, but behind the scenes it is Bosco that does the heavy lifting. This combination of MATLAB with Bosco enables a researcher with very basic computing skills to invoke a computation where every point is an independent job running on different machines in different parts of the world.

Miha: If this works, it will be mind-boggling.

Miron: Bosco can help us by spreading the practical incarnation of the "Submit Locally, Run Globally" concept in HTC. If you submit, say, to SGE (Sun Grid Engine, in one of its many flavors), PBS, or LSF clusters, you can get in, but you cannot get out to another cluster. You are stuck with SGE or PBS or LSF. When you submit through Bosco, you can go everywhere. And that's the concept: Bosco helps the researcher build bridges to different on- and off-campus resources and thus improve throughput. Regardless of the software that manages these remote resources, Bosco should get your jobs there and bring back the results!

Miha: How did a scientist submit and manage jobs on different clusters before Bosco?

Miron: If you want to run two jobs, don't use Bosco, because you don't have a high throughput computing problem. But if you have 1,000 or 100,000 jobs, then you need a tool that will help you manage the entire high throughput computing activity. Without Bosco, when you want to use two clusters, you "ssh" to one cluster and submit jobs, then "ssh" to the second one and submit more jobs. This is a manual process that is not effective and does not scale. With Bosco there is only one submission for all the jobs.

Miha: What exactly does this mean?

Miron: All the researcher sees is one Bosco interface where he writes his submission files. Bosco automatically sends the jobs to the remote clusters via SSH. It can even load balance the workload among the remote clusters. It takes care of everything needed to work seamlessly on those remote clusters (passwords, port numbers). The scientist does not need to know the name of the remote machine or its exact location. The researchers - the users - do not have to install ANYTHING on the remote machine. There is no need to talk to the remote system administrator or ask anyone for any "special" treatment on those remote sites.
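As a rough sketch of the workflow Miron describes - with a purely hypothetical cluster address and job script, and command details that may differ between Bosco versions - registering a remote PBS cluster and submitting a batch of jobs through it might look like this:

```
# Register a remote PBS cluster with Bosco (address is illustrative):
bosco_cluster --add alice@cluster.example.edu pbs

# example.submit -- an HTCondor submit description file routed through Bosco
universe      = grid
grid_resource = batch pbs alice@cluster.example.edu
executable    = analyze.sh
arguments     = $(Process)
output        = out.$(Process)
error         = err.$(Process)
log           = jobs.log
queue 1000

# One local command submits all 1,000 jobs; Bosco moves them over SSH:
condor_submit example.submit
```

The key point is the single submission: the researcher never logs in to cluster.example.edu, and the same submit file would work unchanged if the remote site ran SGE or LSF instead.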

Many cluster administrators put a limit on how many jobs you can have queued. In such a case, you must submit in batches of the maximum allowed number of jobs - say 500, which for 100,000 jobs means 200 manual submissions. Bosco does this automatically; we call it throttling.
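The throttling idea can be sketched in a few lines of Python - a toy model of the batching arithmetic, not Bosco's actual implementation: keep handing out batches no larger than the site's queue limit until all jobs are placed.

```python
# Toy model of Bosco-style throttling: a site allows at most
# `max_queued` jobs in its queue, so the workload is fed in batches.

def throttle_submissions(total_jobs, max_queued):
    """Yield batches of job ids, each sized to respect the site limit."""
    pending = list(range(total_jobs))
    while pending:
        batch, pending = pending[:max_queued], pending[max_queued:]
        yield batch

# 100,000 jobs against a 500-job queue limit -> 200 batches, the same
# 200 submissions a researcher would otherwise perform by hand.
batches = list(throttle_submissions(100_000, 500))
print(len(batches))     # 200
print(len(batches[0]))  # 500
```

In the real system each batch would only be released as the remote queue drains, but the bookkeeping is exactly this simple division of the workload.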

System administration tasks for a Bosco  researcher
managing multiple clusters from all over the world

Miha: David Ungar, in an interview about how to program many-core processors, said that the precision of a calculation depends on how much money one needs to spend. How does it work in HTC?

Miron: For many HTC applications, throughput is linked to accuracy / understanding / knowledge. Namely, if we can run more jobs (simulations / searches), we can establish more confidence in the results or explore more options / parameters / cases that will give us more information about the problem.

Miha: Is it possible to predict (one day) the minimum number of resources needed to deliver results with accuracy y within, say, x days?

Miron: Having clouds helps us with a cost model for the computing part of this challenge, something we did not have in the past. Today, if you give me a function that relates product quality to CPU hours, I can translate it to dollars using clouds. It is not a trivial function to create. Time to market can be a factor, too.

Miha: We have defined Bosco in many ways, but we need a single definition of Bosco.

Miron: I totally agree.

Miha:   Can you help us spell this message?

Miron: I want the message to come from the Bosco team, and you are part of it. You are new, while I have been doing the same thing for thirty years.

Notes:

(1) Bosco is another capability provided by the Open Science Grid for high-end, high-throughput computing jobs running anywhere in the world and managed from local campuses.

(2) Bosco beta v1.1 is available for download now.
