Saturday, April 26, 2014

Comments on the History of the Grid paper


This quote from Ian Foster's blog inspired me:
Two years ago, Carl Kesselman and I published a rather lengthy paper that purports to recount the “history of the grid.” (I. Foster, C. Kesselman, The History of the Grid (PDF), in Cloud Computing and Big Data, IOS Press, Amsterdam, 2013; 37 pages, 176 references).
We believe that this paper includes useful material. We also know that it can be much improved, and to that end we plan a second edition. We invite suggestions for improvements. For example: What did we get wrong? What work did we forget to mention? What do you see as the most important accomplishments of the grid community? The most important influences? The most egregious failures?

The Grid Computing book

Few books provoked such a strong, passionate response as the 1998 Grid 1. The metaphor of "compute power" delivered, in its ultimate incarnation, from a plug on the wall, very much like electricity, had a mesmerizing effect on every audience with which I used the analogy. Even today, 15 years later, my lay friends ask whether I succeeded in creating this plug on the wall.

At that time, in 1999, I was working with Wolfgang Gentzsch on Codine at Genias. In 2000 we sold it to Sun, and the product was renamed Sun Grid Engine.

Surprisingly, Grid 2, published in 2005, got lukewarm reviews on Amazon: only three reviews and not even one five-star rating.

Are Scientists super-powerful users?

The answer is No. Here is why.

I conducted in-depth interviews at various universities. The reality is that scientists are normal people who have their own work to do and no time to learn system administration. Scientists want to do science, not mess with coding, command lines, and system administration. These are nagging distractions for them.

As an anecdote, there is an interview with Professor Peter Higgs in the Guardian. He hardly uses email. HTCondor, the de facto open source cluster resource management software, would be totally unmanageable for the famous Nobel Prize-winning physicist.

Even seasoned system administrators confess to being "permanent beginners" (a user's actual wording on the HTCondor users group).

Grids, cluster software, and HTC were born in academia. They attracted some of the nation's best engineers and scientists, because the rewards, like the identification of the Higgs particle, were too great not to put up with the hardship.

The key is ease of use, the sentiment of "I like it". Nir Eyal published a book titled Hooked: How to Build Habit-Forming Products. This has not happened in CHTC. Grids are still in a rarefied space, well above mortals.

Generating Startups

We live in newer times, and it is never too late to get people hooked on grids. If we look at the fact-laden history of the Grid in academia, perhaps we can go back and create a "happy ending".

The problem was that the market for actual grid software was very small. At the SC 2007 conference, over a beer with people from PBS, LSF, Univa and Sun, we agreed the market for license sales was less than $100 million. Many people used free in-house or open source software.

Despite the benefits (enterprises can deliver faster, with better-quality products that were not possible before), the large system houses (HP, IBM, Sun, Dell) could not deliver and support grids for the enterprise.

Academia's business approach was: "Here is the technology. You must like it." This is best illustrated in the video "A scientist's business value proposition".

Modern Product Creation

There is not one word in the Grid about product management. Generations of computer scientists graduate without having a clue of what that is.

Look at the four-minute video on how Dropbox was created. Without writing one line of code, they demoed a simulated UI of what the software would do. This is an MVP (Minimum Viable Product):
Contrary to traditional product development, which usually involves a long, thoughtful incubation period and strives for product perfection, the goal of the MVP is to begin the process of learning, not end it. Unlike a prototype or concept test, an MVP is designed not just to answer product design or technical questions. Its goal is to test fundamental business hypotheses.
The lesson of the MVP is that any additional work beyond what was required to start learning is waste, no matter how important it might have seemed at the time.

Telling the truth

In his New York Times article, The Employer's Creed, David Brooks writes about what kind of people modern employers should recruit.
Bias toward truth-tellers. I recently ran into a fellow who hires a lot of people. He said he asks the following question during each interview. “Could you describe a time when you told the truth and it hurt you?” If the interviewee can’t immediately come up with an episode, there may be a problem here.
I can describe very well what happened when I went in front of the OSG all-hands meeting and simply put on two slides what the most distinguished scientists said:
“Moving to HTCondor from XXX was an uphill battle”
“Bosco? In the past, they came and said: "Try a new tool called SOAR - System of Automatic Runs." Supposedly to make it easier. But I did not see anything that would make us feel comfortable. We ended up with the same problems as with HTCondor itself”
“In Biology, speaking for me and my colleagues, we are not interested to triple or so the number of hours by getting on different grids, but by actually getting on any grids. It is more about ease of use than availability of resources.”
“CHTC guys helped us to write the submit files, I wish these could be simpler and more accessible to my students”
“If it weren't for our proximity to HTCondor people, we would not be able to run HTCondor”
“If we had a magic wand..? It would be nice to have a function to send to HTCondor, and this function generates the scripts”
I was only the messenger. The Bosco project is now part of the HTCondor distribution, buried with much older stuff accumulated over the years. Most people don't know it is there. It is the same with SOAR.
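
The "magic wand" wish quoted above, a function that generates the submit scripts, is easy to picture. Here is a minimal sketch in Python, not something taken from Bosco, SOAR or HTCondor itself. The function name make_submit_description and its parameters are hypothetical, invented for this illustration; the lines it writes (executable, arguments, output, error, log, request_cpus, request_memory, queue) are standard HTCondor submit commands.

```python
def make_submit_description(executable, arguments="", cpus=1,
                            memory_mb=1024, jobs=1, name="job"):
    """Return the text of an HTCondor submit description file.

    Hypothetical helper for illustration only: the scientist states
    what to run and how many times, and the function writes the file.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")

    lines = [f"executable = {executable}"]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        # $(Cluster) and $(Process) are HTCondor macros that keep the
        # output files of each queued job separate.
        f"output = {name}.$(Cluster).$(Process).out",
        f"error = {name}.$(Cluster).$(Process).err",
        f"log = {name}.$(Cluster).log",
        f"request_cpus = {cpus}",
        # A bare number for request_memory is interpreted as MiB.
        f"request_memory = {memory_mb}",
        f"queue {jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: run a (hypothetical) analysis script on 3 job slots.
    print(make_submit_description("analyze.sh", arguments="input.dat", jobs=3))
```

The generated text can be saved to a file and handed to condor_submit. The point is not the dozen lines of code but who writes them: the student calls one function instead of learning the submit file syntax.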

One brilliant idea: Dynamic Data Center

Can we bootstrap a Renaissance for grids and clouds? I liked the presentation of Frank Wuerthwein at ISC BigData '13 in Heidelberg. He is one of the most inspiring talents in the Open Science Grid leadership.

In an interview, Frank describes a worldwide grid model in which one can take a piece of hardware and donate it to the grid, and all the required software is loaded automatically. If the user later wants the resource back, the compute resource is restored to its original state, something similar to the "now it is safe to remove the USB device" message we get on any Windows laptop or desktop.

Frank calls this the Dynamic Data Center. It is similar to having a solar power system on our homes that communicates with a national or regional power grid.

This is just an illustration of a bright idea. The litmus test, however, is not the technology itself. It is modern product creation: product management and user experience, as illustrated in the Dropbox example.

Frank's work was in High Energy Physics, where the name of the game is Big Data. This is Big Data of cosmic size. Big Data can bring grids to the front stage.

Cloudera managed to get 1 billion dollars for a product readily available in open source (Hadoop).

I think the Dynamic Data Center has billion-dollar potential in infrastructure. In its ultimate implementation, a single grid (aka cluster, cloud, big data infrastructure) spanning the entire world, it will look like a miracle, just as the grid did when the Grid 1 book was published.

This is impossible to achieve in academia alone. If computer entrepreneurs can make a car that scares the entire automotive industry, other computer entrepreneurs can make a viable Dynamic Data Center corporation. It could be a flurry of startups.