Higgs Boson: Think HTC instead of HPC.

What is the Higgs Boson?

CERN, home of the LHC (Large Hadron Collider) in Europe, announced the appearance of a new particle among the debris of smashed protons. It is called the Higgs boson, and it is believed to be the signature of the field that confers mass on matter.

Physicists have searched for it for years, but what is the Higgs boson supposed to do, exactly? A LiveScience infographic explains.

Most people, including many well-educated ones, have never heard of the Higgs boson, let alone understood why it was so hard to find.
Why has the Higgs been so hard to find? It is only produced at very high energies, such as those in the Big Bang or generated in a particle collider like the LHC, and it breaks down almost immediately into a shower of other particles. "The probability of making a Higgs is so small that you are looking for one collision out of 10 trillion."

Most people think HPC 

Calculations have been going on for twenty years, lately consuming between 1,000,000 and 2,000,000 computing hours per day. If you think in terms of HPC (High Performance Computing), flops and giant supercomputers, you might need the entire TOP500 list, at a cost that would add even more to the US national debt :-)
This computation has not ended and will continue.
"After an enormous effort by LHC experimenters, the CERN laboratory and worldwide Grid computing community we are very excited to observe an excess in our data from a new particle consistent with the production of a Higgs boson," says UW-Madison Bjorn Wiik Professor of Physics Wesley Smith, who plays a lead role in the CMS experiment. "We will need the additional data planned from the running of the LHC until next year to establish if this is indeed the Higgs boson and that we stand at the threshold of a new era of understanding the origins of mass."

Our cherished assumption is wrong 

David Ungar, a manycore processor researcher, said during an interview:
The obstacle we shall have to overcome, if we are to successfully program manycore systems, is our cherished assumption that we write programs that always get the exactly right answers. This assumption is deeply embedded in how we think about programming. The folks who build web search engines already understand, but for the rest of us, to quote Firesign Theatre: Everything You Know Is Wrong!

Grid Computing versus Cloud Computing at CERN 

When the grid computing infrastructure was created, it handled 15 to 20 petabytes of data annually. This year, CERN is on track to produce up to 30 PB of data. "There was no way CERN could provide all that on our own," says Ian Bird, CERN's computing grid project leader. Grid computing was once a buzz phrase, much as cloud computing is now. "In a certain sense, we've been here already," he says.
 The entire grid has a capacity of 200 PB of disk and 300,000 cores, with most of the 150 computing centers connected via 10Gbps links. "The grid is a way of tying it all together to make it look like a single system."
Internally, CERN is running a private cloud based on OpenStack open source code.  CERN and two other major European research organizations took steps to create a public cloud resource called Helix Nebula - The Science Cloud.

All is nice and groovy, but there is a small problem. As Ian Bird says politely, "we're just not sure of the costs and how it would impact our funding structure." "From a technical point of view, it could probably work," he says. "I just don't know how you'd fund it."

The French say: "Le bon Dieu est dans le détail" (the good God is in the detail). In English we say "the devil is in the details."

Thinking  HTC (High Throughput Computing)

The unprecedented volume of computations for the Higgs Boson discovery was (and still is) carried out using the concept of HTC.
Open Science Grid services knit together researchers, many repositories of LHC data (UW–Madison is home to two research teams, one each for the two biggest experiments at LHC) and more than 100,000 computers at about 80 sites around the country.
 “It’s also a huge triumph for mankind,” says Miron Livny, CTO at the Wisconsin Institute for Discovery. “There were more than 40 nations that came together for a long time to do this one thing that — even if it all worked out — wasn’t going to make anyone rich. It’s a powerful demonstration of the spirit of collaboration.” 
This colossal computer power came almost for free.

HTC is about sustained, long-term computation. You might think the difference between sustained long-term computation and a short-term sprint is merely quantitative, but it really is a qualitative one. In essence, HTC is sustained throughput over long periods of time.

You would want to measure computational hours per day, per week, or per year. These numbers are so large that what we really care about is sustained hours. For example, OSG (Open Science Grid) delivers about 2,000,000 hours a day, give or take, which comes to roughly 730 million hours per year.
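The arithmetic behind these figures is easy to check. The short Python sketch below is only an illustration: the function names are made up for this post, and it assumes that one "computing hour" means one core busy for one hour, which the OSG figures do not spell out.

```python
# Sustained-throughput arithmetic for the OSG figure quoted in the text.
# Assumption (not from the source): one computing hour = one core for one hour.

HOURS_PER_DAY_DELIVERED = 2_000_000  # approximate OSG figure from the text


def sustained_hours(hours_per_day: int, days: int) -> int:
    """Total computing hours delivered over a number of days."""
    return hours_per_day * days


def equivalent_cores(hours_per_day: int) -> float:
    """Cores that would have to run around the clock to deliver this much."""
    return hours_per_day / 24


per_week = sustained_hours(HOURS_PER_DAY_DELIVERED, 7)
per_year = sustained_hours(HOURS_PER_DAY_DELIVERED, 365)

print(f"per week: {per_week:,} hours")
print(f"per year: {per_year:,} hours")
print(f"busy cores, on average: {equivalent_cores(HOURS_PER_DAY_DELIVERED):,.0f}")
```

Under that assumption, the daily figure corresponds to a bit over 83,000 cores kept busy all day, every day, which is the sense in which HTC measures sustained work rather than peak speed.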

OSG is an opportunistic resource, so there are never guarantees about available resources, but on average there is a tremendous amount of capacity there. Each site of OSG is autonomous, locally owned and operated.

Getting people to think in a high throughput way helps a lot. There are still many idle machines that anyone can access for free, but they are not HPC (High Performance Computing) resources. They may only be idle for an hour or two. If we have a single 10,000-hour job, it will never complete on the OSG. But if you are able to deploy the same task as a workflow of 10,000 one-hour jobs, you could finish in one day. Statistical and Monte Carlo techniques are often well suited to HTC, and they are similar to the time-consuming stochastic modelling behind the Higgs boson search.
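To make the "many short jobs" idea concrete, here is a toy Monte Carlo sketch in Python. It is not the physicists' code and it does not use HTCondor or OSG at all: it estimates pi by random sampling, with the work cut into small independent jobs that are merged at the end. The names run_job and merge, the job count and the sample size are all invented for the illustration, and the jobs run in a plain local loop where a real grid would send each one to a different idle machine.

```python
import random


def run_job(job_id: int, samples: int) -> int:
    """One short, independent job: count random points inside a quarter circle.

    The job id doubles as the random seed, so jobs need no coordination
    and any job can be re-run on another machine with the same result.
    """
    rng = random.Random(job_id)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def merge(hit_counts: list[int], samples_per_job: int) -> float:
    """Combine per-job counts into one estimate of pi; order does not matter."""
    total_samples = len(hit_counts) * samples_per_job
    return 4.0 * sum(hit_counts) / total_samples


N_JOBS = 100      # hypothetical: stands in for the 10,000 one-hour jobs
SAMPLES = 2_000   # hypothetical: work done by each short job

# Local stand-in for the grid: each call could run on a different machine.
results = [run_job(job_id, SAMPLES) for job_id in range(N_JOBS)]
pi_estimate = merge(results, SAMPLES)
print(f"pi is roughly {pi_estimate:.3f} from {N_JOBS} independent jobs")
```

The point of the design is that no job depends on any other. A job that lands on a machine which stops being idle can simply be run again somewhere else, and the final answer only needs the counts, in any order.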

Greg Thain, HTCondor guru, teaching "Think HTC" at OSG Summer School 2012

By Summer 2013 we will know 

On January 26, 2013, the Washington Post wrote:
The world should know with certainty by the middle of this year whether a subatomic particle discovered by scientists is a long-sought Higgs boson, the head of the world’s largest atom smasher said Saturday.
Rolf Heuer, director of the European Organization for Nuclear Research, or CERN, said he is confident that “towards the middle of the year, we will be there.” By then, he said reams of data from the $10 billion Large Hadron Collider on the Swiss-French border near Geneva should have been assessed.
The timing could also help Scottish physicist Peter Higgs win a Nobel Prize.

Professor Peter Higgs explaining what others call the "God Particle"

Unleashing "guerilla" science

This is what Greg Thain, the "Think HTC" lead evangelist, says:
You, Mr. Researcher, are under constant pressure to deliver results from limited project funding. What would happen to your scientific project if computation were really cheap? Because it is. So try not to think about being constrained by the amount of computation you have locally. What would happen if you could run 100,000 hours, or one million hours? This is research. This is cheap. You can take risks. If you use 100,000 hours and still don't get the expected results, you still have the ability to analyze what happened and try again. No one will cut your funding. Quite the contrary.


Greg Thain, from the HTCondor project, and Derek Weitzel, Bosco architect and free thinker.


The opinions expressed in this blog are personal. I am, however, a member of the Bosco team, and Bosco is the quintessential "Think HTC" open source product, which anyone can try and use for free. You can download it from here.

