Making big money with Hadoop

July 08, 2010

Hadoop Summit 2010 in Santa Clara was like a shot of adrenaline to a lethargic Silicon Valley. It was sold out and booming with developers and would-be investors, in sessions that filled the rooms to the brim. Imagine you wake up with a second chance. This was Hadoop Summit, Santa Clara, on June 29, 2010.

Who has the greatest opportunity with Hadoop? Yahoo? Google? Facebook? I believe they already cash in on this technology. The biggest revenue opportunity is for database companies, particularly for the market leader, Oracle.

There are a few bloggers covering what actually happens, but none of them foresee the huge monetization future of this technology. As Shevek Mankin, the CTO of Karmasphere, says, it is easy to set up a Hadoop cluster and put the data in, but how do you take the data out? This is the crux of MapReduce technology: in spite of claims to the contrary, it is not mature enough to conquer the enterprise as of June 29, 2010.

Basically, all Hadoop applications collect huge streams of data, classify them via MapReduce, and place the results in a structured database using some form of intelligence.
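To make the pattern concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps in plain Python. It does not use any Hadoop API; the sample records, the "category: payload" line format, and all function names are hypothetical and invented only for illustration.

```python
from collections import defaultdict

# Hypothetical raw stream: loosely formatted records (invented for illustration).
RAW_STREAM = [
    "search: hadoop tutorial",
    "photo: traffic cam 12",
    "search: oracle exadata",
    "photo: satellite pass 7",
    "search: hadoop summit",
]


def map_record(line):
    """Map step: classify one raw record and emit a (category, 1) pair."""
    category, _, _payload = line.partition(":")
    return category.strip(), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: collapse each group into one structured row (category -> count)."""
    return {key: sum(values) for key, values in groups.items()}


# The result is the kind of small, structured table a database could store.
structured = reduce_counts(shuffle(map_record(line) for line in RAW_STREAM))
for category, count in sorted(structured.items()):
    print(category, count)
```

On a real cluster the same three steps run in parallel across many machines; the sketch only shows the shape of the computation.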

Open source tools like Oozie, a workflow system for managing Hadoop jobs, including HDFS operations, are nice, but what is the business model? This is all the more frustrating when one sees the billions and billions of dollars in revenues and valuation at social-site companies like Facebook (recently valued at $24 billion), Netflix, Yahoo and Google. What about the enterprise, where most of the wealth in our society is created?

Cloudera and Karmasphere want to sell supported Hadoop distributions and developer tools. Their market is limited by the minimal sales coverage these companies have in enterprise settings.

So here is the other way around. IBM plans its own supported Hadoop distribution and presented at the Summit a do-it-yourself analytic tool based on Hadoop. It has “an insight engine, for allowing ad-hoc business insights for business users – at web scale. It allows access to embedded unstructured data, previously un-available to analyze”.

Most puzzling and conspicuous was the de facto absence of Oracle at Hadoop Summit 2010. If anyone from Oracle attended, it was probably in stealth mode :-)

Assume Oracle can productize a Hadoop-based analytic tool at web scale. They could then sell it as an add-on to all their enterprise database users. According to Gartner (2009), Oracle:

o Is #1 in worldwide RDBMS software market share by top five vendors
o Holds more market share than its four closest competitors combined
o Is #1 in total software revenue for Linux and Unix with 74.3 per cent and 60.7 per cent market share respectively

Assuming about $24B a year in total Oracle revenue, can you imagine a Hadoop product complementing the existing $10B a year in database income alone? Note this is a yearly amount; the installed database base built over the last five years should be at least $40B, and closer to $50B if that run rate held. With a 1% attach rate on a $50B base, Oracle could sell Hadoop web-scale analytic tools for $500 million per year, growing to $5 billion if the attach rate is 10%. What if the attach rate is 20%?
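The attach-rate arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not a forecast: the $50B installed base is an assumption (five years at roughly $10B a year), chosen because it reproduces the $500 million and $5 billion figures quoted above, and the function name is made up for illustration.

```python
# Assumption: installed database base of about $50B (five years at ~$10B a year).
INSTALLED_BASE_USD = 50e9


def attach_revenue(installed_base, attach_rate):
    """Revenue from selling an add-on to a fraction (attach_rate) of the installed base."""
    return installed_base * attach_rate


for rate in (0.01, 0.10, 0.20):
    revenue = attach_revenue(INSTALLED_BASE_USD, rate)
    print(f"{rate:.0%} attach rate -> ${revenue / 1e9:.1f}B")
```

Under these assumptions a 20% attach rate works out to about $10 billion. Note that revenue scales linearly with the attach rate, so everything depends on how fast that rate can actually grow.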

At that level, it would be the biggest money-making product using Hadoop technology outside the social networking industry.

There is simply no other product in Oracle's portfolio, IMO, that can provide this growth. Oracle has a Grid Engine team, recently acquired through the Sun acquisition; Sun Grid Engine was integrated with Hadoop in December 2009. A significant chunk of Oracle's Hadoop know-how comes from the Sun merger.

The first step is not engineering but customer research among their corporate database customers, to determine the minimum set of features those customers need and are enchanted with. Making a product that customers want, through astute customer research, has not been the focus of the Hadoop Summit developers so far.

References:

1. Hadoop Tutorial: http://thecloudtutorial.com/hadoop-tutorial.html
2. IBM BigSheets: http://www.slideshare.net/ydn/1-ibm-disruptiveapplicationwithhadoophadoopsummit2010
3. Oracle Grid Engine: http://www.oracle.com/us/products/tools/oracle-grid-engine-075549.html
4. Ahrono Associates: http://ahrono.com

Miha Ahronovitz

Comments

Ashwin Jayaprakash said…

Let's not forget their Coherence acquisition.

4:27 PM

the memories of a product manager said…

@Ashwin Can you elaborate on why the Tangosol (Coherence) product is relevant to Hadoop monetization?

8:16 AM

Anonymous said…

Oracle has a pretty potent business proposition in Exadata. It scales comparably to Hadoop and is more versatile. Could it potentially offer similar growth numbers to Oracle?

8:25 PM

the memories of a product manager said…

@abhishekrai It seems you did not distinguish between formatted and unformatted data. Databases have clear fields (name, address, phone, etc.). Unformatted data are streams like traffic photos, satellite photos, or web searches. Hadoop can MapReduce huge volumes of unformatted data and place the results in a database according to the search criteria. Hadoop is complementary to Exadata, not a substitute.

8:30 AM
