Thursday, July 03, 2014

Make money with Entrepreneurial Performance Computing

A shocking discovery

Supercomputers and very large clusters have such limited access that at least 99.5% of the engineers and scientists in the world have no access to the applications that run on them. Powerful simulations are an essential tool for creating wealth in our society. Right now they are a grossly underutilized tool.

In this blog I propose to create a portal for performance computing applications, using recent developments in web services for science applications. This can only be successful as an entrepreneurial effort financed for profit, Silicon Valley and San Francisco style.

I also propose, again, to create incubators focused on performance computing, operating in all universities and research organizations currently funded by NSF, DOE, Health Department, Defense and National Security.

I hope this is the right way to reach more than 90% of researchers and engineers, up from a meager 0.5%.

I use this blog to create awareness, and I hope investors and interested institutions will contact us to help start.

Product and Services know-how in High Performance / High Throughput Computing

It is not here. It's lacking.

A search for the keyword "HPC" on this blog lists thirty-eight entries. One article is relevant to this title: Why HPC TOP500 never made any money and never will in its' present shape. Another relevant entry is New Ideas for starting any business seeded while attending supercomputing events.

In these blogs I discovered that any product or service in a startup must have a clear description of the people who want it. We have no idea who the people are who want performance computing, application by application.

Figure 1: How to make products in the 21st century

As the illustration above suggests, if we want to make people want our "things", we don't care about who they are. We assume they will flock in admiration to some giga-dinosaur from the TOP500 list of June 3 2014:
Tianhe-2, a supercomputer developed by China’s National University of Defense Technology, has retained its position as the world’s No. 1 system with a performance of 33.86 Pflop/s (quadrillions of calculations per second) on the Linpack benchmark, according to the 43rd edition of the twice-yearly TOP500 list of the world’s most powerful supercomputers.
This is an exotic animal people see in zoos, and no one wants it at home. I don't even know what it does exactly. We don't know how much it costs, but it is not for sale anyway. We don't know who the users are, what they are doing, or what their goal in life is.

Tianhe-2 has now led the TOP500 for three consecutive lists. As far as I know, this has never happened before. It seems research and industry have lost interest in creating supercomputers with more and more quadrillions of calculations per second, consuming power that costs hundreds of millions of dollars over their projected lifetime.

How many researchers (engineers and scientists) are there worldwide?

A recent discussion (November 2013) on ResearchGATE estimates about 10 to 12 million scientists worldwide:
In Polish universities and other scientific institutions ResearchGATE is not as popular as in other countries. So if we assume that the global number of people who have a profile on ResearchGATE medium reaches 35-40 %, it can be assumed, that the "global market" there are about 10-12 million people in the "scientific sector".
According to a February 2014 US Congressional Research Service report:
In 2012, there were 6.2 million scientists and engineers (as defined in this report) employed in the United States, accounting for 4.8% of total U.S. employment. 
56% are in computer-related occupations and 25% are engineers. As the US has between 40% and 50% of the world's researchers, the two data sources confirm each other.

What type of people use a supercomputer?

The common answer is:  Super People 
Actually, physicists, meteorologists, global warming people, etc. Anyone modeling complex behaviors that they can describe mathematically can use supercomputers to simulate the interaction of those behaviors. 
Also - engineers will be using them to model parts and whole cars without having to actually make them first. They can model a car's performance, strength, etc. on computers without having to actually make the car first. Design it on computers, model its performance, then build it. It used to be a whole lot of trial and error to get things to work right. Now they figure it all out on the computer beforehand, and build it straight from the computer specs...

How many users ("Super People") does TOP500 have?

These lists are not published for TOP500. I learned from my very knowledgeable friends: 
HPC systems in industry are normally running application packages from ISVs and these applications are far away from reaching the peak performance of the HPC systems they are running for. The reason is that solving a specific problem with an ISV package does in general not produce long full matrices, as they are required for LINPACK.
The focus is on  the FLOPs rates that can be reached for a given HPC system under optimal conditions. The practical utility is irrelevant at these levels

Let me assume that each supercomputer listed on TOP500 has 100 users. With 500 systems on the list, the total number of users could be 50,000 researchers and scientists who work directly with a supercomputer or enormous clusters.

This number is probably a maximum, because access to supercomputers or to large grids like Open Science Grid is restricted via policies and other arbitrary means. For example:
every six months, Lawrence Livermore National Laboratory gets around 20 to 25 proposals from different national laboratories and accepts around 10 of them. At any given time there are usually one to four projects using the supercomputer. Priority is given to whatever project is deemed most important

How many people use simulations?

Wolfgang Gentzsch's research, based on sales of workstations and PCs dedicated to simulation, estimates about 20 million simulation users worldwide. They are limited to their workstations or very small grids.

Access to supercomputer and large-cluster-based applications is not pleasant. See TACC XSEDE Manage Permissions with Access Control Lists. There is no way to pay for access, as this is government. Getting access as a private company is probably as complicated as paying to go on a NASA space mission.

99.5% of the researchers have no access to supercomputers

We have 11 million scientists per the ResearchGATE estimates and only 50,000 supercomputer users. This means roughly 0.5% can use these wonderful applications.

99.8% of simulation users don't have access to a supercomputer

This is a simple calculation using the estimated 20 million workstation simulation users versus the estimated 50,000 supercomputer users.

Even if we had 500,000 TOP500 users, the conclusions would be the same.
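
The two percentages above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the inputs are the estimates quoted in this post, 11 million is taken as the midpoint of the 10 to 12 million ResearchGATE estimate, and the function and variable names are made up for illustration.

```python
# Back-of-the-envelope shares, using only the estimates quoted in this post.
TOP500_SYSTEMS = 500           # systems on the TOP500 list
USERS_PER_SYSTEM = 100         # assumption made in this post
SCIENTISTS = 11_000_000        # midpoint of the 10-12 million ResearchGATE estimate
SIMULATION_USERS = 20_000_000  # estimate attributed to Wolfgang Gentzsch


def share_without_access(users_with_access: int, population: int) -> float:
    """Return the percentage of the population that has no access."""
    return 100.0 * (1 - users_with_access / population)


supercomputer_users = TOP500_SYSTEMS * USERS_PER_SYSTEM  # 50,000 users

# Second scenario: ten times more users, i.e. the 500,000 case.
for label, users in [("100 users per system", supercomputer_users),
                     ("ten times more users", 10 * supercomputer_users)]:
    print(label)
    print(f"  scientists without access:       "
          f"{share_without_access(users, SCIENTISTS):.2f}%")
    print(f"  simulation users without access: "
          f"{share_without_access(users, SIMULATION_USERS):.2f}%")
```

With 50,000 users this gives roughly 99.5% of scientists and 99.75% of simulation users without access. With 500,000 users the shares are still about 95.5% and 97.5%.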

Do we need simulations in industry?

Yes, we badly need them, but the simple calculations above show we cannot deliver the benefits of compute-intensive applications and simulations where they are most needed. Not in Academia, but in Industry.

According to the Chicago Tribune:
SWD is one of 10 small and midsize manufacturing firms offered the opportunity to work with the previously announced Illinois Manufacturing Lab during its initial launch.
The laboratory is an initiative of Gov. Pat Quinn propped up by $5 million in seed funding...
"This is a resource that typically small companies, even if they could pay for computing time, wouldn't know exactly how to use it," said Caralynn Nowinski, UI Labs executive director and chief operating officer. "Many companies don't even know how to use modeling and simulation in their design."
This is an exceptional initiative, but one swallow does not make a summer. We need a continuous flow of entrepreneurial initiatives.

Entrepreneurship in supercomputer and big cluster applications

I explained in a previous blog how incubators work. They may offer a student, or a group of students and/or researchers, $15,000 to $25,000 for 6% equity. You may read How to Evaluate an Offer from a Startup Incubator. This money is used to create a solid proposal for a venture capital firm or other investors.

While at Open Science Grid I brought up this idea. I did the same at University of Wisconsin CHTC. There are many projects funded by NSF or DOE with, say, $3 million. Some other projects have as much as $22 million in funding.

Assume there is an incubator designed to create businesses from performance computing applications. With a $3 million investment, one can fund between 150 and 200 startups. Assuming 90% fail, we still have a minimum of 15 successful startups, with a combined value many times over the initial $3M investment.

No one has listened yet. I keep my hope alive.
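
The incubator arithmetic can be checked with a short Python sketch. The $3 million budget, the $15,000 to $25,000 offers and the 90% failure rate come from the text. The $20,000 figure is an assumed mid-range offer, and the function names are illustrative only.

```python
BUDGET = 3_000_000      # incubator budget from the post, in dollars
FAILURE_PERCENT = 90    # share of startups assumed to fail


def startups_funded(budget: int, grant: int) -> int:
    """Number of startups that can each receive `grant` dollars."""
    return budget // grant


def survivors(funded: int, failure_percent: int) -> int:
    """Startups left after `failure_percent` of them fail (integer math)."""
    return funded * (100 - failure_percent) // 100


# $15,000 and $25,000 are the offers quoted above; $20,000 is an assumed midpoint.
for grant in (15_000, 20_000, 25_000):
    funded = startups_funded(BUDGET, grant)
    print(f"${grant:,} per startup: {funded} funded, "
          f"{survivors(funded, FAILURE_PERCENT)} survive")
```

At $15,000 per startup the budget covers 200 startups, and at $20,000 it covers 150, which matches the range above. At the top offer of $25,000 it covers 120, leaving 12 survivors at a 90% failure rate.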

I want to stress this point. In HPC the predominant belief among researchers is that no venture capitalist will ever select an HPC project when they can make so much money from any trivial social application that catches attention.

This has to change. When we buy a GPS device, we don't pay a monthly fee or a tax for GPS support. We only pay the price of the device. The US Government made GPS a free service worldwide.

Paraphrasing Paul Krugman of the New York Times, it seems our educational system for high performance computing does everything possible to take us back to "patrimonial capitalism", where the status quo matters more than effort and talent. This encourages students to remain employees for the rest of their lives. It thwarts any entrepreneurial thought. Students may believe there are no other places to do their work, and that wealth is inherited, not earned, at least in HTC/HPC.

If they want more freedom and money, probably the only way is to join a social network software company and forget about performance computing. One acquaintance said he wanted to erase the HPC experience from his resume, because it makes him unemployable.

Takeaways

  1. Virtually no engineers and scientists have significant access to high performance computing applications
  2. Most supercomputer engineers know only vaguely who the users of the applications they create are, or might be
  3. Scientists and engineers want to access supercomputer applications as easily as their desktops
  4. Research has shown they do not care what is behind the GUI (infrastructure, command line, scripts, and all the messy stuff)
  5. Mainstream industry and supercomputers barely intersect
  6. More human scientists per project are far more vital than more cores and other inanimate resources

Possible Solutions: Science on Demand

Recent developments make it possible to use web services to offer complex applications through web interfaces familiar to everyone who has a PC or Mac.

Since 2010, the NERSC Web Toolkit (NEWT) has brought High Performance Computing (HPC) to the web through easy-to-write web applications.

The Agave API, developed at TACC, advertises that with its "service you have access to over 600 of today's top plant biology applications on the latest HPC and Cloud systems".

These developments have been presented in academia and have produced a new term, Science as a Service, which appears in Ian Foster's presentation How on-demand computing can accelerate discovery.

All this work is ready to be commercialized for a much wider audience than just science. Academia has a different mission. NERSC NEWT rarely mentions Agave, and Agave rarely mentions NEWT. The two systems are not yet ready for mainstream.

Startups can make HPC and HTC applications mainstream. Exabyte.io in San Francisco, for example, decided to offer complex materials simulations "as-a-service" and "on-demand" for everyone. The company hopes to recruit at least 1,000 more scientists who have never used a supercomputer before.

The Performance Computing Portal

What I propose is this:

Figure 2: How to increase access to performance apps from 0.5% to perhaps 90%  for engineers, researchers and scientists

Each application on the portal will have:
  • A description of what it does
  • Expertise level
  • Who may benefit from using the app
  • What a user may achieve
  • A discussion group
  • A rating system (one to five stars)
  • Prices for on-demand usage
  • ... more, according to the feedback after launch
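
The list above maps naturally onto a small record per application. The Python sketch below is one possible shape, not a design for the actual portal: the class name, every field name, the hourly pricing unit and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class AppListing:
    """One application on the hypothetical portal (all names are illustrative)."""
    name: str
    description: str            # what the app does
    expertise_level: str        # e.g. "beginner" or "expert"
    audience: str               # who may benefit from using the app
    outcome: str                # what a user may achieve
    price_per_hour_usd: float   # on-demand price; the hourly unit is an assumption
    ratings: List[int] = field(default_factory=list)     # one to five stars
    discussion: List[str] = field(default_factory=list)  # discussion group posts

    def rate(self, stars: int) -> None:
        """Record a rating, rejecting anything outside one to five stars."""
        if not 1 <= stars <= 5:
            raise ValueError("rating must be between one and five stars")
        self.ratings.append(stars)

    def average_rating(self) -> Optional[float]:
        """Average star rating, or None if nobody has rated the app yet."""
        return mean(self.ratings) if self.ratings else None


# Example values are invented purely to show how a listing would be filled in.
listing = AppListing(
    name="Example materials simulation",
    description="Simulates properties of a material from its structure",
    expertise_level="beginner",
    audience="Materials engineers without cluster access",
    outcome="A property estimate without running a local cluster",
    price_per_hour_usd=5.0,
)
listing.rate(5)
listing.rate(4)
listing.discussion.append("How long does a typical run take?")
print(listing.name, listing.average_rating())
```

Keeping ratings and discussion on the listing itself mirrors the way the bullets describe one page per application. A real portal would more likely store them separately.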

Similar portals as inspiration

Have a look at Product Hunt "the best new products, every day"
Figure 3: Startups expose their applications on the Product Hunt portal. Investors watch, peers comment and vote.
Figure 4: The Comments Screen for the jobbox application
Note that we have some trivial applications (which does not mean they could not be successful), like Headspace, which teaches you how to meditate. The portal is like a farmers market for small startups or would-be startups.

Ubercloud Marketplace

It offers computing as a service for professional engineering and scientific simulation projects. It is an exciting development that creates capabilities for HPC in organizations that lack resources in house. But it does not offer the instant gratification that the Product Hunt portal does. I know, HPC is a bit more than just a short guide on how to meditate.

Interest from IaaS public providers

It is no coincidence that the winners in the 2014 Gartner Cloud IaaS magic quadrant are also the best infrastructures ready for HPC applications. Amazon Web Services is one.
Microsoft's acquisition of GreenButton shows how serious Microsoft Azure cloud IaaS is about performance computing applications. This creates new, unlimited opportunities for entrepreneurs in HPC / HTC.
Amazon, Azure and others will consolidate their leadership in IaaS cloud services for HPC/HTC applications. These applications are the most resource-intensive and have a significant potential for revenues.

Bottom Line

We are ready to boogie. I want to thank all the underpaid, not yet recognized, yet extraordinary HPC / HTC engineers, researchers and scientists who generated these ideas in my head. My entire career has been to act as a talent agent for exceptional underdogs. This is my message to them: don't leave. Your time has come.


