James Governor's Monkchips

On Amazon, Capacity on Demand, MySQL in EC2, and Sun’s Opportunity (To Lose)


Last week I posted about fellow Enterprise Irregular Charlie Wood’s online RFP for more server capacity and Sun’s jump ball opportunity. In the back channel the next day I asked Charlie about his use of Amazon’s EC2, because as I understood it the service doesn’t fully support MySQL. His answer deserves wider distribution because it clarifies one current weakness in the Amazon offering as far as this startup is concerned.

The problem with EC2 is that it’s not persistent: when you boot a virtual machine, it reads its image from a hard drive somewhere and once it’s going it has its own “virtual” 160 GB drive, but it’s all in memory. If the image crashes or needs to be restarted, you lose any data you wrote to the virtual drive. S3 is available for persistent storage, but that’s file/object storage. Our app is built on a MySQL back-end, which isn’t a viable architecture (yet) for EC2. If someone (ahem, AMZN?) would build a MySQL storage engine that uses S3 it might be an ideal solution, but the latencies involved might be a deal-breaker. There have been rumors of a Simple Data Service but they might be based on a typo.
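To make Charlie's distinction concrete, here is a minimal sketch of the behaviour as he describes it. Nothing here is Amazon's API: `ObjectStore` and `Instance` are hypothetical in-memory stand-ins, and the file paths are invented for illustration. It only shows why writes to an instance's local disk are at risk when the instance is lost, while a copy pushed to an S3-style object store survives.

```python
class ObjectStore:
    """Hypothetical stand-in for an S3-style service: whole objects, put and get."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class Instance:
    """Hypothetical virtual machine whose local disk disappears with it."""

    def __init__(self, image: dict):
        # Every launch starts from a fresh copy of the boot image.
        self.disk = dict(image)

    def write(self, path: str, data: bytes) -> None:
        self.disk[path] = data


image = {"/etc/my.cnf": b"[mysqld]"}  # assumed image contents
store = ObjectStore()

vm = Instance(image)
vm.write("/var/lib/mysql/orders.db", b"order 1")
# Copy the file out to persistent object storage before anything goes wrong.
store.put("backups/orders.db", vm.disk["/var/lib/mysql/orders.db"])

# The instance is lost (not merely rebooted) and relaunched from its image.
vm = Instance(image)
print("/var/lib/mysql/orders.db" in vm.disk)  # the local write is gone
print(store.get("backups/orders.db"))         # the stored copy is still there
```

The sketch also hints at why this is awkward for a database: the store hands back whole objects, whereas a MySQL back-end expects a disk it can update in place, which is the gap Charlie is pointing at.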

When we talked to Marten Mickos, MySQL’s CEO, I asked him about being the “big database in the sky”. If you remember, he said that such a thing would be useful but that MySQL probably wasn’t the company to operate it. Since then, as Dan Farber described, he’s gotten more visionary, saying, “Google gives access to all data in entire world. We could do the same for structured data,” Mickos said. “We could open up all the world’s structured data to all the world’s developers, entrepreneurs and mashup artists. It’s the great database in the sky.” FWIW I posted a small blurb about this here.

Frankly, I don’t care who provides it. Hell, let me put a physical MySQL box or cluster behind an array of virtual EC2 app servers. Whatever. Just figure it out. This is what Sun should be doing with “utility computing”. It’s theirs to lose.

I wrote this post on a plane on my way to Sun’s annual industry analyst summit, so I will be able to find out more about what Sun is going to do about the Amazon On Demand challenge. Last year at the same conference Sun CTO Greg Papadopoulos ran a session that argued Web companies are going to run into some hairy scale problems that will be the natural province of companies like Sun. That is, Papadopoulos argued that as the web becomes more and more transactional, the loosely coupled and/or scale-out architectures of the current generation of Web Services and companies will begin to hit some walls. Yep – that old chestnut.

Papadopoulos may be right, but Sun’s problems are with the front rather than the back end. Sun’s Startup Essentials program requires that prospects make a formal application, and wait for qualification, before they can buy anything. This barrier to entry feels wrong to me. If I have the money to buy something, why the hell do I need permission to do so? What is the difference between a four-year-old and a six-year-old “startup”?

It reminds me of the annoyances of consumer web commerce scaled to a startup: you want all my personal details? But I have a valid credit card – all you need is authentication and a delivery address.

My landlord Chris tried to buy some toner the other day. The supplier wanted two utility bills to process the new customer. Naturally Chris went elsewhere. What is up with credit card companies – why aren’t they vouching for customers? At the moment they only seem to share data on credit risks, rather than helping smaller merchants and customers to do business easily using whitehat lists. Please don’t tell me it’s a data protection issue.

There are those out there, such as Microsoft’s Dare Obasanjo, who argue Web scale is far bigger than enterprise scale, and that enterprise architecture practices and skills are largely a waste of time, predicated on complexity for its own sake; that any serious Web company will do better. But anyone who has had the “pleasure” of trying to authenticate on Blogger (“We’re out of beta and ready to go”), or seen the recent downtime problems, knows that Google should perhaps have stuck with perpetual betas. If Web apps are so great what is the problem?

That’s the thing with data integrity: it’s just not easy, whether you’re talking about identities, transactions, or master data. Sloppy can be good, but it’s not always good. Sometimes structure is a good thing. The key word is perhaps transactional, which takes us back to database persistence and Amazon Web Services.

Before I sign off I want to make it very clear that Sun is not the only company threatened by Amazon’s model. Dell, HP, IBM, and Sun’s storage and server businesses are all going to be pressured by the fact their natural ally, the IT organisation, simply isn’t needed for lightweight provisioning. IBM coined the term On Demand but hasn’t delivered it – certainly not with service microgranularity.

Developer needs storage and server capacity. Developer calls online APIs for it. Developer delivers application to customer. Customer wins more business. Service provider bills developer. Developer bills customer. Developer provisions more capacity. 

Provisioning a new server needs to be as easy as making an API call. Can the systems companies step up to that?
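For what it's worth, the loop described above fits in a few lines. This is only a sketch under stated assumptions: `UtilityProvider`, its methods, and the ten-cents-an-hour rate are all hypothetical, not any vendor's actual API or pricing. The point is the shape of the interaction: one call to get a server, metered usage, and a bill at the end, with no application form in between.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class UtilityProvider:
    """Hypothetical in-memory stand-in for a capacity-on-demand service."""

    hourly_rate_cents: int = 10  # assumed price, purely illustrative
    _ids: count = field(default_factory=lambda: count(1))
    servers: dict = field(default_factory=dict)  # server id -> hours used

    def provision_server(self) -> str:
        """One call: no application form, no qualification step."""
        server_id = f"srv-{next(self._ids)}"
        self.servers[server_id] = 0
        return server_id

    def record_usage(self, server_id: str, hours: int) -> None:
        """Meter the hours a server has actually run."""
        self.servers[server_id] += hours

    def bill_cents(self) -> int:
        """Charge only for hours actually consumed."""
        return sum(self.servers.values()) * self.hourly_rate_cents


provider = UtilityProvider()

first = provider.provision_server()      # developer calls the API for capacity
provider.record_usage(first, hours=24)   # application runs for the customer

second = provider.provision_server()     # business grows, provision more
provider.record_usage(second, hours=12)

print(provider.bill_cents())             # service provider bills the developer
```

Whether the thing behind `provision_server` is a virtual machine or a physical box is invisible to the developer, which is rather the point.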

 

Update – Greg’s keynote today raised some similar issues.

disclaimers: IBM and Sun are Redmonk patrons, MySQL is a client, and Amazon’s search subsidiary A9 is also a client.

 


7 comments

  1. Great blog, James, and right on as far as I’m concerned.

    By the way, the “big database in the sky” that Marten is talking about (I saw his talk at the Web 2.0 Forum) is about being able to query across multiple web resources. I followed future dialogs with him on this, and I think there is general agreement that what he is describing is the Semantic Web, with RDF and SPARQL. Marten wasn’t that aware of the Semantic Web at the time he proposed the big database in the sky.

    I think that this is very different from providing a true structured, durable persistence service for a compute node in the compute cloud, which I agree sounds like a *very* interesting opportunity.

  2. James said:

    “There are those out there, such as Microsoft’s Dare Obasanjo, who argues Web scale is far bigger than enterprise scale, and that enterprise architecture practices and skills are largely a waste of time, predicated on complexity for its own sake; that any serious Web company will do better. But anyone that has had the “pleasure” of trying to authenticate on Blogger (”We’re out of beta and ready to go”), or seen the recent downtime problems, knows that Google should perhaps have stuck with perpetual betas. If Web apps are so great what is the problem?

    That’s the thing with data integrity- its just not easy, whether you’re talking about identities, transactions, or master data. Sloppy can be good, but its not always good. Sometimes structure is a good thing. The key word is perhaps transactional, which takes us back to database persistence and Amazon Web Services. ”

    Perhaps an RDBMS is just not the right solution to deploy in utility compute scenarios. After all, who ever heard of dynamically provisioned databases? How often do we ever move them around between machines?

    Could it not be argued that RDBMS’en are linked to the non-web, centralization dominated world of the enterprise data-centre?

    Bosworth would seem to think so (at least a little bit):

    http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=337&page=5

    Maybe we cram far too much into an RDBMS because it’s an easy well-understood option rather than because it’s suitable for the task? One could argue that enterprise should be allowed to continue to do this but it might be that if they wish to exploit these new models they’ll have to give up the RDBMS drug.

    It’s not like we don’t have other ways to achieve the equivalent level of transactional integrity that we get with an RDBMS……

  3. […] …..has been worshipped for a long time but there are various barbarian enclaves that are not ready to kneel before him. […]

  4. The statement made by Charlie Wood regarding restarting an EC2 instance resulting in losing your data is incorrect. I restart my instances on a fairly regular basis (mid-stream of doing a lot of development and testing of a new product which sits on top of a built-from-the-kernel-up Linux-based instance), and the instance is maintained, as-is, data intact.

    If I shut down my instance without first saving its state, then yes, the data is lost. But saving its state and storing it on S3 is a straightforward process. The weakness, of course, is the inability to mount S3 as a virtual drive, but there are several external solutions, including a FUSE-based solution that I am involved with, that help make this process pretty straightforward, so it’s not a problem without remedy.

  5. […] James Governor’s Monkchips » On Amazon, Capacity on Demand, MySQL in EC2, and Sun’s Opportunity (To Lose) (tags: amazon onDemand SAAS) […]

  6. I’m glad you mentioned web-scale vs. enterprise-scale. For software to work on a web-scale, it requires massive simplification, something that many legacy enterprise applications have not come to terms with.

    I’ve seen enterprise software with security rules that require cross joins across 3 or 4 tables. This will simply not do for web-scale software. (I’d contend that rules of this kind are potentially so convoluted that only one or two people at an organisation truly understand how they operate.)

  7. […] distribution, and all the other boring issues. The X that’s missing in the equation is a database in the sky (and persistent storage for EC2 instances, but that’s apparently in the works…). […]
