You are currently browsing the Michael Kimsal's weblog posts tagged: Database

Responsibility, liability and cloud computing

I was browsing High Scalability this morning and came across this article, which got me thinking about something I’ve not seen discussed yet when talking about ‘cloud computing’ in all its forms: liability.  I’m not talking about downtime, though that’s an issue in and of itself.  Looking specifically at S3, Google’s Bigtable, and other similar services, what legal ramifications do you face if someone steals customer data that you store in one of these hosted services?

I’ve not signed up for these hosted services yet, so I’ve not reviewed the specific Terms of Service.  I’d have a hard time believing they’ll take *any* legal responsibility for lost or compromised data, even if it’s demonstrated that there was a security breach (physical or otherwise) in their infrastructure.  Why would they?  Perhaps some case law will eventually be established, but I’d be fairly nervous about putting large amounts of customer data into anyone’s infrastructure that I couldn’t audit or control.

This is a tough call, really, because it’s fairly obvious that Google, Amazon, etc. have pretty smart people working for them, and are likely better at security than my organization might be.  That may be true, but it still doesn’t mean I can take a look at how they’re handling things, and it doesn’t really give me much of an excuse if my customers’ data gets compromised.  If anything, S3 and others will likely be a bigger target for more advanced criminals.  Yes, it’s likely more secure, but the payoff will be much bigger.

Is this an acceptable risk for you, putting customer data in the hands of others for whatever reason?  To some extent we all do it – I’m putting customer data on servers in data centers where I don’t always have direct physical access.  I can lock down the machines via software as much as possible, but it’s certainly not a perfect option.  Going the next step and moving your data into databases over which you basically only have read/write functionality seems like too big a step for me at this point, at least for sensitive data.

Are you using these ‘cloud’ services yet?  If so, what’s your take?

I'm currently working on a book for web freelancers, covering everything you need to know to get started or just get better. Want to stay updated? Sign up for my mailing list to get updates when the book is ready to be released!

Web Developer Freelancing Handbook

Optimization will become more important in the next few years

I was looking at a hosting provider this morning.  No particular reason – just saw an ad and clicked on it to poke around.  The pricing seemed reasonable, but because mosso is a ‘cloud computing’ sort of provider, I dug a bit further.  You’re charged $100/month for “10,000 compute cycles”.  That seemed a bit nebulous, so I dug into the FAQ.

For example 10,000 compute cycles would power:

  • about 2.1 million page views using a database-driven content management system
  • about 11 million page views of
  • about 25 million requests for a static 15KB image

and then:

What goes in to calculating a compute cycle?

Mostly, CPU processing time. However, compute cycles also account for the disk I/O your application’s operations consume. For example, a page with heavy database queries will consume more compute cycles in part due to the larger volume of disk I/O it requires.

So, for people who are used to “just get it done” coding (a valid approach, I might add), a move to cloud computing may ultimately end up costing more than being faced with CPU limits on one initial box.
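To put rough numbers on that, here is a back-of-the-envelope sketch in Python using only the figures quoted above ($100/month, 10,000 compute cycles, and the FAQ's page-view estimates). The function names are made up for illustration, and the assumption that usage beyond the plan is billed at the same linear rate is mine, not something stated in the FAQ.

```python
# Figures quoted in the post and the provider's FAQ.
PLAN_PRICE_USD = 100.0
PLAN_CYCLES = 10_000

# Requests that 10,000 compute cycles are said to power, by workload.
REQUESTS_PER_PLAN = {
    "database-driven CMS page": 2_100_000,
    "static 15KB image": 25_000_000,
}


def cycles_per_request(requests_per_plan: int) -> float:
    """Compute cycles consumed by a single request of this type."""
    return PLAN_CYCLES / requests_per_plan


def cost_per_million(requests_per_plan: int) -> float:
    """Dollars per one million requests of this type."""
    return PLAN_PRICE_USD / requests_per_plan * 1_000_000


def monthly_cost(requests: int, requests_per_plan: int) -> float:
    """ASSUMPTION: cycles beyond the plan are billed at the same linear rate."""
    return PLAN_PRICE_USD * requests / requests_per_plan


if __name__ == "__main__":
    for workload, per_plan in REQUESTS_PER_PLAN.items():
        print(f"{workload}: ${cost_per_million(per_plan):.2f} per million requests")
```

By these figures a database-driven page view works out to about $47.62 per million versus $4.00 per million for a static image, roughly a 12x difference, which is why heavy queries are the first place to look.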

Obviously, more traffic will increase your site usage, and you’ll eventually need more boxes (if you’re using a traditional server approach).  But by watching the usage stats on that single box, you’re more likely to be able to determine where the bottlenecks are.  You’ll likely get a better idea of what’s *causing* the box to use CPU time – is it user code or database calls?  Working at that level first, and optimizing there, might be a better approach before moving to cloud computing systems.
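One simple way to get at the "user code or database calls?" question is to time the two separately. The sketch below is only an illustration: the `timed` helper, the `handle_request` function and the `posts` table are all hypothetical, it uses an in-memory SQLite database as a stand-in for a real one, and it measures wall-clock time rather than true CPU time.

```python
import sqlite3
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated seconds per label, e.g. "database" vs "user code".
timings = defaultdict(float)


@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def handle_request(conn):
    """Hypothetical page handler: one query, then some rendering."""
    with timed("database"):
        rows = conn.execute("SELECT id, title FROM posts ORDER BY id").fetchall()
    with timed("user code"):
        html = "".join(f"<li>{title}</li>" for _id, title in rows)
    return html


# Stand-in data set; a real application would use its own connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts (title) VALUES (?)",
    [(f"Post {i}",) for i in range(1000)],
)

for _ in range(50):
    handle_request(conn)

total = sum(timings.values())
for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s ({seconds / total:.0%} of measured time)")
```

Even a crude split like this tells you whether to start with the queries or with the application code before the meter starts running on someone else's infrastructure.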

Because a system like mosso transparently expands to whatever your needs are, you may just get hit with a bigger bill than you bargained for without necessarily being aware of it beforehand.  It does seem they offer a ‘used compute cycles’ counter on their dashboard though, which would help monitor the situation.

This isn’t to pick on mosso – I like the look of their offerings.  It’s more to point out that optimizations, both code and database, will likely become ‘in fashion’ again as people move to cloud computing systems with an eye towards keeping costs down.  Database optimizations in particular will probably become key for many projects.

Related story:  many years ago I was working with a company that was growing ASP projects like gangbusters.  Scaling up meant throwing more servers into a farm.  Everything was built with MS Commerce Server.  Well, one day MS comes knocking and says “oh, you can’t use the dev versions of Commerce Server for production”.  We were paying for the ‘developer program’ where you get copies of everything for a yearly license fee.  I’m not sure whether we all knew at the time that we couldn’t use them in production, or just assumed that because they weren’t crippled, they were fine to use.  Either way, we needed to pay something like $8k for each server running a production copy of Commerce Server.  We had about 150 servers running at that point.  Next day the owner came in to the engineering meeting and said “we need to focus on optimization!”.  I’m not sure how many optimizations were done, but it became a big priority for a few weeks – even a 10% reduction in servers would have saved $120,000 at that point.
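The arithmetic behind that figure, sketched in Python with the approximate numbers from the story (about 150 servers at something like $8k each). The function names are just for illustration.

```python
LICENSE_COST_PER_SERVER_USD = 8_000  # "something like $8k" per server
SERVERS = 150                        # "about 150 servers"


def license_bill(servers: int, cost_per_server: int = LICENSE_COST_PER_SERVER_USD) -> int:
    """Total license cost for a farm of the given size."""
    return servers * cost_per_server


def savings_from_reduction(servers: int, percent: int,
                           cost_per_server: int = LICENSE_COST_PER_SERVER_USD) -> int:
    """License dollars saved by retiring `percent` percent of the servers."""
    removed = servers * percent // 100
    return license_bill(removed, cost_per_server)


if __name__ == "__main__":
    print(f"Full bill: ${license_bill(SERVERS):,}")
    print(f"Saved by a 10% cut: ${savings_from_reduction(SERVERS, 10):,}")
```

The full bill comes to $1,200,000, so a 10% cut (15 servers) is where the $120,000 comes from.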

Perhaps most people won’t be dealing with projects on such a large scale to start with, but part of the attraction of cloud computing is that you can scale up to those levels while leaving the infrastructure headache to someone else.  They’ll gladly provide it to you, for a fee.  Just make sure you really need what you’re being charged for, and check whether some easy optimizations could save you a bunch.
