O’Reilly Radar: Web 2.0 and Databases

In prep for his keynote at the MySQL User Conference, O’Reilly asked a bunch of people running what he calls Web 2.0 sites how they use databases. I’ve only read two installments so far, and the series isn’t very deep, but it’s incredibly insightful. A couple of key quotes from Ian Wilkes of Linden Lab:

Like everybody else, we started with One Database All Hail The Central Database, and have subsequently been forced into clustering. However, we’ve eschewed any of the general purpose cluster technologies (mysql cluster, various replication schemes) in favor of explicit data partitioning.

Web 2.0 applications will require more horsepower with less money than One Database or his big brother One Cluster All Hail The Central Cluster will offer. (After all, a 64-way Mysql Cluster installation is just the budget-friendly version of a Sun E-10000.) Unfortunately, this seems to be the minority view, at least if the dearth of automated db provisioning tools is any indication.
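
To make “explicit data partitioning” concrete, here is a minimal sketch of the general idea. It is mine, not Linden Lab’s: the quote doesn’t say how they actually split their data, and the host names and the `shard_for` function are made up for illustration. The application itself decides which database owns a given key, instead of leaving that to a cluster layer.

```python
import hashlib

# Hypothetical shard list; these host names are placeholders, not real servers.
SHARDS = [
    "db-a.example.internal",
    "db-b.example.internal",
    "db-c.example.internal",
]


def shard_for(key, shards=SHARDS):
    """Return the database host that owns `key`.

    Uses a stable hash (SHA-1) rather than Python's built-in hash(), which
    is randomized per process for strings, so every application server
    maps the same key to the same shard.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same key always routes to the same host.
host = shard_for("user:12345")
```

One caveat with plain modulo hashing: changing the number of shards remaps most keys, so a real setup would more likely keep an explicit lookup table from key ranges (or individual keys) to hosts.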

And what he means by “automated db provisioning” (in the comments):

I’m talking about tools for setting up new mysql instances – it gets pretty error-prone when replication is involved. As a first pass, I want to be able to do this sort of thing trivially:

Machine X, which has a blank mysql installation, becomes a slave of master A.
Machine Y, which was a slave of master A, becomes master B.
Machine Z, which was master C, becomes a slave of master A.

It’s difficult now because there are multiple manual steps, which often have long waits for data dumps/imports between them, and which can nuke you if you get them wrong. Some people have solved this already but generally through a series of scripts which they haven’t released.

If this stuff were a modular toolset, a higher-order routine could use it to say “These 12 machines are database spares. Take 8 of them and make empty inventory instances.”
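
Here is a rough sketch of what the bookkeeping side of such a toolset might look like. Everything in it is hypothetical: the `Machine` and `Fleet` names, the role strings, and the `purpose` label are my own inventions, and it only tracks state in memory. The hard parts he describes (dumps, imports, pointing replication at the right master) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    name: str
    role: str = "blank"            # "blank", "master", or "slave"
    master: Optional[str] = None   # machine this one replicates from, if any
    purpose: Optional[str] = None  # free-form label, e.g. "inventory"


class Fleet:
    """In-memory record of which machine plays which role (illustrative only)."""

    def __init__(self, names):
        self.machines = {name: Machine(name) for name in names}

    def make_master(self, name, purpose=None):
        m = self.machines[name]
        m.role, m.master, m.purpose = "master", None, purpose

    def make_slave(self, name, master_name):
        if name == master_name:
            raise ValueError("a machine cannot replicate from itself")
        if self.machines[master_name].role != "master":
            raise ValueError(f"{master_name} is not a master")
        # Note: slaves of a demoted master are not re-pointed in this sketch.
        m = self.machines[name]
        m.role, m.master = "slave", master_name

    def allocate_spares(self, count, purpose):
        """Turn `count` blank machines into empty masters for `purpose`."""
        spares = [m for m in self.machines.values() if m.role == "blank"]
        if len(spares) < count:
            raise ValueError(f"only {len(spares)} spares, need {count}")
        chosen = spares[:count]
        for m in chosen:
            self.make_master(m.name, purpose=purpose)
        return [m.name for m in chosen]


# Starting state: A and Z are masters, Y is a slave of A, X and 12 spares are blank.
fleet = Fleet(["A", "X", "Y", "Z"] + [f"spare{i}" for i in range(12)])
fleet.make_master("A")
fleet.make_master("Z")
fleet.make_slave("Y", "A")

# The three transitions from the quote.
fleet.make_slave("X", "A")   # blank machine becomes a slave of A
fleet.make_master("Y")       # former slave of A becomes a master
fleet.make_slave("Z", "A")   # former master becomes a slave of A

# "These 12 machines are database spares. Take 8 of them..."
picked = fleet.allocate_spares(8, purpose="inventory")
```

The point of keeping the transitions as small separate methods is that the higher-order routine (`allocate_spares` here) can be built out of them, which is the modularity he is asking for.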
