PostgreSQL to Scale to 1 Biilllliooonnnn Users, Dr Evil would be proud

Apr. 6th, 2008 | 02:44 pm

For reference:
http://highscalability.com/skype-plans-postgresql-scale-1-billion-users

Here are my observations on the state of database usage in Web 2.0:
  • All major Web 2.0 sites now use object caching (of one type or another).
  • Sharding and, more recently, proxy-style solutions are becoming commodity. They are everywhere.
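For anyone who has not seen the two patterns above side by side, here is a minimal sketch of how they fit together. It is a toy, not how any particular site does it: the `ShardedStore` class and its method names are made up for illustration, plain dicts stand in for the database servers and for an object cache such as Memcached, and the routing is a simple hash modulo the shard count.

```python
import hashlib


class ShardedStore:
    """Toy key/value store: hash-routed shards behind a read-through cache."""

    def __init__(self, shard_count):
        # Assumption: each dict stands in for one database server.
        self.shards = [{} for _ in range(shard_count)]
        # Assumption: this dict stands in for an object cache.
        self.cache = {}
        self.cache_hits = 0

    def shard_for(self, key):
        # A stable hash, so the same key always routes to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        # Write to the owning shard, then drop any stale cached copy.
        self.shards[self.shard_for(key)][key] = value
        self.cache.pop(key, None)

    def get(self, key):
        # Serve from the cache when possible; otherwise read the shard
        # and remember the result for the next reader.
        if key in self.cache:
            self.cache_hits += 1
            return self.cache[key]
        value = self.shards[self.shard_for(key)].get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

Nothing in the sketch depends on which database sits behind each shard, which is the point of the rest of this entry.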

    What does this mean?

    Replication is dead except for replicating for "application" needs.

    Good News :)

    For MySQL it encourages multiple engines. For Postgres I suspect its flexible index design will be useful. The "I replicated over here for a backup, or to run reports..." use case is still happening a lot. Multi-master replication is one way to achieve high availability (DRBD on the low end... you will go broke trying to deploy it with too many nodes). The problem with multi-master is the users, or the developers.

    We could blame the users for not understanding it and deploying it incorrectly, or we could blame the developers for not making it dirt simple to set up.

    Hey! We can blame the marketing guys for overhyping it!

    No matter who is to blame, not everyone can keep it running. Plenty of people do though. This blog entry is being hosted on a site that has had it working for a long time.

    Bad News :(

    The above-mentioned technologies now work for any database. So you can pick your database and scale it. Picking an open source database is now just picking for reliability, since that is the one thing that open source databases have in common right now... that and a plethora of drivers for almost any situation.

    What does this mean for someone trying to promote an open source database today? It means that there are only two large differentiators:

  • Online features (aka making schema changes, modifying tables...)

  • Scaling on multi-core/multi-way machines.

    Both of the above are done horribly today by open source databases (and not all of the commercial competitors do well either). Online features are a ways off, I suspect, in the open source world. With proxy designs you can build around online features, but at the end of the day... they do not exist, and you are building around problems.

    And backup? I know someone out there is thinking "backup".

    Backup is irrelevant for those of you who care about this discussion. LVM/ZFS snapshots are the rule of the land. With Apple moving to ZFS this will be built into the OS (which makes Apple start to look like a viable platform for servers).

    BTW I will be at the MySQL User's Conference next week. Most likely I will be putting together a BOF one night on this topic (and we have a Hackathon planned for Memcached another night). I will also be talking on the future of databases at Web 2.0 Expo in a couple of weeks.

    Comments {6}

    Yazz D. Atlas

    backups

    from: aaton
    date: Apr. 7th, 2008 06:38 am (UTC)

    you wouldn't be talking directly to me now would you :-)

    Brian "Krow" Aker

    Re: backups

    from: krow
    date: Apr. 7th, 2008 06:57 am (UTC)

    Nope :)

    I do know you and your systems fall into this category. It is best practice at this point.
