Log in

No account? Create an account

The Death of Read Replication

« previous entry | next entry »
Apr. 6th, 2008 | 03:12 pm

A number of months ago, possibly a year ago, I wrote an internal letter to the MySQL internal discuss list with the title of "The Death of Read Replication". Ever since then I've been getting pinged internally to publish my thoughts on this externally.

Here goes :)

Read replication is going to be in use for many years into the future. There are plenty of reasons to use it, and plenty of setups where it will make sense.

All of the scripting, management daemons, and ease of use scenarios will not solve its problems though, and I am finding that users have either moved away from it, or more often, have reduced their need for it.

A few reasons:
  • Latency is painful to manage.
  • Lots of servers means more head count (both disks and in numbers of people to manage it)
  • In web usage, the rule of thumb is to keep your query number under 7, for this reason you try make more out of the seven queries you get. You get more bang for your buck with cached objects.

    For query count, you don't care about how many asynchronous queries you have to make, you care about how many synchronous queries you have to make.

    In 2000 a number of us tried to build caches, and all of us failed except Brad. Brad a few years later came up with memcached. In a world where we all thought the network traffic would be too
    expensive he proved everyone wrong. Livejournal used it. Then CNET used it. Etc...

    Today you cannot name a big Web 2.0 website that does not use it, and among the non-web 2.0 sites its usage has spread.

    For MySQL users, the concept is simple. Build objects from data stored in MySQL and push those objects into memcached. Have all webpages build from objects stored in memcached. Memcached is not the only game in town, but it certainly is the most predominant.

    In the past object stores meant "all your data belongs to us". Today we use them as caches.

    The model for object caches is simple, it is called read through caching. Everything you can turn into a read through cache object, is an object you can scale.

    For sites backed by MySQL, updates to data are handled by pushing data into MySQL, and then having asynchronous systems build objects that are pushed into memcached (you can study these systems by look at Gearman, Hadoop, and Map/Reduce systems via Google). Jamie took my ghetto style runtask and built it out to being something serious for Slashdot somewhere in 2002.

    Asynchronous job systems with cached object stores are the state of the art for building large scale websites.

    MySQL" scaling users" come in three breeds for task systems.

    Those who are deploying an open source solution, those who wrote their own, and those who are still trying to make it work with cron jobs.

    What does this mean? Less database servers. Bringing down your load means you push off the load to another tier.

    Is it worth the hassle to scale another tier?

    It is if you can:
  • Lower the number of machines required
  • Lower the complexity of the setup
  • Use less electricity!

    Getting rid of spinning disks is a good way to get rid of number three. The number of companies who are building solid state platforms is also growing.

    Let me leave you with a thought. Recently I was at the SLAC large database forums, and one of the scientists pointed out, perhaps rightly, that all of the database vendors may find themselves made irrelevant by Hadoop. That is a hard thought to grasp, but I have been kicking it around in my head to think about what this means.

    In one of the diagrams I drew for myself I asked "why do I need to go through MySQL at all... unless I just want it as a backup or for ad-hoc reporting?".

    An observation Josh Berkus made internally was that databases are here because in the generic sense, they work well enough. Despite all of the fancy features we layer around them, they do a good job over a broad range of problems. This was an excellent observation.

    Web 2.0 scaling is a classic problem. You can even say it is "Windows vs UNIX". UNIX means small parts that do one task being used to build up systems. Windows means monolithic solutions. There are advantages and disadvantages of both (though personally I am only interested in heterogenous solutions... I do not like having to keep all my eggs in one basket).

    Web 2.0 is about picking the best of all solutions and gluing them together.
  • Link | Leave a comment |

    Comments {15}

    (no subject)

    from: mike503
    date: Apr. 7th, 2008 01:16 am (UTC)

    "All of the scripting, management daemons, and ease of use scenarios will not solve its problems though"

    Exactly. I've been trying to get someone to make it as easy as possible to implement a reliable self-healing replication mechanism for MySQL.

    Luckily so far I haven't grown large enough, and I am pushing everyone I know into a simple object get/set methodology that will make read through caching simpler and decrease any issues of data being set and the cache being out of date.

    Although now I am also looking into CouchDB as a possibility. With built-in replication and a cocky "databases are okay, but they're trying to build redundancy and reliability into something that wasn't designed for it" attitude, I am very intrigued. Especially with it's simple REST-based usage model using HTTP and JSON standards (well, not sure if JSON is formally a "standard" but good enough) and being architected from the start to be shared-nothing and distributable... this could help solve my sleeplessness at night worrying about scaling...

    Reply | Thread

    Brian "Krow" Aker

    (no subject)

    from: krow
    date: Apr. 7th, 2008 01:34 am (UTC)

    CouchDB is one of the coolest things I have seen conceptually come up in the last year. I think that when Damien hit on the JSON stuff he found the niche that needed to be filled.

    Reply | Parent | Thread

    (no subject)

    from: mike503
    date: Apr. 7th, 2008 02:10 am (UTC)

    Yeah, right now I am packaging nginx and couchdb .debs to install on my servers and give them a go. Already moved my storage to an iSCSI solution (although my only two filesystem choices were GFS and OCFS2... I went with OCFS2, GFS seems to be a nightmare to configure...)

    I'm hoping getting into all of these help me move to a more shared-nothing architecture :)

    Reply | Parent | Thread


    from: sykosoft
    date: Apr. 7th, 2008 05:31 am (UTC)

    We use ocfs2 in production on a +$100M revenue e-commerce/social network site, and it works rather well. The database backend is mysql5 master-master, with mulitple mysql instance per database server, and memcached on all of the web servers. There are SOME caveats with ocfs2 (seriously watch out for the 32000 directory limit).


    Reply | Parent | Thread

    Re: OCFS2

    from: mike503
    date: Apr. 7th, 2008 07:24 am (UTC)

    Michael: Thanks for the input. Would you mind if I talked to you directly? I can't figure out your email from here... mine is mike503 AT gmail.com. I'd like to pick your brain since you've got it running and sanity check myself :)


    Reply | Parent | Thread