The Death of Read Replication
« previous entry | next entry »
Apr. 6th, 2008 | 03:12 pm
A number of months ago, possibly a year ago, I wrote an internal letter to the MySQL internal discuss list with the title of "The Death of Read Replication". Ever since then I've been getting pinged internally to publish my thoughts on this externally.
Here goes :)
Read replication is going to be in use for many years into the future. There are plenty of reasons to use it, and plenty of setups where it will make sense.
All of the scripting, management daemons, and ease of use scenarios will not solve its problems though, and I am finding that users have either moved away from it, or more often, have reduced their need for it.
A few reasons:
Latency is painful to manage.
Lots of servers means more head count (both disks and in numbers of people to manage it)
In web usage, the rule of thumb is to keep your query number under 7, for this reason you try make more out of the seven queries you get. You get more bang for your buck with cached objects.
For query count, you don't care about how many asynchronous queries you have to make, you care about how many synchronous queries you have to make.
In 2000 a number of us tried to build caches, and all of us failed except Brad. Brad a few years later came up with memcached. In a world where we all thought the network traffic would be too
expensive he proved everyone wrong. Livejournal used it. Then CNET used it. Etc...
Today you cannot name a big Web 2.0 website that does not use it, and among the non-web 2.0 sites its usage has spread.
For MySQL users, the concept is simple. Build objects from data stored in MySQL and push those objects into memcached. Have all webpages build from objects stored in memcached. Memcached is not the only game in town, but it certainly is the most predominant.
In the past object stores meant "all your data belongs to us". Today we use them as caches.
The model for object caches is simple, it is called read through caching. Everything you can turn into a read through cache object, is an object you can scale.
For sites backed by MySQL, updates to data are handled by pushing data into MySQL, and then having asynchronous systems build objects that are pushed into memcached (you can study these systems by look at Gearman, Hadoop, and Map/Reduce systems via Google). Jamie took my ghetto style runtask and built it out to being something serious for Slashdot somewhere in 2002.
Asynchronous job systems with cached object stores are the state of the art for building large scale websites.
MySQL" scaling users" come in three breeds for task systems.
Those who are deploying an open source solution, those who wrote their own, and those who are still trying to make it work with cron jobs.
What does this mean? Less database servers. Bringing down your load means you push off the load to another tier.
Is it worth the hassle to scale another tier?
It is if you can:
Lower the number of machines required
Lower the complexity of the setup
Use less electricity!
Getting rid of spinning disks is a good way to get rid of number three. The number of companies who are building solid state platforms is also growing.
Let me leave you with a thought. Recently I was at the SLAC large database forums, and one of the scientists pointed out, perhaps rightly, that all of the database vendors may find themselves made irrelevant by Hadoop. That is a hard thought to grasp, but I have been kicking it around in my head to think about what this means.
In one of the diagrams I drew for myself I asked "why do I need to go through MySQL at all... unless I just want it as a backup or for ad-hoc reporting?".
An observation Josh Berkus made internally was that databases are here because in the generic sense, they work well enough. Despite all of the fancy features we layer around them, they do a good job over a broad range of problems. This was an excellent observation.
Web 2.0 scaling is a classic problem. You can even say it is "Windows vs UNIX". UNIX means small parts that do one task being used to build up systems. Windows means monolithic solutions. There are advantages and disadvantages of both (though personally I am only interested in heterogenous solutions... I do not like having to keep all my eggs in one basket).
Web 2.0 is about picking the best of all solutions and gluing them together.
Here goes :)
Read replication is going to be in use for many years into the future. There are plenty of reasons to use it, and plenty of setups where it will make sense.
All of the scripting, management daemons, and ease of use scenarios will not solve its problems though, and I am finding that users have either moved away from it, or more often, have reduced their need for it.
A few reasons:
For query count, you don't care about how many asynchronous queries you have to make, you care about how many synchronous queries you have to make.
In 2000 a number of us tried to build caches, and all of us failed except Brad. Brad a few years later came up with memcached. In a world where we all thought the network traffic would be too
expensive he proved everyone wrong. Livejournal used it. Then CNET used it. Etc...
Today you cannot name a big Web 2.0 website that does not use it, and among the non-web 2.0 sites its usage has spread.
For MySQL users, the concept is simple. Build objects from data stored in MySQL and push those objects into memcached. Have all webpages build from objects stored in memcached. Memcached is not the only game in town, but it certainly is the most predominant.
In the past object stores meant "all your data belongs to us". Today we use them as caches.
The model for object caches is simple, it is called read through caching. Everything you can turn into a read through cache object, is an object you can scale.
For sites backed by MySQL, updates to data are handled by pushing data into MySQL, and then having asynchronous systems build objects that are pushed into memcached (you can study these systems by look at Gearman, Hadoop, and Map/Reduce systems via Google). Jamie took my ghetto style runtask and built it out to being something serious for Slashdot somewhere in 2002.
Asynchronous job systems with cached object stores are the state of the art for building large scale websites.
MySQL" scaling users" come in three breeds for task systems.
Those who are deploying an open source solution, those who wrote their own, and those who are still trying to make it work with cron jobs.
What does this mean? Less database servers. Bringing down your load means you push off the load to another tier.
Is it worth the hassle to scale another tier?
It is if you can:
Getting rid of spinning disks is a good way to get rid of number three. The number of companies who are building solid state platforms is also growing.
Let me leave you with a thought. Recently I was at the SLAC large database forums, and one of the scientists pointed out, perhaps rightly, that all of the database vendors may find themselves made irrelevant by Hadoop. That is a hard thought to grasp, but I have been kicking it around in my head to think about what this means.
In one of the diagrams I drew for myself I asked "why do I need to go through MySQL at all... unless I just want it as a backup or for ad-hoc reporting?".
An observation Josh Berkus made internally was that databases are here because in the generic sense, they work well enough. Despite all of the fancy features we layer around them, they do a good job over a broad range of problems. This was an excellent observation.
Web 2.0 scaling is a classic problem. You can even say it is "Windows vs UNIX". UNIX means small parts that do one task being used to build up systems. Windows means monolithic solutions. There are advantages and disadvantages of both (though personally I am only interested in heterogenous solutions... I do not like having to keep all my eggs in one basket).
Web 2.0 is about picking the best of all solutions and gluing them together.
(no subject)
from:
dormando
date: Apr. 6th, 2008 11:10 pm (UTC)
Link
Years go people would shrug that off when I told them... Too hard, inconsistent, data integrity, etc.
Now most people are on the bandwagon and it feels good. Next step is getting job systems to be more universal.
Reply | Thread
(no subject)
from:
krow
date: Apr. 6th, 2008 11:32 pm (UTC)
Link
Worker and C server are next. I'll make an announcement on the Gearman mailing list sometime in the next day or so.
Reply | Parent | Thread
(no subject)
from:
mike503
date: Apr. 7th, 2008 01:16 am (UTC)
Link
Exactly. I've been trying to get someone to make it as easy as possible to implement a reliable self-healing replication mechanism for MySQL.
Luckily so far I haven't grown large enough, and I am pushing everyone I know into a simple object get/set methodology that will make read through caching simpler and decrease any issues of data being set and the cache being out of date.
Although now I am also looking into CouchDB as a possibility. With built-in replication and a cocky "databases are okay, but they're trying to build redundancy and reliability into something that wasn't designed for it" attitude, I am very intrigued. Especially with it's simple REST-based usage model using HTTP and JSON standards (well, not sure if JSON is formally a "standard" but good enough) and being architected from the start to be shared-nothing and distributable... this could help solve my sleeplessness at night worrying about scaling...
Reply | Thread
(no subject)
from:
krow
date: Apr. 7th, 2008 01:34 am (UTC)
Link
Reply | Parent | Thread
(no subject)
from:
mike503
date: Apr. 7th, 2008 02:10 am (UTC)
Link
I'm hoping getting into all of these help me move to a more shared-nothing architecture :)
Reply | Parent | Thread
OCFS2
from:
sykosoft
date: Apr. 7th, 2008 05:31 am (UTC)
Link
Michael
Reply | Parent | Thread
Re: OCFS2
from:
mike503
date: Apr. 7th, 2008 07:24 am (UTC)
Link
Thanks!
Reply | Parent | Thread
(no subject)
from:
volodymir_k
date: Apr. 7th, 2008 10:48 am (UTC)
Link
Synchronous queries in the sense that client does not have to wait for response before sending another request. Ethernet card however does not send packets asynchronous, one network segment is shared by all connected hosts. Interesting what is typical ration of time-to-ethernet-send to overall roundtrip. E.g.when there would be 1e+4 network requests, does it matter that carrier is busy.
> The model for object caches is simple
And I am waiting when smart batching idea will come. I mean, if we have to query 2 levels of objects by userId (e.g. "accounts" - "transactions"), with memcached we have to send 2 requests. The first is for accountIds, the second is for transactions. We cannot reference objects until *the client* gets their IDs. I wonder if that would be common case to send "program" to memcached: "get refs value by object ID then get referenced objects".
Reply | Thread
NDB
from:
bg194
date: Apr. 18th, 2008 10:28 pm (UTC)
Link
(With pluggable storage could we back memcached with NDB?)
Reply | Thread
Re: NDB
from:
krow
date: Apr. 18th, 2008 11:01 pm (UTC)
Link
Reply | Parent | Thread
Re: NDB
from:
davidphillips
date: May. 3rd, 2008 12:44 am (UTC)
Link
Reply | Parent | Thread
Re: NDB
from:
krow
date: May. 3rd, 2008 01:52 am (UTC)
Link
Reply | Parent | Thread
Re: NDB
from:
bg194
date: May. 13th, 2008 03:27 pm (UTC)
Link
Reply | Parent | Thread
Re: NDB
from:
bg194
date: May. 13th, 2008 03:32 pm (UTC)
Link
Or junks. Junks is good.
Reply | Parent | Thread
Putting the database where it belongs
from:
natishalom
date: Apr. 26th, 2008 06:53 am (UTC)
Link
I've recently wrote a detailed post on Scaling Out My SQL where i compare the pure database clustering approach with a combination of In-Memory-Data-Grid as front end and database as the backend data store. In general i think that decoupling the application from the underlying database as well as keeping pure in-memory data store to manage our data as the front-end data store is the most scalable and efficient solution. There are plenty of references, many of them servers today mission critical applications in the financial world in which this model proved to be successful.
I was wondering if you ever considered something on that line?
Reply | Thread
Re: Putting the database where it belongs
from:
krow
date: May. 13th, 2008 10:57 pm (UTC)
Link
When you go to the store you find dozens of flavors of tomato sauce. Architecture is pretty much the same way... there are several solutions.
Reply | Parent | Thread