With consistent hashing we have data spread out over servers. The loss of a single server removes 1/N of the available cache.
Not bad, but it is also not perfect for everyone. Some users would rather spend more on hardware in exchange for fewer losses.
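To make the 1/N figure concrete, here is a minimal sketch of a hash ring in C. It is not libmemcached's code: the FNV-1a hash, the 100 points per server, and every name in it (ring_build, ring_lookup, keys_remapped) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_POINTS 1024

/* One point on the ring: a hash position and the server that owns it. */
typedef struct { uint32_t hash; int server; } ring_point;
typedef struct { ring_point points[MAX_POINTS]; size_t count; } ring;

/* FNV-1a, chosen only because it is short (an assumption, not
   necessarily the hash a real client would use). */
static uint32_t fnv1a(const char *s)
{
  uint32_t h = 2166136261u;
  while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
  return h;
}

/* Order points by hash, breaking ties by server id so the order is total. */
static int cmp_point(const void *a, const void *b)
{
  const ring_point *x = a, *y = b;
  if (x->hash != y->hash) return (x->hash > y->hash) - (x->hash < y->hash);
  return (x->server > y->server) - (x->server < y->server);
}

/* Build a ring for servers 0..n-1, skipping `dead` (pass -1 to keep all). */
static void ring_build(ring *r, int n, int points_per_server, int dead)
{
  char label[32];
  assert((size_t)n * (size_t)points_per_server <= MAX_POINTS);
  r->count = 0;
  for (int s = 0; s < n; s++) {
    if (s == dead) continue;
    for (int p = 0; p < points_per_server; p++) {
      snprintf(label, sizeof label, "server-%d-%d", s, p);
      r->points[r->count].hash = fnv1a(label);
      r->points[r->count].server = s;
      r->count++;
    }
  }
  qsort(r->points, r->count, sizeof r->points[0], cmp_point);
}

/* A key belongs to the first point at or after its hash, wrapping around. */
static int ring_lookup(const ring *r, const char *key)
{
  uint32_t h = fnv1a(key);
  size_t lo = 0, hi = r->count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (r->points[mid].hash < h) lo = mid + 1; else hi = mid;
  }
  return r->points[lo == r->count ? 0 : lo].server;
}

/* Count keys whose owner changes when `dead` is removed from n servers.
   Also reports how many keys the dead server owned beforehand. */
int keys_remapped(int n, int dead, int nkeys, int *owned_by_dead)
{
  static ring before, after;
  char key[32];
  int moved = 0;
  *owned_by_dead = 0;
  ring_build(&before, n, 100, -1);
  ring_build(&after, n, 100, dead);
  for (int i = 0; i < nkeys; i++) {
    snprintf(key, sizeof key, "key-%d", i);
    int a = ring_lookup(&before, key), b = ring_lookup(&after, key);
    if (a == dead) (*owned_by_dead)++;
    if (a != b) moved++;
  }
  return moved;
}
```

Calling keys_remapped(4, 0, 10000, &owned) returns the number of keys that changed servers. It always equals owned: only the keys that lived on the lost server move, and with evenly spread points that should be roughly a quarter of them. Those keys are simply gone from the cache, which is the loss replication is meant to cover.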
What needed to be done was to replicate the data to multiple nodes and handle node failure. This has been on the list for a while :)
Did I get to it? Nope.
Did someone else? Yep.
A few days ago I got a patch for this from a memcached user who needed it.
So now:
memcached_return enable_replication(memcached_st *memc)
{
  uint64_t value;

  /* Number of servers to replicate each key to */
  value= 2;

  return memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_REPLICAS, &value);
}
All you need to do now is set the number of servers you want to replicate to, and you are in business. There is still the problem of split brain, or of the first node returning errors without actually crashing. If users are particularly worried, an asynchronous fetch could be done from the replicas and the results compared. I'm not sure this matters, but I'll probably consider it.
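For anyone wondering what replication buys you on a read, here is a rough sketch with toy in-memory servers. None of this is the patch itself or the libmemcached API: the toy_server type, primary_for, replicated_set and replicated_get are hypothetical names, and I am assuming a replica count of R means the primary plus the next R servers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NUM_SERVERS 4
#define SLOTS 16

/* A toy in-memory "server"; `up` simulates a crash. */
typedef struct {
  bool up;
  char keys[SLOTS][32];
  char values[SLOTS][32];
  int used;
} toy_server;

static toy_server servers[NUM_SERVERS];

void cluster_reset(void)
{
  memset(servers, 0, sizeof servers);
  for (int i = 0; i < NUM_SERVERS; i++) servers[i].up = true;
}

void server_kill(int i) { servers[i].up = false; }

/* Stand-in for the real key-to-server mapping (assumption: simple modulo). */
int primary_for(const char *key)
{
  unsigned h = 0;
  while (*key) h = h * 31u + (unsigned char)*key++;
  return (int)(h % NUM_SERVERS);
}

static bool server_set(toy_server *s, const char *key, const char *value)
{
  if (!s->up) return false;
  for (int i = 0; i < s->used; i++)
    if (strcmp(s->keys[i], key) == 0) {
      snprintf(s->values[i], sizeof s->values[i], "%s", value);
      return true;
    }
  if (s->used == SLOTS) return false;
  snprintf(s->keys[s->used], sizeof s->keys[0], "%s", key);
  snprintf(s->values[s->used], sizeof s->values[0], "%s", value);
  s->used++;
  return true;
}

static const char *server_get(const toy_server *s, const char *key)
{
  if (!s->up) return NULL;
  for (int i = 0; i < s->used; i++)
    if (strcmp(s->keys[i], key) == 0) return s->values[i];
  return NULL;
}

/* Write to the primary plus `replicas` following servers.
   Returns how many copies were actually stored. */
int replicated_set(const char *key, const char *value, int replicas)
{
  assert(replicas >= 0 && replicas < NUM_SERVERS);
  int stored = 0, first = primary_for(key);
  for (int i = 0; i <= replicas; i++)
    if (server_set(&servers[(first + i) % NUM_SERVERS], key, value)) stored++;
  return stored;
}

/* Read from the primary, falling back to each replica in turn. */
const char *replicated_get(const char *key, int replicas)
{
  int first = primary_for(key);
  for (int i = 0; i <= replicas; i++) {
    const char *v = server_get(&servers[(first + i) % NUM_SERVERS], key);
    if (v) return v;
  }
  return NULL;
}
```

After cluster_reset(), a replicated_set("user:42", "alice", 2) stores three copies, and killing the primary with server_kill(primary_for("user:42")) still lets replicated_get find the value on a replica. With a replica count of 0 the same crash turns into a plain miss. Note that this read path trusts the first server that answers, so it does nothing for the case where the first node is up but wrong; that is where fetching from the replicas and comparing would come in.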
You can pull it from:
http://hg.tangent.org/libmemcached
hg update -C replication
I have it in a separate branch for the moment. 0.17 will be released within the next day, and I do not want this going out without more review.
Thanks to a cross-Atlantic flight and a weird sleep schedule, I finished this up tonight :)