Drizzle, libuuid, Sometimes "other" is better...

Oct. 29th, 2008 | 03:05 pm

One of the stated goals of the Drizzle project is to "reuse many eyeballs". I dislike "Not Invented Here"; it breaks one of my primary rules, which is that "all engineers should be lazy".

By lazy I don't mean "don't do your work". Being an engineer means that you build stuff. If you aren't building stuff, then you are not an engineer.

Being lazy means that you reuse other people's work as much as possible. Skip re-inventing the wheel.

Some time ago MySQL introduced a uuid() function into the server. It gives you an effectively unlimited supply of unique keys, at the cost of a large footprint in your indexes. It is a trade-off, but I find people are willing to make it.
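To make the footprint trade-off concrete, here is a rough sketch in Python (not Drizzle or MySQL code). It assumes the key is a version 1 UUID, which is what MySQL's uuid() returns, and compares its 36-character text form against a packed 16-byte form and an 8-byte integer key. The row count and the helper name index_key_bytes are made up for illustration, and the arithmetic ignores per-entry index overhead.

```python
import uuid

key = uuid.uuid1()            # version 1 (time-based) UUID
text_form = str(key)          # canonical hex-and-dashes form, 36 characters
binary_form = key.bytes       # the same value packed into 16 bytes
BIGINT_BYTES = 8              # size of a 64-bit integer key, for comparison


def index_key_bytes(rows, key_size):
    """Raw bytes spent on key values alone (ignores index overhead)."""
    return rows * key_size


ROWS = 10_000_000  # hypothetical table size, chosen only for illustration
for label, size in [("uuid as text", len(text_form)),
                    ("uuid as binary", len(binary_form)),
                    ("64-bit integer", BIGINT_BYTES)]:
    print(f"{label:>15}: {index_key_bytes(ROWS, size) / 1e6:.0f} MB")
```

Every secondary index that carries the key pays this cost again, which is why the text form hurts more than the raw numbers suggest.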

What was the problem?

We wrote our own UUID function instead of just inheriting the one that most systems provide. What does this lead to?

Code that only a few eyeballs ever looked at (and we have an active debate on whether its startup is thread safe or not).

We decided to look at this recently as an exercise in "was there a better choice?". For this we picked the libuuid code that comes with Linux and OS X distributions.

The end result?

The libuuid code was faster.

Not by a lot, but the performance did show up, especially on multi-core hardware.

Is this a surprise? Not entirely. I would hope that code which is used by many people would turn out better than code that only a few have looked at.

I'm attaching the end results. The first run incremented the thread count while dividing the load out among clients as threads increased. The second run shows increasing work as threads increase. All work was done using our default engine (which is InnoDB). I used one of my spare 8 core systems. When I get the chance I will look at reproducing it on something larger.
Picture 5.png

Picture 4.png


Comments {6}

Re: hm......

from: anonymous
date: Oct. 30th, 2008 01:51 am (UTC)

Also works better if you're on a distributed/cloud DB and just using Drizzle as a storage node :)

You can compute the hash on the client and then route the query to the correct Drizzle node.

You can truncate the hashcode if you can accept more collisions or just store the whole thing.....
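A minimal sketch of that idea in Python, under some loud assumptions: the comment names no hash function, so SHA-1 is picked here arbitrarily; the node names are hypothetical; and plain modulo placement stands in for whatever routing scheme a real deployment would use.

```python
import hashlib

# Hypothetical storage nodes; a real deployment would load these from config.
NODES = ["drizzle-node-0", "drizzle-node-1", "drizzle-node-2"]


def key_hash(value, truncate_to=None):
    """Hash the key on the client; optionally keep only the leading bytes."""
    digest = hashlib.sha1(value).digest()  # 20 bytes; SHA-1 is an assumption
    return digest if truncate_to is None else digest[:truncate_to]


def route(hashed, nodes=NODES):
    """Pick a node by treating the hash as a big integer modulo node count."""
    return nodes[int.from_bytes(hashed, "big") % len(nodes)]


full = key_hash(b"user:42")
short = key_hash(b"user:42", truncate_to=8)  # smaller key, more collisions
print(len(full), route(full))
print(len(short), route(short))
```

Truncating shrinks the stored key but raises the odds that two different values share a hash, so how far to cut depends on how many collisions you can accept.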

BTW. This is a feature I've wanted for a long time. The ability to store data as binary but use an escaped encoding when writing queries and printing results.

Base64+filesafe would be ideal for this.

SELECTing binary data on the console isn't always pretty. :)
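As a sketch of what that could look like from the client side, Python's standard library already has a URL- and filename-safe Base64 alphabet. The helper names below are hypothetical, and nothing here is an existing Drizzle feature.

```python
import base64


def encode_for_display(raw):
    """Render raw bytes with the URL/filename-safe Base64 alphabet."""
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_from_query(text):
    """Turn the escaped form back into the raw bytes that get stored."""
    return base64.urlsafe_b64decode(text.encode("ascii"))


raw = bytes([0xFB, 0xFF, 0x00, 0x3E])   # bytes that would not print cleanly
shown = encode_for_display(raw)
print(shown)                            # safe to paste into a console or query
assert decode_from_query(shown) == raw  # the round trip loses nothing
```

The safe alphabet swaps "+" and "/" for "-" and "_", so the escaped value can sit in a file name or URL without further quoting.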


Brian "Krow" Aker

Re: hm......

from: krow
date: Oct. 30th, 2008 02:00 am (UTC)

At some point in the very near future we will have a built-in UUID type. It will store as 16 bytes, but will display properly.
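Roughly what "store as 16 bytes, display properly" means, sketched with Python's uuid module. This is only an illustration of the two representations, not the Drizzle type itself; the helper names are made up.

```python
import uuid


def uuid_to_storage(text):
    """Pack the canonical 36-character form into 16 raw bytes."""
    return uuid.UUID(text).bytes


def uuid_to_display(raw):
    """Expand 16 stored bytes back into the canonical text form."""
    return str(uuid.UUID(bytes=raw))


shown = "6ccd780c-baba-1026-9564-0040f4311e29"
stored = uuid_to_storage(shown)
print(len(stored), uuid_to_display(stored))
```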
