Benchmarks, Comparing Drizzle to others

Apr. 15th, 2009 | 05:21 pm

"I was surprised for example to hear from Jay that the Drizzle team should not compare its performance to MySQL only to Drizzle itself"

This quote came from an email today asking me why we, as in the Drizzle project, have not been publishing comparison benchmarks.

First of all, this is not a Sun request; it was a request from me to the other core members of the project.

Let me explain...

I don't think projects can ever objectively compare themselves to other projects. Look at all of the heat that went on for years between Postgres and MySQL. I think we, the Drizzle Project, are better off publishing code and helping others create benchmarks, but avoiding doing the comparisons ourselves.

We are inherently biased.

Among the first things that went up on the Drizzle wiki were the pages labeled "this is not what we are". We wanted to create a place where people could fill in that sort of information and debate it.

We should be facing the truth about what we do, and listening outwardly for feedback on how we are doing.

When it comes to benchmarks, I really believe we must stay out of the business of doing anything other than regression testing. I welcome anyone and everyone else to do benchmarks.

I ask for groups to work with us so that we can improve upon what is found (and trust me... there is still plenty for us to find).

So the lack of comparison benchmarks has nothing to do with Sun; it has more to do with my own requests.

Drizzle as a project is about providing a new Kernel for MySQL, and I really believe others can do the benchmarks better than we can.

Those not attached to the core project come off as a legitimate source of information in a way that we simply cannot, and should not, try to be.

On that same note... the Drizzle Project doesn't produce an OLTP engine. At some point we will get into the game of publicly comparing engines, and when we do we will not be pulling any punches. We will be working with everyone we test with to make sure our final reports are as accurate as possible. I feel pretty free to compare libraries/plugins/engines that we make use of in Drizzle.

It is only through public scrutiny that we can really provide distributions that bundle Drizzle with solid decisions on what to package.


Comments {6}


(no subject)

from: tanjent
date: Apr. 16th, 2009 05:25 am (UTC)

"I really believe we must stay out of the business of doing anything other then regression testing"

That seems so alien to me - I can't imagine releasing code without having a thorough understanding of its performance on at least a "X% of the CPU is spent on task Y, Z% of the CPU is spent on task Q" level, and quite often I dig down to the cycles-and-cachelines level.

One of the projects I maintain at MS is a suite of profiling tools for doing game performance analysis, which has been integrated into well over a dozen shipped titles so far. For the kind of work I do, it's crucial. Working without it would be like operating blindfolded.


Brian "Krow" Aker

(no subject)

from: krow
date: Apr. 16th, 2009 04:35 pm (UTC)

We do lots of internal benchmarking and we do compare ourselves to others. For instance we have stack traces showing what a query does and the call cost for each operation.

What we do not do is publish these to outside audiences. We leave that in the hands of others.


Sounds fair, however...

from: mingenthron
date: Apr. 16th, 2009 09:14 pm (UTC)

Benchmarks are done for multiple reasons. The best are the ones that aim to be realistic and try to give a metric that helps someone determine the suitability or capability of something (i.e. a component, a system, a configuration, a maximum workload) for a given task. Drizzle seems to be looking to answer whether or not it's a more capable kernel, which is good.

Then there are all of the other benchmarks... which are done for marketing/advocacy. The challenge here is that others then rush to compare themselves and give a better result with a metric. I think that, like it or not, Drizzle will be inviting these comparisons.

So the challenge is to be clear about not only the metric, but also how the metric was produced. It'd be good to give guidelines on what's expected to be reported... i.e. filesystem type/configuration, system type/configuration, and drizzle configuration.

I don't think it needs to be so heavyweight as SPEC, but encouraging people to have those kinds of details on the system under test helps promote honesty.
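As a rough illustration of that kind of lightweight reporting, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `BenchmarkReport` class, its field names, and the placeholder values are hypothetical and are not part of Drizzle or of any existing benchmark tool.

```python
import json
import platform
from dataclasses import dataclass, asdict


@dataclass
class BenchmarkReport:
    """Hypothetical record pairing a metric with the system under test."""
    metric_name: str       # what was measured
    metric_value: float    # the measured number
    produced_by: str       # who ran the benchmark
    filesystem: str        # filesystem type/configuration
    system: str            # system type/configuration
    drizzle_config: dict   # server settings used for the run


def describe_system() -> str:
    """Summarize the host using only the standard library."""
    return f"{platform.system()} {platform.release()} ({platform.machine()})"


def to_json(report: BenchmarkReport) -> str:
    """Serialize the report so it can be published next to the result."""
    return json.dumps(asdict(report), indent=2, sort_keys=True)


# Placeholder values only; nothing here is a measured result.
report = BenchmarkReport(
    metric_name="queries_per_second",
    metric_value=0.0,
    produced_by="example-tester",
    filesystem="example filesystem, default mount options",
    system=describe_system(),
    drizzle_config={"example_option": "example_value"},
)

print(to_json(report))
```

Publishing a record like this alongside each number would let readers see who produced a result and on what configuration, without anything as heavy as a SPEC-style submission.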


Brian "Krow" Aker

Re: Sounds fair, however...

from: krow
date: Apr. 16th, 2009 09:20 pm (UTC)

Right now there are a couple of groups working on benchmarks who are feeding us information. At some point I will encourage one of them to publish their results... but it will be them doing it (and I am wanting them to speak to everyone when they do it).

Internally though? We are doing them for ourselves. It's helpful to compare how long a given operation actually takes against how long other databases take to do it.


Grizzly Weinstein

Today's news

from: sea_gaagii
date: Apr. 20th, 2009 02:58 pm (UTC)

I am sure you might have known it was coming. But did you ever think you would be working for them?!


Brian "Krow" Aker

Re: Today's news

from: krow
date: Apr. 20th, 2009 10:25 pm (UTC)

I thought so a year ago... just took a while :)
