Brian "Krow" Aker (krow) wrote,

AIO, Write Code, Quickly Evolve

I've had a deadline in my head for the last week or so, though the
work behind it started months ago.

I was asked a question: "Why doesn't Archive prefetch blocks of data
from disk while decompressing the current block?"

Excellent question, which demanded that I, well... fix it. So... I do
some reading.

I have some options:

1) Write my own AIO package.
2) Use the Posix AIO.
3) Support native AIO.

Writing less code is always good. Tricking out a library around every
vendor's native AIO is, well... a lot of work. Not terribly
rewarding either, since the vendors aren't very careful about
compatibility with their native solutions.

Write my own? Well... Posix! Posix has AIO, let's use that.

So I did. As previous posts pointed out, it worked great.
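
For readers who have not used it, here is a minimal sketch of what prefetching a block with Posix AIO can look like. This is not the Archive code; the helper names (prefetch_start, prefetch_wait, read_block_overlapped) are made up for illustration, and only aio_read(), aio_error(), aio_suspend() and aio_return() are real Posix calls.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: queue an asynchronous read of len bytes at offset. */
static int prefetch_start(struct aiocb *cb, int fd, void *buf,
                          size_t len, off_t offset)
{
    assert(cb != NULL && buf != NULL);
    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes = fd;
    cb->aio_buf = buf;
    cb->aio_nbytes = len;
    cb->aio_offset = offset;
    return aio_read(cb); /* 0 if the request was queued, -1 on error */
}

/* Hypothetical helper: wait for the read to finish; bytes read or -1. */
static ssize_t prefetch_wait(struct aiocb *cb)
{
    const struct aiocb *const list[1] = { cb };
    int err;
    ssize_t n;

    while ((err = aio_error(cb)) == EINPROGRESS) {
        /* Sleep until completion rather than spinning on aio_error(). */
        if (aio_suspend(list, 1, NULL) == -1 && errno != EINTR)
            return -1;
    }
    n = aio_return(cb); /* collect the final status of the request */
    if (n == -1)
        errno = err;
    return n;
}

/* Usage sketch: the read is in flight while other work happens. */
static ssize_t read_block_overlapped(const char *path, void *buf,
                                     size_t len, off_t offset)
{
    struct aiocb cb;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (prefetch_start(&cb, fd, buf, len, offset) == -1) {
        close(fd);
        return -1;
    }
    /* ... decompress the current block here ... */
    n = prefetch_wait(&cb);
    close(fd);
    return n;
}
```

On older glibc versions this needs to be linked with -lrt.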

Well sort of.

For instance, it turned out that the OS X Posix AIO didn't, well... it
doesn't really work. Including the libraries causes a compile failure.


Solaris? No performance difference, and well... lots of error
messages around "device not available".

Linux? Now for Linux, glibc does work. Got anywhere between 30% and
50% better performance.

Rock on!


Valgrind. Damn Valgrind showed that enqueue bits kept sticking around
after exit. It's a memory leak, though the blocks are merely "still
reachable", which means it is probably ok.

Except that I do not accept that as an answer. I want clean code.

So for weeks I fiddle with it, trying to come up with ways to make
sure that Valgrind always reports clean. See... my aim is just Linux
at this point.

I would be fine with just Linux.

Monday night. I want to go bowling, I want to leave the house.
Valgrind haunts me.

What do I do at 7:00? Screw posix AIO, I start writing.

First version is just me firing off a thread for each read. Way
lousy, but it works. In fact... it performs ok.
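
A thread-per-read version can be sketched roughly like this. It is an illustration under my own assumptions, not the code from Archive: the struct read_req record and the read_begin/read_end names are hypothetical, and pread() stands in for whatever read call the engine really makes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request record: one per outstanding read. */
struct read_req {
    int fd;
    void *buf;
    size_t len;
    off_t offset;
    ssize_t result; /* filled in by the thread */
    pthread_t tid;
};

static void *read_thread(void *arg)
{
    struct read_req *r = arg;

    r->result = pread(r->fd, r->buf, r->len, r->offset);
    return NULL;
}

/* Fire off a brand-new thread for this one read; NULL on failure. */
static struct read_req *read_begin(int fd, void *buf, size_t len,
                                   off_t offset)
{
    struct read_req *r = malloc(sizeof(*r));

    if (r == NULL)
        return NULL;
    r->fd = fd;
    r->buf = buf;
    r->len = len;
    r->offset = offset;
    r->result = -1;
    if (pthread_create(&r->tid, NULL, read_thread, r) != 0) {
        free(r);
        return NULL;
    }
    return r;
}

/* Join the thread, free the request, and return bytes read or -1. */
static ssize_t read_end(struct read_req *r)
{
    ssize_t n;

    assert(r != NULL);
    pthread_join(r->tid, NULL);
    n = r->result;
    free(r);
    return n;
}
```

Every read pays for a thread create and a join, which is the lousy part.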

Still lousy.

Crank away at it for another hour and... we are now firing off a
thread at startup. The thread hangs around and uses a condition
variable to be told "go fetch me some data".
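
The long-lived thread can be sketched as below. Again, this is an assumption-laden illustration rather than the Archive source: struct io_worker and the worker_* functions are hypothetical names, there is only one outstanding request at a time, and in this sketch the caller sleeps on the same condition variable while waiting for the result.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical worker state; one outstanding request at a time. */
struct io_worker {
    pthread_t tid;
    pthread_mutex_t lock;
    pthread_cond_t cond; /* broadcast on every state change */
    int has_request, done, quit;
    int fd;
    void *buf;
    size_t len;
    off_t offset;
    ssize_t result;
};

static void *worker_main(void *arg)
{
    struct io_worker *w = arg;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        /* Sleep until told "go fetch me some data" or asked to quit. */
        while (!w->has_request && !w->quit)
            pthread_cond_wait(&w->cond, &w->lock);
        if (w->quit)
            break;
        w->has_request = 0;
        pthread_mutex_unlock(&w->lock);
        /* Do the blocking read without holding the lock. */
        ssize_t n = pread(w->fd, w->buf, w->len, w->offset);
        pthread_mutex_lock(&w->lock);
        w->result = n;
        w->done = 1;
        pthread_cond_broadcast(&w->cond);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static struct io_worker *worker_start(void)
{
    struct io_worker *w = calloc(1, sizeof(*w));

    if (w == NULL)
        return NULL;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    if (pthread_create(&w->tid, NULL, worker_main, w) != 0) {
        free(w);
        return NULL;
    }
    return w;
}

/* Hand the worker a request; call worker_wait() before the next one. */
static void worker_fetch(struct io_worker *w, int fd, void *buf,
                         size_t len, off_t offset)
{
    assert(w != NULL && buf != NULL);
    pthread_mutex_lock(&w->lock);
    w->fd = fd;
    w->buf = buf;
    w->len = len;
    w->offset = offset;
    w->done = 0;
    w->has_request = 1;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Sleep until the current request has completed; bytes read or -1. */
static ssize_t worker_wait(struct io_worker *w)
{
    ssize_t n;

    pthread_mutex_lock(&w->lock);
    while (!w->done)
        pthread_cond_wait(&w->cond, &w->lock);
    n = w->result;
    pthread_mutex_unlock(&w->lock);
    return n;
}

/* Ask the thread to exit, join it, and release everything. */
static void worker_stop(struct io_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = 1;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->tid, NULL);
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->lock);
    free(w);
}
```

Joining the thread and freeing its state in worker_stop() is the sort of explicit teardown that leaves nothing allocated at exit.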

Better. At this point? It is as good as what I had with Posix AIO.

Now what to do? First I fix the hard spin on determining if the IO
thread has completed.

I put an "if" in place, and do a see-saw on the broadcast. Clever?
Maybe, maybe not, but it works. The signal will almost never happen
though, because the IO should return fast enough.


Same as before.

What to do next?

Pool the IO threads together. Sure, this is not a big deal for
Archive, but we might as well make it a bit cleaner.
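
One way to pool IO threads is a shared request queue drained by a fixed set of workers. The sketch below is my own guess at a reasonable shape, not what went into Archive: POOL_THREADS, struct io_req, struct io_pool and the pool_* functions are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define POOL_THREADS 4 /* assumed size, picked for illustration */

struct io_req {
    int fd;
    void *buf;
    size_t len;
    off_t offset;
    ssize_t result;
    int done;
    struct io_req *next;
};

struct io_pool {
    pthread_t threads[POOL_THREADS];
    pthread_mutex_t lock;
    pthread_cond_t work;     /* a request was queued, or shutdown */
    pthread_cond_t finished; /* some request completed */
    struct io_req *head, *tail;
    int quit;
};

static void *pool_main(void *arg)
{
    struct io_pool *p = arg;

    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (p->head == NULL && !p->quit)
            pthread_cond_wait(&p->work, &p->lock);
        if (p->head == NULL)
            break; /* shutting down and the queue is drained */
        struct io_req *r = p->head;
        p->head = r->next;
        if (p->head == NULL)
            p->tail = NULL;
        pthread_mutex_unlock(&p->lock);
        ssize_t n = pread(r->fd, r->buf, r->len, r->offset);
        pthread_mutex_lock(&p->lock);
        r->result = n;
        r->done = 1;
        pthread_cond_broadcast(&p->finished);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

static struct io_pool *pool_create(void)
{
    struct io_pool *p = calloc(1, sizeof(*p));
    int i;

    if (p == NULL)
        return NULL;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->work, NULL);
    pthread_cond_init(&p->finished, NULL);
    for (i = 0; i < POOL_THREADS; i++)
        if (pthread_create(&p->threads[i], NULL, pool_main, p) != 0)
            abort(); /* sketch only: no partial cleanup */
    return p;
}

/* Append a caller-owned request to the queue and wake one worker. */
static void pool_submit(struct io_pool *p, struct io_req *r)
{
    assert(p != NULL && r != NULL && r->buf != NULL);
    r->done = 0;
    r->next = NULL;
    pthread_mutex_lock(&p->lock);
    if (p->tail != NULL)
        p->tail->next = r;
    else
        p->head = r;
    p->tail = r;
    pthread_cond_signal(&p->work);
    pthread_mutex_unlock(&p->lock);
}

/* Sleep until this particular request is done; bytes read or -1. */
static ssize_t pool_wait(struct io_pool *p, struct io_req *r)
{
    ssize_t n;

    pthread_mutex_lock(&p->lock);
    while (!r->done)
        pthread_cond_wait(&p->finished, &p->lock);
    n = r->result;
    pthread_mutex_unlock(&p->lock);
    return n;
}

/* Let workers drain the queue, join them all, then free the pool. */
static void pool_destroy(struct io_pool *p)
{
    int i;

    pthread_mutex_lock(&p->lock);
    p->quit = 1;
    pthread_cond_broadcast(&p->work);
    pthread_mutex_unlock(&p->lock);
    for (i = 0; i < POOL_THREADS; i++)
        pthread_join(p->threads[i], NULL);
    pthread_cond_destroy(&p->finished);
    pthread_cond_destroy(&p->work);
    pthread_mutex_destroy(&p->lock);
    free(p);
}
```

The caller owns each struct io_req and its buffer, so the pool itself only allocates once, at startup.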

And in the end? Pushbuild, MySQL's internal build system, shows it
working on all platforms.

No complaints from Valgrind.

More portable, less overhead, and no Valgrind warnings.


Lesson learned? Keep evolving. I knew from the beginning that the
Posix code was a short-term solution which was quite dead long term.
I should have evolved the code more quickly. Should have stayed
active and not tried to work around issues, aka the Valgrind warnings.

Press ahead, learn from what you do, and just keep evolving.