I keep toying with different ideas... rewriting it in different ways to see if I can get any performance changes in it.
Thus far... nada... no go.
Yesterday I had the idea of "bypass my write buffer, use writev() to push the data to the client".
In my design, this would save me two memory copies.
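For anyone who has not seen the call, here is a minimal sketch of the idea: hand the kernel a list of buffers instead of copying them into one staging buffer first. The helper name `send_two_parts` and the header/body split are my own illustration, not the actual server code.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send a header and a body with a single writev()
 * call instead of copying both into a write buffer first. */
ssize_t send_two_parts(int fd, const char *header, const char *body)
{
    struct iovec iov[2];

    assert(header != NULL && body != NULL);

    /* iov_base is not const-qualified, hence the casts;
     * writev() only reads from these buffers. */
    iov[0].iov_base = (void *)header;
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    /* Returns the total number of bytes written, which may be less than
     * the sum of the lengths (a short write), or -1 with errno set. */
    return writev(fd, iov, 2);
}
```

The caller still has to check the return value; a short write is possible, same as with write().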
Results?
writev() was slower.
D'oh!
What to do, what to do...
I keep a copy of the glibc source on my laptop for just such an occasion.
How is writev() implemented? In the code I read, it allocates a buffer large enough to hold all the data, copies everything into it, and then calls write().
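Roughly, that strategy looks like the sketch below. This is my own reconstruction of the allocate, copy, write approach, not the glibc source, and `copying_writev` is a made-up name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical name: emulate writev() by gathering every segment into
 * one temporary buffer and issuing a single write(). */
ssize_t copying_writev(int fd, const struct iovec *iov, int iovcnt)
{
    size_t total = 0;
    size_t off = 0;
    char *buf;
    ssize_t n;
    int saved_errno;
    int i;

    assert(iov != NULL && iovcnt > 0);

    /* First pass: how much data is there in total? */
    for (i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;

    /* Allocate at least one byte so a zero-length request still works. */
    buf = malloc(total ? total : 1);
    if (buf == NULL)
        return -1;

    /* Second pass: the extra memory copy that writev() was supposed to avoid. */
    for (i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len == 0)
            continue;
        memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
        off += iov[i].iov_len;
    }

    n = write(fd, buf, total);

    /* Preserve the errno from write() across free(). */
    saved_errno = errno;
    free(buf);
    errno = saved_errno;
    return n;
}
```

Compared with keeping your own write buffer, this adds an allocation and a free on every call, which would go some way toward explaining a slowdown.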
Well crap.
I have never used writev() before; the interface has just never been one I wanted to deal with for asynchronous connections.
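Part of what makes it awkward on a non-blocking socket is the bookkeeping after a short write: you have to work out which segments went out completely and trim the one that went out partially before you can retry. A sketch of that step, with `iov_advance` being a hypothetical helper of my own:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper: after writev() reports `written` bytes, adjust the
 * iovec array so the next call resumes where the last one stopped.
 * Returns the number of leading entries that were fully consumed; the
 * caller retries with iov + done and iovcnt - done. */
int iov_advance(struct iovec *iov, int iovcnt, size_t written)
{
    int done = 0;

    assert(iov != NULL && iovcnt >= 0);

    /* Skip every segment that was written in full. */
    while (done < iovcnt && written >= iov[done].iov_len) {
        written -= iov[done].iov_len;
        done++;
    }

    /* Trim the segment that was only partially written, if any. */
    if (done < iovcnt && written > 0) {
        iov[done].iov_base = (char *)iov[done].iov_base + written;
        iov[done].iov_len -= written;
    }

    return done;
}
```

With a single flat write buffer the same bookkeeping is just one offset, which is a big part of the appeal of the buffer approach.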
I had always assumed that it was clever. Somehow saving on system calls, yadda, yadda, yadda...
Nope, it does not.
Oh well :)
Back to the drawing board...