?

Log in

No account? Create an account

Queue Engine, and why this won' likely happen...

« previous entry | next entry »
Jun. 28th, 2007 | 11:40 am

After mentioning Blaine Cook's use of MySQL as a queue engine for Twitter I've been pinged about why couldn't one be written for MySQL.

Useful? Hell yes.

Possible? Not Really.

Lets look at the simple case.

CREATE TABLE queue (id serial, message text);

INSERT INTO queue VALUES (message) "This is the first message";
INSERT INTO queue VALUES (message) "This is the second message";
INSERT INTO queue VALUES (message) "This is the third message";

In an application that would use this, the sort of query you would see would be:

SELECT message FROM queue LIMIT 1;

For a queue engine to work, it would always need to respond with the first record in, being the first record out. So you would get:

"This is the first message"

as a response.

For a second call to the select you would get:

"This is the second message"

At no point would the row appear twice. Sounds great? Here is the problem where SQL would come in. Say you had never done any select so all three rows were intact. So you then do a:

SELECT message FROM queue WHERE message LIKE "%third%";

This would cause MySQL to scan the rows. The engine would have no idea what rows were returned and would empty its queue. Even push down conditions could not fully resolve for the engine what rows were actually sent.

Ick!

That part of the behavior really bothers me, and is why I don't think a queue engine really makes sense (though I believe I could be really wrong about this). The behavior of the engine would be radically different from any engine we have seen so far.

I and Monty have talked about adding syntax for years that would allow for a DELETE FROM queue WHERE <> RETURN RESULT (or something similar) that would allow this behavior, but having an engine specialize in it would be a bit different. Most engines would have a problem optimizing for the sort fragmentizing affect this would have.

It might be possible to add a callback from SELECT that would tell the storage engine that the row had been selected for use... but I worry about calling that for each row (though this is a matter adding just one additional call in the handler API).

So what do you think?

Link | Leave a comment |

Comments {4}

Mark Atwood

(no subject)

from: fallenpegasus
date: Jun. 28th, 2007 06:58 pm (UTC)
Link

My answer for troublesome queries is "So, don't do that then", similarly to my answer to troublesome and expensive queries in the S3 engine.

Separate SELECT from DELETE, and then implement

SELECT id,message FROM queue LIMIT 1;

and implement

DELETE FROM queue WHERE id="foo";

and then just tell the user that doing any other sort of SELECT is a Bad Idea.

Reply | Thread