If you have a given number of processors, let's say 32, you may want to reserve a set number across a spectrum of performance. For instance, take a fuzzy classification of queries as Fast, Slow, or Unknown.
You could then map half your processors to Fast queries, and split your Slow and Unknown queries across the remaining processors. That way you always know you have enough CPU to handle Fast queries (or you could hint that queries matching a certain pattern should always be treated as fast).
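As a rough sketch of that partitioning, here is what the split might look like for a 32-way box. The function name and the exact split for the Slow/Unknown remainder are my own invention, just illustrating the idea:

```python
def partition_cpus(total_cpus: int) -> dict:
    """Hypothetical split: reserve half the CPUs for Fast queries,
    then divide the remainder between Slow and Unknown queries."""
    fast = total_cpus // 2
    remainder = total_cpus - fast
    slow = remainder // 2
    unknown = remainder - slow
    return {"fast": fast, "slow": slow, "unknown": unknown}

# partition_cpus(32) -> {'fast': 16, 'slow': 8, 'unknown': 8}
```

The exact ratios would be tunable; the point is simply that Fast queries get a guaranteed reservation rather than competing with everything else.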
With MySQL you never have exact costs before the query is optimized, because the optimizer recomputes the execution plan for each query. This means no stale plan data is used for execution, but it also means the cost is recalculated every single time.
If you could tie the query execution cost calculation to a specific prepared statement, you could keep a running average of cost (possibly even the deviation, which would be nice for knowing which queries are well behaved).
I am sure MySQL Proxy could do this by tracking response times, which is great for scale-out, but with faster machines there is no reason not to figure out the scale-up solution as well.
With the libevent work we now have pools of threads, so the above work is conceivable.
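Tying the pieces together, the dispatch side could be as simple as one pool per query class, sized by the reservation scheme above. This is a toy sketch, not the server's libevent implementation; the pool sizes and cost thresholds are invented for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pools for a 32-way machine: half the workers reserved
# for Fast queries, the rest split between Slow and Unknown.
pools = {
    "fast": ThreadPoolExecutor(max_workers=16),
    "slow": ThreadPoolExecutor(max_workers=8),
    "unknown": ThreadPoolExecutor(max_workers=8),
}

def classify(estimated_cost):
    """Fuzzy classification by estimated cost (threshold invented)."""
    if estimated_cost is None:
        return "unknown"
    return "fast" if estimated_cost < 100 else "slow"

def submit_query(run_fn, estimated_cost=None):
    """Route a query's work to the pool for its class."""
    return pools[classify(estimated_cost)].submit(run_fn)
```

Because Fast queries have their own dedicated workers, a burst of Slow queries can saturate its pool without starving the fast path.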
On a slightly different note, I was asked about storage engines doing this. The answer is that yes, a storage engine could do some of this as well, but thread pooling works better in the layer above the engines, so you will get more bang for the buck by working on the upper end of the server. I've run into this problem several times with the Archive engine: at this point I can push data faster than the server can really process it (i.e., the server is the bottleneck). The upper end of the server will take more TLC to really make use of multiple CPUs. Libevent was a good start, but once we hit 32+ way machines we are going to need to work on our strategies.
32+ way machines are going to make everyone rethink their strategies :)