Field Types, Three bears, A byte is just a byte

Aug. 1st, 2008 | 11:28 am

Ever run through an integer before?

One of the early design decisions in Drizzle was that we were going to make the choice of field types simpler.

24-bit ints? Multiple blob sizes? Display ranges on numeric types? Floats over doubles?

Why do we have types?

  • Good storage.
  • Constraints.

    We took those two principles and made a couple of decisions.

    1) Blow the constraint, we throw an error.
    2) Good storage means not only size, but that we give the right choices and no more.

    Drizzle lacks a TINYBLOB, MEDIUMBLOB, and LONGBLOB (every time I see this my mind flips to the Goldilocks story...).


    Why? A TINYBLOB buys you nothing but a little space in the "ROW" field, aka the chunk of memory we use to pass around the results of row reads. The others just save you a byte or two in ROW.

    What do we have?

    We have a BLOB.

    The difference between this and a TEXT?

    Default collation is binary, aka we sort it as a binary string. Our blob uses a 4-byte length.

    I hope you do not try an ORDER BY on the maximum size; that would be bad news (though we plan either on fixing this, or letting Moore's Law catch up).



    How about CHAR? Or NCHAR?

    NCHAR got a response of "delete it before anyone discovers it". CHAR? It has the optimization of giving you fixed-width rows, depending on the storage engine. We can solve this by just letting you set that on the table description. That way, in the future, you can have a VARCHAR that even maps into fixed row types.

    Nifty!

    How about integers?

    We sit on the fence about these. Was there anything obvious to remove?

    MEDIUMINT. Sure, it saves you space, but when was the last time you bought a 24-bit computer?

    We have left in 1, 2, 4, and 8 byte INTs. For the moment we have kept the MySQL field names.
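    To make the sizes concrete, here is a minimal sketch of the value ranges for those widths, next to the 3-byte MEDIUMINT we dropped. It assumes signed two's-complement storage, and the helper names are made up for illustration; nothing here is Drizzle code. The second helper mirrors the "blow the constraint, we throw an error" rule.

```python
def signed_range(num_bytes):
    """Return (min, max) for a signed two's-complement integer of num_bytes bytes."""
    bits = num_bytes * 8
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def check_fits(value, num_bytes):
    """Raise ValueError instead of silently truncating an out-of-range value."""
    low, high = signed_range(num_bytes)
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {num_bytes} byte(s)")
    return value


if __name__ == "__main__":
    # 3 bytes is the dropped MEDIUMINT width, shown only for comparison.
    for width in (1, 2, 3, 4, 8):
        print(width, signed_range(width))
```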

    So what did we drop?

    INT(11), aka we removed the display formatter. Drizzle is about the web, and more precisely about storage. If you want to format your numbers, please do this outside of the database. This was a carryover from Unireg, and we thought it was time to bury it.
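    Formatting outside of the database is usually a one-liner in the application. A rough sketch in Python, where `pad_number` is a hypothetical helper name and the width of 11 is only there to echo INT(11):

```python
def pad_number(value, width=11, zero_fill=False):
    """Pad an integer to a display width in application code.

    With zero_fill the sign stays in front of the zeros; otherwise the
    number is right-aligned with spaces.
    """
    if zero_fill:
        return f"{value:0{width}d}"
    return f"{value:{width}d}"


if __name__ == "__main__":
    print(repr(pad_number(42)))
    print(repr(pad_number(42, zero_fill=True)))
```

    The database stores the integer; how wide it looks on a page is the application's business.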

    Do we plan on adding new types?

    Yes!

    Expect to see an IPv4/6 type. That has long been on my list of things I have been wanting to see added.
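    Why a dedicated type? An address is really just 4 or 16 bytes with a validity rule attached, which covers both storage and constraint. The sketch below uses Python's standard `ipaddress` module to show the packed form; how Drizzle will actually store the type is not decided here, and the function names are illustrative only.

```python
import ipaddress


def pack_ip(text):
    """Validate an IPv4 or IPv6 address and return its packed bytes (4 or 16)."""
    # ip_address() raises ValueError for anything that is not a valid address.
    return ipaddress.ip_address(text).packed


def unpack_ip(raw):
    """Turn 4 or 16 packed bytes back into the canonical text form."""
    return str(ipaddress.ip_address(raw))


if __name__ == "__main__":
    for addr in ("192.0.2.1", "2001:db8::1"):
        raw = pack_ip(addr)
        print(addr, len(raw), unpack_ip(raw))
```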

    How about a UUID type, aka a 16-byte value? Yes, expect that.
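    For a sense of what that buys over stuffing the text form into a VARCHAR, here is a small sketch with Python's standard `uuid` module: the familiar hyphenated string is 36 characters, while the value itself is 16 bytes. The helper names are hypothetical and this says nothing about Drizzle's eventual on-disk layout.

```python
import uuid


def uuid_to_bytes(text):
    """Parse a textual UUID and return its 16-byte binary form."""
    return uuid.UUID(text).bytes


def bytes_to_uuid(raw):
    """Rebuild the canonical hyphenated text from 16 raw bytes."""
    return str(uuid.UUID(bytes=raw))


if __name__ == "__main__":
    text = "12345678-1234-5678-1234-567812345678"
    raw = uuid_to_bytes(text)
    print(len(text), len(raw), bytes_to_uuid(raw) == text)
```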

    Anything else?

    We have been re-factoring the field system a lot. We are getting close to being able to add them through a plugin interface. Once we do that anyone can build constrainable types.

    BTW check out Fables, it is one of the best graphic novel series I have read in a long time.



    Comments {9}

    (no subject)

    from: jamesd
    date: Aug. 1st, 2008 07:25 pm (UTC)
    I wonder if the UUID type will cripple storage engines that physically order by primary key if the UUID is used as a PK. Just make the most rapidly changing bits the most significant and such engines automatically perform horribly. Make moderately slowly changing bits most significant and you end up with both good early byte selectivity and good caching properties. Easy, but someone has to think about it.

    (no subject)

    from: jamesd
    date: Aug. 1st, 2008 07:25 pm (UTC)
    Post it to the mailing list.

    Just saving you some typing. :)

    Brian "Krow" Aker

    (no subject)

    from: krow
    date: Aug. 1st, 2008 07:54 pm (UTC)
    There was talk of this a while ago on the Drizzle mailing list. If we want to fix the index issue we can play with the bits of a UUID in order to get the MAC in the prefix.

    Anything else?

    from: xaprb
    date: Aug. 2nd, 2008 02:05 am (UTC)
    YES, I want something else: time storage with better than one-second precision :) And, a version of NOW() that has high precision too.

    That's not all. I want someone else to build this for me, too =D

    Brian "Krow" Aker

    Re: Anything else?

    from: krow
    date: Aug. 2nd, 2008 02:12 am (UTC)
    Do you need time storage, or do you need a timestamp that inserted a much higher precision value?

    Even if you had a build today, I would not suggest running it just yet. We still have several incompatible changes to the basics we are making (FRM being a big one).

    Right now is an excellent time to be on the mailing list and making suggestions :)

    Re: Anything else?

    from: xaprb
    date: Aug. 2nd, 2008 12:50 pm (UTC)
    I actually don't need time storage myself, but everyone around me does. I'm a consultant now :) What good would a high-precision timestamp do if I have to hack up a way to store it with BIGINT or FLOAT?

    FRM changes: getting rid of them entirely? It's a database, keeping meta-data in tables is the way to go IMO. But I'm speaking out of line. I've joined the mailing list and will listen for a while before I start talking again.

    Re: Anything else?

    from: xaprb
    date: Aug. 2nd, 2008 12:51 pm (UTC)
    Can't resist.... You know what, maybe now's the time to let users define their own types?

    Brian "Krow" Aker

    Re: Anything else?

    from: krow
    date: Aug. 2nd, 2008 02:33 pm (UTC)
    That is on the table. We have done work around refactoring the Field types in order to let us make this happen.

    Brian "Krow" Aker

    Re: Anything else?

    from: krow
    date: Aug. 2nd, 2008 02:33 pm (UTC)
    FRM will be replaced by asking all engines on the system if they have a table.
