Innodb Embedded Engine

May. 6th, 2009 | 11:15 am

A couple of notes:

1) I have already shared these thoughts with the Innodb team (and received some encouraging thoughts).
2) I wrote this about a week ago. Drizzle/Memcached/Gearman keep me busy, so I only got to spend a couple of weekend days looking over the Innobase Embedded Engine. I wish I had more time to go in depth.
3) Innodb has a forum for the engine here where you can get more answers: http://forums.innodb.com/list.php?8
4) You can download the technology from here: http://www.innodb.com/products/embedded-innodb/
5) I'd love to see someone take the embedded engine and port it directly to the Drizzle Engine Interface. I think that the interface they have done would make a much better starting point for integration than what we have today.

Technology

The Innodb Embedded Engine is the same technology used for the Innodb Plugin. That is awesome... think about it. You are getting a threaded storage engine for embedded use that understands schema. Schema-less certainly has its place but there are a lot of cases where schema matters. Better? We are talking about using a real OLTP engine and being able to skip all of the SQL and go straight to the interface. A number of the storage engine vendors have talked about doing this over the years, but this is the first time one of them has delivered.

All of the settings that you can change from the normal MySQL/Drizzle configuration are available. You are exposed to the direct API, so you can get at the bits if all you really need is a simple interface (joins are not supported).

Observations

How are the examples?

Without a bit of work they won't compile. There are problems in the include declarations for paths.

Simple stuff like this:
/usr/local/include/embedded_innodb-1.0/api0api.h:26:23: error: ib0config.h: No such file or directory

The paths were set up incorrectly in the header file. My suggestion to the authors would be to look at libxml2 to see how to properly set up header files.

Also? Don't include the config file. You break everyone else who is using autotools. If you see errors like this you know you will have issues:


/usr/local/include/embedded_innodb-1.0/ib0config.h:348:1: warning: "PACKAGE_NAME" redefined
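One way to avoid this kind of clash, sketched here under assumptions: the installed header exposes only prefixed macros, and the generated config file stays private to the library's own source files. The mylib names below are hypothetical and are not taken from the Embedded InnoDB headers.

```c
/* mylib/version.h: hypothetical public header, for illustration only. */
#ifndef MYLIB_VERSION_H
#define MYLIB_VERSION_H

/* Prefixed names cannot collide with the unprefixed macros (PACKAGE_NAME,
 * PACKAGE_VERSION, VERSION, ...) that an application's own generated
 * config.h defines. */
#define MYLIB_PACKAGE_NAME  "mylib"
#define MYLIB_VERSION_MAJOR 1
#define MYLIB_VERSION_MINOR 0

#endif /* MYLIB_VERSION_H */

/* Application side: its own autoconf-style definition now coexists with the
 * library header, with no "redefined" warning. */
#define PACKAGE_NAME "myapp"
```

The library's generated config.h would then be included only from its .c files and never installed.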


The naming conventions are pretty poor. For instance the include file names make no sense:
api0api.h db0err.h ib0config.h

The naming scheme is reflective of Innodb's history, but for end-developer usage they could have picked something a little bit cleaner (unless you are a Solid DB author, since they use exactly the same naming conventions).

I noticed when writing code that I must pass a table name of the form "something/something". You are required to build names like this:
snprintf(table_name, sizeof(table_name), "%s/%s", dbname, name);

From what I gather from the assert the "/" is required for the embedded plugin to know the name of a schema to create.
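A minimal sketch of wrapping that snprintf() call so truncation and empty parts are caught before the name reaches the engine. It assumes the "schema/table" form inferred above is what the library wants; build_table_name() is a hypothetical helper, not part of the Embedded InnoDB API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "dbname/name" into buf. Returns 0 on success, or -1 if either part
 * is empty, if dbname itself contains '/', or if the result does not fit. */
static int build_table_name(char *buf, size_t buflen,
                            const char *dbname, const char *name)
{
    assert(buf != NULL && dbname != NULL && name != NULL);

    if (dbname[0] == '\0' || name[0] == '\0' || strchr(dbname, '/') != NULL)
        return -1;

    /* snprintf returns the length it wanted to write, so a value of
     * buflen or more means the output was truncated. */
    int n = snprintf(buf, buflen, "%s/%s", dbname, name);
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

Calling build_table_name(table_name, sizeof(table_name), "test", "Foo") fills table_name with "test/Foo" and returns 0. A buffer that is too small yields -1 instead of a silently truncated name.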

The error system lacks some sort of error-to-string function. Having this would make debugging much simpler.

I may be wrong about this though. The examples are a bit confusing and the call ib_database_create() makes me think that there is more to explain than what the current manual has in it. The documentation is far above the quality of what is frequently found for a first release like this... kudos to Oracle for doing this. The only issue I found was a discrepancy in the ib_cursor_moveto() function and a few other random mistakes.
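For the error-to-string function mentioned above, an application could carry a small lookup of its own in the meantime. This is only a sketch: the enum and its names are hypothetical stand-ins, not the codes the library defines in db0err.h.

```c
#include <stddef.h>

/* Hypothetical error codes, for illustration only. The real library has its
 * own codes in db0err.h; these names are not taken from it. */
typedef enum {
    MY_ERR_OK = 0,
    MY_ERR_NOT_FOUND,
    MY_ERR_DUPLICATE_KEY,
    MY_ERR_OUT_OF_MEMORY
} my_err_t;

/* Map an error code to a static, human-readable string. Never returns
 * NULL, so the result can go straight into a log call. */
static const char *my_err_to_string(my_err_t err)
{
    switch (err) {
    case MY_ERR_OK:            return "success";
    case MY_ERR_NOT_FOUND:     return "record or table not found";
    case MY_ERR_DUPLICATE_KEY: return "duplicate key";
    case MY_ERR_OUT_OF_MEMORY: return "out of memory";
    }
    return "unknown error";
}
```

Returning static strings keeps the helper allocation-free and safe to call from any thread.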

Unlike SQLite, the Embedded Innodb requires multiple files to run. Depending on your use case this will be annoying. The "single file" feature of SQLite really makes it useful as a replacement for writing your own file formats.

The library makes use of its own types, like ib_bool_t, instead of just using standard C99 types. This type of programming is currently a pet peeve of mine. I am getting tired of dealing with "yet another set of types". It makes it a pain to integrate with other code and just increases complexity for the end developer for no good reason.
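To make the alternative concrete, here is a small sketch of an interface written only with standard C99 types from stdbool.h, stdint.h and stddef.h. The my_column_t struct and my_column_accepts() function are hypothetical and exist only to illustrate the point.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical column description, for illustration only. */
typedef struct {
    const char *name;     /* column name */
    uint32_t    max_len;  /* maximum value length in bytes */
    bool        nullable; /* may the column hold NULL? */
} my_column_t;

/* Returns true when the value (NULL meaning SQL-style NULL) is acceptable
 * for the column: NULL only if nullable, otherwise length within max_len. */
static bool my_column_accepts(const my_column_t *col,
                              const void *value, size_t value_len)
{
    if (value == NULL)
        return col->nullable;
    return value_len <= col->max_len;
}
```

Nothing here needs a translation layer when it meets other C99 code, which is the integration cost the custom types add.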

And writing to stdout on startup? That is a no-no. I don't want my libraries writing to stdout on me. Libraries should be quiet, not noisy.

Errors like this make the point that it is not only noisy, but that it still isn't really ready for prime time yet:

090425 14:38:31 InnoDB: Error: table test/Foo does not exist in the InnoDB internal
InnoDB: data dictionary though the client is trying to drop it.
InnoDB: Have you copied the .frm file of the table ?


or...


090425 15:36:52 InnoDB: Error: Client is freeing a thd


That remind you of anything? :)

The API is based on what has been needed so far to make Innodb work with MySQL, so the MySQL-specific errors above are not surprising. The lack of API calls to list all tables in a schema is another example of this. With MySQL you had the FRM files, so there was no need to be able to get at the list of tables Innodb owned, and no API call exists for this (which is a common need in a standalone database).

One other big item: this library is using a lot of global variables. Take a look at ib_cfg_set_int(). Notice the lack of context for setting variables. This means that the library really was not meant to be used in any sort of multi-tenancy use case. This really limits its usage pattern. A better design would be to create a "context" and pass that to the startup of the library. I've wanted Innodb to clean up its usage of global variables for years, and I had hoped that with the creation of a library this would have been solved.
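A rough sketch of what a context-based configuration interface could look like. Every name here (my_ctx_t, my_ctx_set_int() and the option names) is invented for illustration; none of it is the library's real API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-instance configuration. The fields are invented for this
 * sketch and are not the library's real option names. */
typedef struct {
    long buffer_pool_size;
    long log_file_size;
} my_ctx_t;

/* Allocate a context with zeroed settings. Returns NULL on allocation
 * failure. */
static my_ctx_t *my_ctx_create(void)
{
    return calloc(1, sizeof(my_ctx_t));
}

/* Set an integer option on one context only. Returns 0 on success, or -1
 * for a NULL argument or an unknown option name. */
static int my_ctx_set_int(my_ctx_t *ctx, const char *name, long value)
{
    if (ctx == NULL || name == NULL)
        return -1;
    if (strcmp(name, "buffer_pool_size") == 0)
        ctx->buffer_pool_size = value;
    else if (strcmp(name, "log_file_size") == 0)
        ctx->log_file_size = value;
    else
        return -1;
    return 0;
}

/* Release a context created by my_ctx_create(). NULL is accepted. */
static void my_ctx_destroy(my_ctx_t *ctx)
{
    free(ctx);
}
```

Because each setting lives in the context rather than in a global, two instances in one process can be configured differently, which is the multi-tenancy case the global-variable design rules out.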

Final Thoughts

I can see a lot of use for this library. Concurrency with SQLite is a big issue for writes, libraries like Berkeley DB lack schema, and I personally like the concept of an embedded engine knowing more about the data than the length of the byte array being stored in it.

Still? The library could use a lot of polish. It still shows the warts of being an internal project that has been pushed out into the public. I really hope that before a final release occurs the authors will clean up the interface and consider the end developer.

As far as Drizzle/MySQL goes? I can certainly see basing a future storage engine interface around the concepts found in this library. It is not perfect but it is certainly better than what we have today. It is also obvious after looking at the interface that there is more that can be done in regard to performance if the interfaces were better aligned.

There is a lot of potential in this project.


Comments {8}

Embedded

from: anonymous
date: May. 6th, 2009 08:09 pm (UTC)

Use Firebird Database. If you later change your mind and want to connect to a full server, just change the connection string.

Cheers


Brian "Krow" Aker

Re: Embedded

from: krow
date: May. 6th, 2009 08:14 pm (UTC)

This is an embedded engine with no SQL (think something more akin to BDB). There is no "connection string", since there is no SQL.


interesting indeed

from: anonymous
date: May. 7th, 2009 12:26 pm (UTC)

Now all we need is a memcache-like protocol (perhaps with protobuf or something like that for the data) and a small server around it, and we have a nice, fast, durable storage option for structured data without all the SQL overhead. Drizzle is a possibility, but I don't think in the foreseeable future.

I wonder if it also supports things like foreign key constraints...


Brian "Krow" Aker

Re: interesting indeed

from: krow
date: May. 7th, 2009 03:28 pm (UTC)

I've hacked up a Memcached 1.2.X branch to test Innodb with it. Works just fine.


Gearman...

from: jzawodn
date: May. 8th, 2009 11:16 pm (UTC)

Persistent queue? :-)


Custom types...

from: anonymous
date: May. 14th, 2009 07:01 am (UTC)

"The library makes use of its own types, like ib_bool_t, instead of just using standard C99 types. This type of programming is currently a pet peeve of mine. I am getting tired of dealing with "yet another set of types". It makes it a pain to integrate with other code and just increases complexity for the end developer for no good reason."

Hi,

Indeed, this is imperfect, but Microsoft Visual Studio does not provide the C99 types, i.e. it does not conform to C99. How do you propose to work around this?

-- Vasil Dimov


Brian "Krow" Aker

Re: Custom types...

from: krow
date: May. 14th, 2009 08:07 am (UTC)

Here is MS's response:
http://stackoverflow.com/questions/146381/visual-studio-support-for-new-c-c-standards

If they won't keep their compilers up to date then consider using the Intel compiler instead.

Looking around there are plenty of header files you can use to solve issues with the state of the Microsoft C/C++ compiler.



(no subject)

from: mrzyuenimos
date: Jun. 18th, 2009 05:32 pm (UTC)

Has the innodb engine been integrated into MySQL 5.1.7-beta libmysqld? I've been trying to use the embedded mysql with the innodb engine and it seems to be creating MYISAM tables. I have the same problem when I try the latest development snapshot (MySQL 5.1.9).
My code works fine when I compile it with the libmysqld generated by compiling MySQL-5.0.17 with embedded-server enabled.
