Where does the InnoDB technology come from today?
Oct. 11th, 2010 | 10:51 am
Every so often I get a question on "where does InnoDB technology come from right now?"
I was thinking about it this morning and decided to break it out via a chart. These are rough numbers, and in them I give Percona points for XtraBackup.
Most of the work that I see, which is not niche work, is done by Oracle, hands down, no question. I don't see any sign of Heikki being very involved anymore, but his legacy seems to be alive (or at least its embers seem not to have gone out). Percona provides a lot of niche changes to InnoDB, plus the Percona XtraBackup tool (which, if you don't know about it, you should).
I, and others, look at changes that occur to InnoDB for Drizzle. We adopt the ones we are comfortable with (keep in mind that I personally look at MySQL, PostgreSQL, and a couple of other open source databases as well). I am well aware of InnoDB's shortcomings, and even more so of its connection to MySQL's monolithic kernel: I noticed that in 5.5 they changed the default engine to InnoDB, but they didn't change the testing framework to use it for all tests.
HailDB is just offering testing right now for the embedded technology. This gives projects that used the embedded version of InnoDB a home, since Oracle seems to have shut down development on it (i.e. no new releases). HailDB is focused on getting the code in shape for open source development (and a better testing framework is being added to keep regressions from popping up).
MariaDB, i.e. Monty Program, is a distributor of InnoDB technology that it obtains from Percona.
I saw the note this morning about SkySQL opening its doors for business. They state "Through our relationships with strategic partners such as Monty Program AB", which would give them access to the old optimizer team from MySQL, but that is about it. Monty Program derives its knowledge of InnoDB via Percona (and relies on their backup technology).
So in the end? At this point it appears that everyone is just a value-added reseller of Oracle's and Percona's work (though Percona is obviously a distributor of Oracle technology as well).

If you are interested in knowing about, and participating in, the ecosystem around MySQL, you should either be attending or giving a talk at the O'Reilly MySQL Conference and Expo. This is the conference of the year for MySQL the technology, and I expect this year's conference will be the best place to learn about what is happening next.
Update: I was asked about bugs related to InnoDB as the default engine. To get a good feel for it, force the tests in mysql-test to run as InnoDB. We found very few bugs in InnoDB itself (InnoDB is relatively free of bugs), but the kernel was just not designed for it, and it fits in more as an afterthought than as a design decision.
If you are up for another challenge, switch Heap to use only its range index type and not its hash index type.
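For anyone who wants to try the mysql-test run mentioned above, here is a minimal sketch in Python that only assembles the command line. It assumes you are in the mysql-test directory of a MySQL source tree and that `mysql-test-run.pl` passes server options through with `--mysqld=`; the helper name `build_mtr_command` is made up for illustration and is not part of any MySQL tooling.

```python
import shlex


def build_mtr_command(engine="InnoDB", mtr_script="./mysql-test-run.pl", extra_args=()):
    """Build an argv list that asks the test runner to start mysqld
    with the given default storage engine.

    Assumption: the runner forwards --mysqld=<option> to the server,
    and the server understands --default-storage-engine.
    """
    return [
        "perl",
        mtr_script,
        f"--mysqld=--default-storage-engine={engine}",
        *extra_args,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so this works without a source tree.
    print(shlex.join(build_mtr_command()))
```

To actually run the suite you would hand the list to `subprocess.run(...)` from inside the mysql-test directory. Expect failures that come from tests written with another engine in mind rather than from InnoDB itself.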
Update2: I pulled a comment about replication that I had in the original post. Replication deserves more than a midstream one-liner.
(no subject)
from: dormando
date: Oct. 12th, 2010 02:19 am (UTC)
(no subject)
from: krow
date: Oct. 12th, 2010 03:16 am (UTC)
Now the ones created by CREATE TEMPORARY? I don't know how much that matters.
(no subject)
from: dormando
date: Oct. 12th, 2010 03:54 am (UTC)
Replication, SkySQL
from: Henrik Ingo
date: Oct. 12th, 2010 07:33 am (UTC)
I don't know how replication is an InnoDB technology, but in any case, Monty Program has worked on improving MySQL replication for customers. See my MySQLconf presentation from 2010: http://en.oreilly.com/mysql2010/public/schedule/detail/14656
Kristian and Sergei are currently working on a completely new replication API to allow plugging in new replication protocols. Kristian's work is supposed to be much more abstracted than what was done in MySQL 5.5 to enable the semi-synch protocol, so as to allow a wider range of strategies for replication.
Related to SkySQL and InnoDB, I'd say it is premature to make any assumptions based on the information publicly available to you - or me. Naturally SkySQL, having been launched yesterday, has not yet committed any InnoDB code to any of the MySQL variants.
Re: Replication, SkySQL
from: Henrik Ingo
date: Oct. 12th, 2010 07:39 am (UTC)
Re: Replication, SkySQL
from: krow
date: Oct. 12th, 2010 05:25 pm (UTC)
How about a litmus test?
1) Does everyone who provides code still have to sign over (in all or some part) code to Monty Program?
2) Do you have a single push to trunk by anyone who doesn't work for Monty Program?
3) Do Monty Program people take patches for storage engines or do people actively push to your main tree? (Galbraith might be the closest person who does something like this, but I know most of his code was rewritten).
4) How many people paid their own way to your all's meeting?
5) How many core developers do you have that don't work for Monty Program?
I can add some more if you would like.
It's good advertising for you all to say you are a community edition, but in truth that cup holds zero water. According to Launchpad you are averaging at most eight people even pushing code in a given month.
You guys should own up to the fact that you really are Monty Program. MySQL was MySQL, MariaDB is Monty Program.
Re: Replication, SkySQL
from: Henrik Ingo
date: Oct. 13th, 2010 11:41 am (UTC)
In any case, I don't fully agree with the question either. FSF uses copyright assignment and from there it doesn't follow that GNU projects aren't community projects.
2) Code is only pushed to trunk by the launchpad maria-captains group. It is always developed elsewhere, by all kinds of people. (Also, see other questions.)
3) Patrick Galbraith, Paul McCullagh, Arjen Lentz (and at least 2 others) are captains and *could* push to trunk. In practice *I think* (remember that I have not been at work for months now) most pushes are done by the person who does the integration cycle or QA/review, which tends to be an MP person (Kristian, Sergei, Monty, Philip). This is a distributed VCS; the people who write code against trunk are a bigger group than those who push.
4) Several Oracle employees paid for their own trip. However, what you really insinuate is that MP paid for guests to come to our meeting. Everyone (except the Oracle people) was paid for by their own employer.
...I have to say, that was yet another weird question.
5) I don't know. Several of maria captains are imho not core developers, so I don't know how to define "core". For instance we have 5 storage engines whose main developers (more than 5) don't work for MP. At least for a few of those it is fair to say their main development focus is MariaDB. So if I had to draw a line, I'd say they are core developers.
I can add some more if you would like.
Perhaps better not, it was already difficult to understand the point of the previous ones.
That being said, there are still a few steps that need to be taken before I can say that MariaDB is truly a community project where perhaps MP is only special due to technical merit. AFAIK those same legal issues also are still unhandled for Drizzle (at least they were the last time we met). They are boring non-technical stuff that nobody cares enough to prioritize, except me. At some point (soon) they need to be taken care of.
To illustrate why saying "Monty Program (MariaDB)" doesn't make sense, try to make up (yes, that's what you did) similar numbers for "who develops FederatedX/OQGraph/etc engine?". Monty Program contributed zero. So in your opinion MariaDB contributes zero. So the engine appeared in MariaDB all by itself?
Re: Replication, SkySQL
from: Henrik Ingo
date: Nov. 25th, 2010 08:42 pm (UTC)
http://openlife.cc/blogs/2010/november/leaving-monty-program-and-mariadb for more info
Re: Replication, SkySQL
from: krow
date: Oct. 12th, 2010 07:50 am (UTC)
The numbers are entirely subjective (with a bit of wc -l done to see relative size).
The point I was trying to make, and I believe made, was that despite a lot of hand waving right now, everyone is living off whatever Oracle is doing.
As far as replication goes? It is basically what Sasha did nearly a decade ago (with a lot of bug fixes). I was off on a bit of a tangent on the replication bit. It's my blog, I get to ramble a bit when I feel like it :)
Re: Replication, SkySQL
from: ext_302497
date: Oct. 29th, 2010 11:43 am (UTC)
Re: Replication, SkySQL
from: krow
date: Oct. 29th, 2010 03:39 pm (UTC)
Compare and contrast that to Oracle or Percona? It is not the same at all.
I would think that Percona would be your significant partner, since that is where your InnoDB technology comes from today.
Re: Replication, SkySQL
from: ext_302497
date: Nov. 1st, 2010 09:25 am (UTC)
And yes, naturally Percona is also one of our partners.
Google/Facebook
from: harrison_fisk
date: Oct. 12th, 2010 07:02 pm (UTC)
Re: Google/Facebook
from: krow
date: Oct. 12th, 2010 07:48 pm (UTC)
His work is his, and by no means do I discount it, but it is not a distribution, fork, etc... It is just Mark's work.
I don't see either Google or Facebook getting into the business of releasing a version of MySQL, and Mark has never shown any interest in doing it either.
The work makes for good blogs, but it's not something that you can just use.
Re: Google/Facebook
from: harrison_fisk
date: Oct. 12th, 2010 08:31 pm (UTC)
Both XtraDB and official InnoDB have integrated work done by Mark's team. The fact he doesn't do an official release doesn't impact the fact that they are the source of the technology. Your question was "where does Innodb technology come from right now?", not "who officially supports InnoDB changes they are doing?".
Re: Google/Facebook
from: ext_282843
date: Oct. 12th, 2010 08:34 pm (UTC)
Re: Google/Facebook
from: krow
date: Oct. 12th, 2010 08:46 pm (UTC)
What I haven't seen is any of those groups ever push their code to someone else's tree. It is tossed-over-the-wall work, and that is simply not the same in my book as what Percona or Oracle do.
Is it valuable though? Yes, but it simply isn't the same.
Re: Google/Facebook
from: ext_282843
date: Oct. 12th, 2010 08:54 pm (UTC)
I don't see Percona pushing stuff to Oracle's tree.
It doesn't mean technology/ideas/patches/changes/whateve
And there're other users of Facebook tree too (and they get supported! ;-)
Re: Google/Facebook
from: krow
date: Oct. 12th, 2010 10:12 pm (UTC)
Re: Google/Facebook
from: krow
date: Oct. 12th, 2010 10:16 pm (UTC)
Do you do releases? Do you have a bug tracker? What is your testing? Do you have a roadmap?
Is Facebook going to stand behind what is being published?
I don't believe that just publishing code up to a website means you are providing support. Does that mean it is of no value? I believe that there is value in sharing ideas/code/etc.
Re: Google/Facebook
from: ext_282843
date: Oct. 13th, 2010 08:07 am (UTC)
MPAB declares they will support the community _and_ their customers (direct and indirect). Whether they (and their partnerships) are efficient or not will be seen in their track record and is not a basis for speculation (I usually don't like speculation about Oracle's future actions either).
We try to keep the tree always in good shape/quality (continuous integration, bla bla) to run in production, which eliminates the need for releases (public tree pushes are done in largish chunks of internal changes).
Our testing goes way beyond standard MySQL testing practices (don't forget, we run this thing live eventually), and besides MTR/sysbench/etc. we do shadow testing with real (and inflated) workloads.
Our bugs are filed at bugs.mysql.com (though of course "tracking" is a bit limited there :-) There's also the LP bug tracker :)
Oh, and of course we do have a roadmap: having the best online database platform. Do we have a public roadmap? We discuss certain aspects of it in our blogs, if that suits you :)
I don't say that we provide support for everyone, everywhere, at any time. OTOH, high quality feedback (or contributions) deserves high quality responses :-) The FB tree is good for those whose problem domain overlaps with FB's, so they get a good quality tree targeting their problems. I'm really satisfied with production use of it at Wikipedia :-)
Different entities see their need for support in entirely different light. :)
The question about "is going to stand" should probably be asked in the form of "is standing". I sure cannot provide any organization policy insights, sorry ;-)
Oh, well, you derailed the topic yourself. It wasn't supposed to be "Where should you get your InnoDB support from", it was "Where does technology come from", as Harrison pointed out already.
You pointed out HailDB though: do they have a 24/7 phone number for their users to call? You know, in some cases that is an essential part of support offerings.
Re: Google/Facebook
from: krow
date: Oct. 29th, 2010 03:35 pm (UTC)
So you are saying you have no roadmap that is published, or release schedule, and you don't have any sort of central bug tracking?
I am sure that someone will go to the trouble of using the source code you provide, but you aren't providing a stable platform that any company can really rely on.
HailDB is providing, and you can ask Stewart, a release schedule, bug tracking, and a roadmap. 24/7 support? Does Apache? Your 24/7 comment is the same thing that closed source vendors have said about open source in general. That has been proven wrong, and it was proven wrong by open source projects by providing releases/roadmaps/etc.
I'm not knocking your work, but I will point out that what you are doing is throwing code over the wall. It's great, but it isn't a solution that is going to get anyone very far.
Cheers,
-Brian
Re: Google/Facebook
from: ext_284387
date: Oct. 13th, 2010 09:14 pm (UTC)
We, the InnoDB team, have our own policies and procedures for integrating third-party contributions into the InnoDB code base. We have to conduct extensive review and testing to make sure the quality is sound. Very often, the contributed code is changed or enhanced. That's why you do not see commits directly from outside the team.
Thanks,
Calvin
Re: Google/Facebook
from: krow
date: Oct. 29th, 2010 03:42 pm (UTC)
I wasn't taking issue with their contributions, but with the idea that there is a "Facebook MySQL" that you could download, get fixes from, find a roadmap for, etc.
Watching the MySQL world is like watching the early days of the Linux kernel, where anyone who could recompile it and flip a few bits would claim that they had a product/project.