O'Grady's Fear of Forking, Let a thousand flowers bloom

Nov. 16th, 2010 | 11:18 am

The article "Fear of Forking" quoted my observations from a yearly call that the folks at O'Reilly hold with many of the authors of different open source projects.

"On a related note there was a recent phone call that O’Reilly put together with a number of open source leads. It was amazing to hear how many folks on the call were terrified of how Github has lowered the bar for forking. Their fear being a loss of patches. It was crazy to listen to."

Since I made that comment, I have made one new observation. GitHub has begun to feel like the Sourceforge of the distributed revision control world. It feels like it is littered with half-started, never completed, or just never merged trees. If you can easily take changes from the main tree, the incentive to have your tree merged back into the canonical tree is low.

I feel like you can look at it in one of two ways.

If you count up all of the hours and energy going into abandoned trees, then you begin to worry about "all of that wasted work". It takes a lot of effort to keep projects going, and if all new energy is focused in this direction I don't know that we can keep a sustainable amount of focus to produce the sort of software that we do today. While consulting this last year I've run into a number of shops where a developer has made changes to an open source project and placed these into production without any vetting (and in most of these cases they had a GitHub/Launchpad/etc. sort of tree, or they pulled from some random person's tree). They didn't use a released piece of software, and often the code they used was just thrown over the wall by some devs in some other company. It is the "we hired a smart guy who tinkered with our Debian distribution/kernel" problem all over again.

The other way to look at it is that GitHub/Launchpad are today's Burgess Shale. We are in the equivalent of a Cambrian explosion, and the diversification we are seeing is similar to what we saw when Sourceforge first launched. If this is the case then we will see some stabilization in the next few years. In the database world, we are certainly in the middle of one of these periods.

If I put all of this into perspective and apply it to the MySQL ecosystem, I fully believe that the forking we saw was enabled by the move to bzr/Launchpad. Without that move it would have been a lot harder for most of the forks and distributions to make that shift (and I believe it has also slowed down the evolution of most, since almost all of the forks/distributions are heavily tied to the upstream changes that Oracle makes). Beyond Drizzle, none of the other forks have any significant contributions, and they are all stuck waiting for Oracle to fix bugs for them and/or hoping that the changes they make don't conflict with what Oracle is doing.

In a related ecosystem, I am eager to see what happens in the Postgres world now that they have moved to Git. As I have mentioned before, with Drizzle we could have started with Postgres as the foundation, and I believe there would have been a lot of benefit in doing so. I will be curious to see whether anyone explores what they could do with the code by taking a radical departure from the current architecture. I am still happy with our choice of MySQL, but I believe there is an opportunity to do something pretty incredible with that codebase as well.

In the cloud world we have OpenStack, Canonical, and Eucalyptus all circling around the same problem, with a history of shared tools and code. It is going to be interesting to see what happens there as well.
