Three Commits, Ding, Ding, Ding
Jan. 31st, 2008 | 11:53 am
I dislike old boys' clubs. Not the part where you sit around and gab, but the part where people have to wait at the door to get in.
Something about them rubs me a bit raw.
Open Source has this problem in spades. The bar to commit to the major projects is not a problem because it is high; it is a problem because it is not written down. It is often fuzzy or, worse, personality dependent (so much for the meritocracy that is espoused).
Here is a story for you...
A few weeks ago I went to the Velocity Summit that O'Reilly held in San Francisco (not to be confused with the Velocity Conference, for which I still owe Jesse an abstract), and I talked about my current topics: MySQL, as always, and more recently libmemcached.
Somewhere during one of the sessions we got off on a tangent about distributed source control (another favorite topic of mine) and how it makes vendor versions simpler to deal with, which is another way of saying that it encourages forks (not the kind you put in your eye).
Hallelujah!
I'm all for forks. They are a sign of health. They are not really much of a motivator for old boys' clubs, though. The group I was with talked about that for a few moments. What surprised me? The number of other people in the room who were hostile to old boys' clubs. It was a common annoyance in the room.
Despite the ongoing fear of forks, open source projects remain very much stuck in the proprietary world of software when it comes to contributions. Few are good at this, and most become an old boys' club of some sort. It is too much effort for the average user to contribute a casual fix, which means that for the most part it is impossible for them to make significant changes.
The low-hanging fruit of features is left hanging until it either rots or... well, it just keeps hanging around.
When I started working on libmemcached I decided to adopt a new strategy.
In the README I put "three good patches get you commit access". The fabled "you must be at least this..." was put into words. Any three patches and it's yours.
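The appeal of a written-down rule is that anyone can check it mechanically. As a minimal sketch only (the function name, the constant, and the contributor names are hypothetical, and nothing like this ships with libmemcached), the rule could be expressed as a count over accepted patches:

```python
from collections import Counter

# Threshold taken from the README rule quoted above.
PATCHES_FOR_COMMIT_ACCESS = 3


def eligible_for_commit_access(accepted_patch_authors):
    """Return the authors who have had at least three patches accepted.

    `accepted_patch_authors` holds one entry per accepted patch,
    naming the author of that patch.
    """
    counts = Counter(accepted_patch_authors)
    return sorted(
        author for author, count in counts.items()
        if count >= PATCHES_FOR_COMMIT_ACCESS
    )


# Hypothetical contributors, for illustration only.
accepted = ["alice", "bob", "alice", "carol", "alice", "bob"]
print(eligible_for_commit_access(accepted))
```

What counts as a "good" patch is still a human call; the sketch only covers the part of the bar that can be written down.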
So today, some three months later? Five committers, and one person who is just one patch away. Most of the effort is still coming from me, but not all of it. The best bug fixes are coming in from a user (users code better than straight developers; it is the nature of need). The documentation has all been strengthened by users.
To me five people is a success. It proved the point to me that it is better to grant access than to act as a gatekeeper for others.
It is a success that has shown off a few new problems.
Release. Just because you can commit does not mean you can release. This is a problem, since it creates a bottleneck.
Regression. We need better regression testing, and by better I mean more. The current candy in my eye is the "BuildBot" project. I like the concept, but I want to see something more. I want all open source projects being filtered into a network where users can run slaves that do regression testing for them.
Coding Guidelines. I've never written down what exactly my coding style is. It is hard for others to follow what they do not know.
The last one is solvable; the other two I am still trying to think up solutions for. I've pinged several people in large organizations to plant the seed for what I want to see happen with mass regression. It will take a few more months to see if anyone bites on it.
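To make the mass regression idea a little more concrete, here is a rough sketch of how results from many user-run test machines might be folded into one status per revision. Everything in it is an assumption for illustration: the report format, the platform names, and the rule that a single failing report turns a revision red. It is not how BuildBot works.

```python
from collections import defaultdict


def summarize(reports):
    """Collapse per-machine reports into one green/red flag per revision.

    `reports` is an iterable of (revision, platform, passed) tuples.
    A revision is green only if every platform that reported on it passed.
    """
    by_revision = defaultdict(dict)
    for revision, platform, passed in reports:
        # One failing report is enough to mark that platform red.
        previous = by_revision[revision].get(platform, True)
        by_revision[revision][platform] = previous and passed
    return {
        revision: all(platforms.values())
        for revision, platforms in by_revision.items()
    }


# Hypothetical reports from volunteer machines.
reports = [
    ("c3", "linux-x86_64", True),
    ("c3", "solaris-sparc", True),
    ("d4", "linux-x86_64", True),
    ("d4", "freebsd-i386", False),
]
status = summarize(reports)
print(status)
```

The more machines that report, the more a green flag means, which is the whole argument for letting users run the tests.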
Release might be solvable alongside regression. For years I've said that internally in MySQL we needed to just generate binaries with each push. Each push is either green or not, and green ones should be acceptable for a release (and if they are not... well then there is a problem with the regression system).
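A sketch of that idea, under the assumption that every push carries a record of its test results (the `Push` class and the revision names are hypothetical): the release candidate is simply the newest push that came back green.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Push:
    revision: str      # changeset identifier (made-up values below)
    tests_passed: int
    tests_failed: int

    @property
    def is_green(self) -> bool:
        # Green means the tests actually ran and none of them failed.
        return self.tests_failed == 0 and self.tests_passed > 0


def latest_releasable(pushes: Sequence[Push]) -> Optional[Push]:
    """Return the newest green push, or None if no push is green.

    `pushes` is ordered from oldest to newest.
    """
    for push in reversed(pushes):
        if push.is_green:
            return push
    return None


history = [
    Push("a1", 120, 0),
    Push("b2", 118, 2),
    Push("c3", 121, 0),
    Push("d4", 119, 3),
]
candidate = latest_releasable(history)
print(candidate.revision if candidate else "nothing releasable")
```

With this shape, nobody has to bless a release by hand; the bottleneck moves to how much you trust the tests.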
Any open source project should be able to do this (I'm amazed at how many pulls I see coming off my Mercurial trees, where users now just grab the tip of the tree). The problem with pulling directly from Mercurial is that there is no way to know whether the build you just pulled is good or not. There are no green lights, and to me this creates a barrier. Feeding regression results back to commits would make this go away.
Another thought on this topic.
Three commits might be too many. It is easy to roll back changes in a source control system. Perhaps take a cue from other groups? The folks at Wikipedia have figured out a good formula for an online encyclopedia.
Maybe, just maybe, revision control should be open access. There is a fear of trojans, but would it be possible for an open project to monitor itself? Wikipedia does a good job of rolling back malicious changes; could a similar gatekeeping system work for source code?
How about a code reviewing captcha?
I am left with the thought that open source is still more about source, and not so much about being "open".
If Open Source really wants to move beyond the proprietary model, it needs to give some thought to how to open up the model.
when in rome()
from:
tanjent
date: Jan. 31st, 2008 08:42 pm (UTC)
Along those lines, perhaps a good guideline is just to say "Try and match my style - if you don't know how, you need to read more of my code".
-tanjent
Re: when in rome()
from:
krow
date: Jan. 31st, 2008 08:58 pm (UTC)
That... and most people want to stick to their own naming schemes and resist what others do :)
Re: when in rome()
from:
awfief
date: Feb. 6th, 2008 10:46 pm (UTC)
Make them want it
and
Make it easy
I think people already want to, although that's a problem too. Making it easier is where Brian's going with this. I think having templates is nice, guidelines are nice too.
The Wikipedia-like nature is interesting... very often I find myself wanting to change other people's code, and it may be something minor like "take out the hard coding of that number and put it in a variable" -- minor, but it may require a few hours of grepping, making sure I'm not reusing a variable, and making sure it's used in all the right places.
So I could see the value in having it be truly open.
As you say, it creates a bottleneck for review and release. Except it doesn't create the bottleneck, it *moves* the bottleneck.
You hit the nail on the head with better regression testing. I think honestly that is where the focus needs to be. Have tons of regression tests, and a patch gets committed to the release branch if the software still "works" after that commit.
commits to tests
from:
dmarti
date: Feb. 1st, 2008 12:07 am (UTC)
Re: commits to tests
from:
krow
date: Feb. 1st, 2008 12:30 am (UTC)
Why?
Make your application a part of the regression system. I've floated this several times to different audiences, but I have never had anyone bite on it.
Re: commits to tests
from:
awfief
date: Feb. 6th, 2008 10:46 pm (UTC)
Formatting
from:
acdha
date: Feb. 1st, 2008 12:49 am (UTC)
Re: Formatting
from:
krow
date: Feb. 1st, 2008 02:34 am (UTC)
Re: Formatting
from:
acdha
date: Feb. 1st, 2008 02:54 am (UTC)
About Release
from:
shadymist
date: Feb. 1st, 2008 02:54 am (UTC)
If you don't know yet if you can fully trust committed versions, but you don't want to make people wait for you to post them, can you not just allow for a secondary link marking it as "beta" with some note that it's user-updated? This would only be helpful in regards to the first stated issue, but you could possibly have the notes request volunteers to test and comment publicly so newcomers can see the results and decide if it's worth downloading the un-Brian-verified updated version?
Re: About Release
from:
awfief
date: Feb. 6th, 2008 10:49 pm (UTC)
i.e., have your system have approved people, and commits by approved people get into the base / next release, no questions asked. Then have those of us who aren't approved make an implicit fork.
So you can have "libmemcache_SHEERI_00001" or whatever. Some system to merge branches might be nice, so I can have my changes from yesterday merged with your changes from today, and do a one-command download to my system. :)