Master.info, Re-factoring Code

Oct. 7th, 2008 | 09:55 pm

I was just able to do this for the first time:

[root@piggy var]# /tmp/drizzle/drizzled/serialize/master_list_reader ./master.info
HOSTNAME piggy.tangent.org
USERNAME
PASSWORD
PORT 4427
CONNECT RETRY 60
LOG NAME
LOG POSITION 16777216

So what is the big deal?

In the reworking of replication I ran across the bird's nest of code that exists for the master.info file. Pretty much all of the code dates back to around 2000 (despite the recent work in row-based replication, most of the replication code has been the same for the last eight years).

One of the things no one has ever tackled was getting rid of the master.info file. Even at this point in Drizzle we still have it, though we have moved it to being a protocol buffer file and now can list multiple masters in the file (so yes, that means we will most likely have multi-master support sometime shortly).
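To make the "multiple masters in one file" idea concrete, here is a minimal sketch in Python. It is not the Drizzle protocol buffer schema, which is not shown in this post; the `MasterRecord` class and its field names are assumptions inferred from the labels that master_list_reader prints above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MasterRecord:
    # Hypothetical fields, guessed from the labels in the reader output.
    hostname: str = ""
    username: str = ""
    password: str = ""
    port: int = 0
    connect_retry: int = 0
    log_name: str = ""
    log_position: int = 0


def format_master(m: MasterRecord) -> str:
    """Render one master in the same LABEL value layout as the reader output."""
    rows = [
        ("HOSTNAME", m.hostname),
        ("USERNAME", m.username),
        ("PASSWORD", m.password),
        ("PORT", m.port),
        ("CONNECT RETRY", m.connect_retry),
        ("LOG NAME", m.log_name),
        ("LOG POSITION", m.log_position),
    ]
    # Empty values leave just the label, as in the output shown above.
    return "\n".join(f"{label} {value}".rstrip() for label, value in rows)


def format_master_list(masters: List[MasterRecord]) -> str:
    """A file that holds a list of records can describe any number of masters."""
    return "\n\n".join(format_master(m) for m in masters)


if __name__ == "__main__":
    first = MasterRecord(hostname="piggy.tangent.org", port=4427,
                         connect_retry=60, log_position=16777216)
    print(format_master_list([first]))
```

The point of the sketch is only that once the file holds a repeated record rather than a fixed set of lines, a second master is just one more entry in the list.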

In the next few months this file will go away and we will move the data into a table, but for now it was worth spending the two days to clean up the code in order to get rid of the "master database server has died and forgotten to write out the correct data" problems that have existed for so long. I suspect we will evolve the interface at least two more times before we are done. Sure, I could shoot for the "end design" and try to be done with it, but I have found that attacking our problems in bite-size chunks tends to buy us more distance than just tearing it out and hoping to find the one "right way" for the solution.
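As a rough illustration of why a table helps with the "died and forgot to write out the correct data" problem, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for the server. The table names, column names, and the `apply_event` helper are all hypothetical and say nothing about how Drizzle will actually do it. The idea is just that the replication position and the applied change commit in one transaction, so they cannot drift apart.

```python
import sqlite3


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per master, plus a table of applied changes.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS master_info ("
        "hostname TEXT PRIMARY KEY, "
        "log_name TEXT NOT NULL, "
        "log_position INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_rows ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def apply_event(conn, hostname, log_name, new_position, payload, fail=False):
    """Apply one replicated change and advance the position atomically."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO applied_rows (payload) VALUES (?)", (payload,))
        if fail:
            # Simulates the server dying between the two writes.
            raise RuntimeError("simulated crash before commit")
        conn.execute(
            "INSERT OR REPLACE INTO master_info "
            "(hostname, log_name, log_position) VALUES (?, ?, ?)",
            (hostname, log_name, new_position),
        )


def current_position(conn, hostname):
    """Return (log_name, log_position) for a master, or None if unknown."""
    return conn.execute(
        "SELECT log_name, log_position FROM master_info WHERE hostname = ?",
        (hostname,),
    ).fetchone()
```

With a separate file, a crash between applying the change and rewriting the file leaves the two out of step. Here a failure in the middle rolls back both writes, so the recorded position always matches what was actually applied.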

I am starting to wonder what the rule of thumb should be for refactoring. The code base we are working from last evolved over a decade ago. This was when Unireg became MySQL. I am starting to think we should be spending anywhere between 70 and 80 percent of our time going forward on just refactoring work. This does not leave a lot of room for features, but I believe that features are a lot less important than people make them out to be (and in our case we are just working on the micro-kernel, so others can continue to innovate on the edge).

The older the code base gets, the more important it becomes to do this sort of work, though I am sure someone a decade from now is going to find themselves just as annoyed as I am most days :)


Comments {7}

Dreamer of the Day

(no subject)

from: iamo
date: Oct. 8th, 2008 06:41 am (UTC)

I wish I had time to get involved with this somehow :/. Refactoring MySQL sounds like a lot of fun.


matt

(no subject)

from: sent2matt
date: Oct. 8th, 2008 07:11 am (UTC)

> so yes, that means we will most likely
> have multi-master support sometime shortly

When do you expect this to be available as a patch or something similar
(I am not asking about the MySQL distribution)?

Do you think one should wait for this, or in the meantime
try to use MySQL Proxy to do multi-master replication?


Brian "Krow" Aker

(no subject)

from: krow
date: Oct. 8th, 2008 07:42 am (UTC)

We are still not in a state where I want anyone using us (not that this stops people). I think we are still a few months off from being there. I've got a list I publish to our mailing list every so often which goes over the details of "this is where we are at".


Arjen Lentz

yes lots of refactoring is good, but...

from: arjen_lentz
date: Oct. 8th, 2008 10:01 am (UTC)

...there has to be a trade-off; otherwise it'll be "forever in progress" but never usable in the real world. And the latter is really really important. Pretty code that doesn't get used is probably buggy, plus for lack of use people would not be interested in innovating on the edge. After all, innovation predominantly happens because of real world need.

Old code does turn sour at some point, so I think the ratio of refactoring to new work has to take into account how many people refactor how much, so that the amount of old code (say N lines more than Y years old) does not grow.
And if that measurement is tracked over, say, 1, 2, 3, 5, and 10 years, it's possible to plan ahead for where the refactoring focus will need to be in the future. When code goes sour, lots of things stall. And the older the code, the bigger (generally) the amount of work and annoyance involved in getting it back in shape.
We don't want to get to the point where no one dares to ever touch a certain piece of code, just because it's too ugly, daunting, or even incomprehensible. I think some bits of MySQL may actually already be at that stage, and I hope that they will be tackled sooner rather than later. But there are some really good people now involved with the server code who were not hacking on it before, and that's excellent.


Brian "Krow" Aker

Re: yes lots of refactoring is good, but...

from: krow
date: Oct. 8th, 2008 04:39 pm (UTC)

Re-factoring is a "forever" project. You have to work on it constantly over time. "Finishing" is all about a train with a schedule. The train pulls out of the station on a clock and what is finished goes in, and what is not finished waits till the next train.


Arjen Lentz

Re: yes lots of refactoring is good, but...

from: arjen_lentz
date: Oct. 8th, 2008 11:47 pm (UTC)

Yep!
Anyway, I think the balance can be struck, measured, and somewhat planned so that it a) doesn't take over completely, and b) doesn't fall too far behind (making sour code) either.


awfief

(no subject)

from: awfief
date: Oct. 8th, 2008 11:11 am (UTC)

Great work! One of the things that's really important from the cloud perspective (and the DBA perspective in general) is that there isn't a lot of stuff that *needs* to deal with OS-level things like files. Having replication information stored in a table makes a lot of sense; it also means that when you have a consistent backup of the database you immediately know where replication left off.

I think if you refactor the code, it will be much easier for others to write the features they want/need. So while the Drizzle core team may not be writing them, they'll be written -- the first plugins are already being written, outside the paid Drizzle core team, inside the Drizzle community.
