[wordup] Source Control Management

Mon Mar 18 18:06:56 EST 2002

So this is way off the beaten path of what I normally send out, and fairly
"geek" oriented, but it's an interesting commentary on how the tools you
use shape the community you belong to.

In some ways it's obvious, but I think it's an under-discussed
phenomenon.  There's been a lot of discussion about this in the Wiki
world as well, as people talk about how to design a Wiki to foster the
community you want.

Adam.

From: http://slashdot.org/comments.pl?sid=29536&threshold=3&commentsort=3&tid=156&mode=thread&cid=3171386

Been there, done that. (Score:4, Interesting)
by tlambert on Friday March 15, @05:26PM (#3171386)

As the author of the original 386BSD 0.1 PatchKit software, I have to
say that your "air traffic control" approach will not work.

The 386BSD 0.1 patchkit used a serialization of patch numbers, with
central assignment. The reason for this was that the patch dependency
management was done by manually applying patches posted to Usenet, and
then diffing the modified version of the code against a version with the
previous N-1 applied.

Effectively, it was a "human CVS repository" system.

It was necessary, because the latency in the Usenet system meant that
you couldn't "lock down" a file or set of files for some major change:
you had to do what you wanted to do against what you had, which was
almost never "the most current consensual version" of the code, and then
hope someone else didn't win the race to "the repository" (at the time,
terry at cs.weber.edu's incoming email, and then, later, Rod Grimes', Nate
Williams', and then Jordan Hubbard's... no one wanted it for very long).

This led to all sorts of problems; the major one was that the patch kit
format was "reverse engineered" (not hard; the patch tools, except the
creation software itself, were widely distributed), and a group started
releasing patches in the "1000+" ID range, under the incorrect
impression that the concern was over the patch namespace collision, not
topological application problems. This eventually led to a big argument,
and other people going off to play in their own sandbox.

You've probably heard of "NetBSD". A couple (not all, of course) were
motivated by community rejection of the 1000+ numbered patches, which,
while they were not colliding in serial number space, seriously blew out
topological dependency space for modified files.
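
To make the distinction concrete, here is a minimal Python sketch. It is
not the PatchKit code: the Patch class, the file names, and the integer
revision counters are all made up for illustration. It only shows that a
patch can have a perfectly unique ID and still fail to apply, because it
was diffed against a version of a file that earlier patches have since
changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    patch_id: int   # serial number (hypothetical ID namespace)
    expects: tuple  # (file, revision) pairs the patch was diffed against


def ids_are_unique(patches):
    """True if no two patches share a serial number."""
    ids = [p.patch_id for p in patches]
    return len(ids) == len(set(ids))


def apply_in_id_order(patches, tree):
    """Apply patches by ascending ID; return (rejected IDs, resulting tree)."""
    tree = dict(tree)  # work on a copy
    rejected = []
    for patch in sorted(patches, key=lambda p: p.patch_id):
        # A patch applies only if every file is at the revision it expects.
        if all(tree.get(f) == rev for f, rev in patch.expects):
            for f, _ in patch.expects:
                tree[f] += 1  # each applied patch yields a new revision
        else:
            rejected.append(patch.patch_id)
    return rejected, tree


# Hypothetical tree and patches, for illustration only.
base_tree = {"kern/init.c": 0, "sys/vm.c": 0}
central = [
    Patch(1, (("kern/init.c", 0),)),
    Patch(2, (("kern/init.c", 1), ("sys/vm.c", 0))),
]
# Made against the base tree, unaware of patches 1 and 2.
outside = [Patch(1000, (("kern/init.c", 0),))]

all_patches = central + outside
unique = ids_are_unique(all_patches)
rejected, final_tree = apply_in_id_order(all_patches, base_tree)
print(unique, rejected, final_tree)
```

In this toy model the IDs never collide, yet patch 1000 is rejected: it
expects kern/init.c at revision 0, and by the time it is applied the file
has already moved on.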

In any case, that's exactly what you are doing with your code, when you
plan on assigning patch numbers based on expectation of completion.

With the number of people you have, the comments about contested
interfaces being agreed to beforehand, and the comments about you having
no real problem here in the first place are probably accurate.

You can basically take a couple of approaches.

The first is: don't accumulate patches, just check the code in. This
resolves the problem of stale patches by not permitting them to become
stale in the first place.

The second is: "cvs tag" before any major commits, so that there is a
baseline from which to work to resolve conflicts.

Really, you should not be accumulating large patch sets, with as few
people as are involved.

If you have a huge offline latency from a developer or group of
developers (e.g. you send a CDROM to Antarctica, and two months later they
send back a CDROM with their patches on it), or if you have a huge
number of developers, you should reconsider your choice of tools.

The 386BSD patchkit serialization of patch sequence numbers through a
couple of human beings was a serious mistake. It had the emergent
property of having a tiered set of privilege. I'm convinced that this
is what resulted in the current "core team/committer/less-than-dirt"
striation in the BSD camps today.

I mention this, because CVS has a similar, though somewhat less
profound, emergent property of "The One True HEAD Branch". By its
nature, it encourages a single direction for all experimentation and all
forward looking thought, denying nourishment to any contradictory lines
of inquiry, by chopping off the roots. CVS is, in a nutshell,
anti-research. It prevents people from going off 90 degrees from where
everyone else is headed, and discovering new territory.

Perhaps you've heard of OpenBSD. It emerged because there was "One True
HEAD Branch" in NetBSD (an early adopter of CVS, in Open Source-land),
and several people felt strongly enough that the focus of the project
should be secure systems research, that the resulting code directions
were incompatible.

Tools issues are at the base of nearly any strong divide you can name in
an Open Source community.

Linux currently has this issue, where Linus is investigating the use of Larry
McVoy's BitKeeper (Larry was smart, in that early on, he recognized the
emergent properties tools choices force onto projects, and tried to
design around the problem). It turns out that a single human CVS
repository doesn't scale infinitely.

FreeBSD is in the throes of a "To use Perforce or not to use Perforce"
decision. Perforce supports separate lines of concurrent development.

It fosters, as my former boss' boss, Ray Noorda, used to say,
"coopetition": help each other make the best implementation according to
their design, and then may the best design win.

Perforce lets this happen, but it also tends to balkanize development,
if not everyone is using the tool. There are complaints in FreeBSD that
significant work is taking place in Perforce branches that aren't
visible to normal CVS users. The Perforce users complain back that there
would be no need for Perforce, if the development were permitted in
the main CVS tree -- along with the breakage that would entail. Both
arguments have merit. Right now, there is a truce... more of an
agreement to disagree, and not force the issue today, but a promise that
the battle will be fought to the death at some later date.

For your project, a tool which supports multiple concurrent "One True
HEAD Branches" seems like it fits the bill (though as I wrote that, I
still asked myself why, with so few people, it was an issue for you in
the first place).

Whether the tool you pick is Perforce, BitKeeper, or some other tool
that can support that development model is irrelevant.

What is relevant is that you understand that our tools shape the way we
think about solving problems, and if you have already arrived at an
approach that doesn't -- or *can't* -- fit into the shape dictated by
CVS, then it's probably time to look at another tool.

No matter what you do, I can guarantee you that layering another, less
adequate, tool on top of an already inadequate tool, will not fix your
problem.

I can also guarantee you that if you can't change your model to fit an
existing tool, you're going to find yourself in the source code control
tools business, instead of the business you intended to be in.

Probably, you should rethink whatever premise it is that's resulting in
large, infrequently integrated patch sets. If it's just your release
engineering department not wanting to do their work on a branch, well,
that's tough. Branch tag for releases as a matter of policy, and move
on. If on the other hand, it's something more profound, perhaps you need
to rethink your assumptions in favor of what the tools can do, vs. what
you would like them to be able to do.

Alternately, welcome to the source code control tool business.

-- Terry