Central Repositories in Mercurial

This entry is part 3 of 3 in the series Bite-Sized Mercurial.


As a distributed version control system, Mercurial allows us to push and pull changes between any two repositories in an ad hoc manner. In reality, most projects can benefit from a central server (even if it’s just a repository on the lead developer’s machine) — somewhere accessible to keep an authoritative copy of the project.

In this article, I’ll look at one way to work with a remote central repository in order to share code with other developers.

Last time, Joe managed to create and merge branches for both his cutting-edge development and the bug fixes he made to the released software. But what about his colleague, Dan? He’s going to need a copy of Joe’s changes and he’s going to need to share some changes of his own. As mentioned in the introduction, Joe can use the push command to send his changes to a remote repository… except we don’t have one yet.

To create a new repository, Joe can use the init command with a parameter naming the directory to create it in (or no parameter to use the current directory). To add files to the new repository, Joe copies them into the working directory and then uses the add command to tell Mercurial to track them (without parameters, add picks up every untracked file in the working directory, recursing through all subdirectories). A commit then records them in the repository. All of this can be done on a remote server over SSH.

# Joe, working on the server:
hg init new_repo
cd new_repo
# ... Joe copies files into the working directory ...
hg add
hg commit -m 'Initial import to the new repository'

Mercurial can push from one local repository to another, to a remote repository over SSH or HTTP, or even to a shared drive. It also comes with a built-in standalone web server for ad hoc sharing. Whichever method is used, the push command will allow Joe to add his changes from last time into the repository on the central server. Dan can then get a copy of the repository on his computer by cloning the server’s copy.

# Joe, working on his local repository:
hg push ssh://username@repository_address

# Dan, working on his local machine
hg clone ssh://username@repository_address

Now that they both have up-to-date copies of the repository, Joe and Dan can start making changes simultaneously. If Joe finishes first and pushes his changes, what will happen when Dan tries to do the same? Simply put, he won’t be able to. Mercurial will complain that the push would create multiple heads on the server, which would effectively be two anonymous branches.

On encountering this error, Dan can pull Joe’s changes into his own repository, creating the multiple heads there instead. He could then follow the steps from last time: merge the heads, commit the merge, and push the resulting merge changeset. From what I can tell, this is usually the recommended approach in Mercurial; I don’t like it because it litters the central repository’s history with merge commits that carry no useful information.

The alternative approach, which is often used in that other version control system, is to rebase the local changes on top of the ones we have just pulled in. This is a multi-step process: it first copies the changesets from one head onto the other (internally, this works by merging), then deletes the old changesets (indeed, the entire anonymous branch):

[Figure: diagram of a Mercurial repository throughout a rebase]

Generally, this approach goes against the Mercurial philosophy that the history should be immutable (to avoid losing data), but it is just so much cleaner than the alternative of having merge-commits all over the place. If every developer follows this workflow, then the only time we will see merges should be when working with named release branches, which is useful.

There is one small problem here though: do not rebase changes that have already been pushed. If you do, the old changesets are deleted only locally, and other repositories may end up with two copies of your changes. This can get messy and should therefore be avoided; I recommend a simple “pull, rebase, push” workflow that only ever rebases local, unpushed changes. Note also that in Mercurial, rebasing is done by an extension that must be enabled before we can use it. See the documentation for details.

That’s it for this time. So far the workflow I’ve discussed is mainly suitable for small-scale projects; next time I’ll look at how we can scale things up to larger projects with more developers. Meanwhile, here’s a quick summary of the new Mercurial commands used in this article:

Command   Effect
init      initialises a new repository
add       adds new files to the repository
clone     makes a copy of another repository
rebase    moves changesets from one head to another