The Ubuntu Server Team has devised a git-based workflow for merges. It relies on a set of tools written to assist in the process, but the underlying driver is a basic git-rebase process. This page describes the tools used and the general workflow, with an example or two. Finally, there are some notes about corner cases and caveats that might need special handling.
Git workflow for merging
Tools: their usage, and where to get them
usage: usd-import [-h] [-o LP_OWNER] [-l LP_USER] [--no-push] [--force-push]
                  [--no-clean] [-v] [-d DIRECTORY] [-P PARENTFILE]
                  package

Update a launchpad git tree based upon the state of the Ubuntu and Debian archives

positional arguments:
  package               Which package to update in the git tree

optional arguments:
  -h, --help            show this help message and exit
  -o LP_OWNER, --lp-owner LP_OWNER
                        Which launchpad user's tree to update (default:
                        ubuntu-server-dev)
  -l LP_USER, --lp-user LP_USER
                        Specify the Launchpad user to use (default: <current
                        username>)
  --no-push             Do not push to the remote (default: False)
  --force-push          Forcibly push all tags and branches from the import
                        (default: False)
  --no-clean            Do not clean the temporary directory (default: False)
  -v, --verbose         Increase verbosity (default: False)
  -d DIRECTORY, --directory DIRECTORY
                        Use git repository at specified location rather than a
                        temporary directory (implies --no-clean). Will create
                        the local repository if needed. (default: None)
  -P PARENTFILE, --parentfile PARENTFILE
                        File specifying parent-child relationship overrides
                        as: <pkgname> <child version> <parent1 version>
                        <parent2 version> with one override per line.
                        (default: None)
Available from Launchpad git repository.
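To make the `--parentfile` format concrete, here is a minimal sketch of how such a file could be read. It is not taken from usd-import itself: the names `ParentOverride` and `parse_parent_overrides` are hypothetical, and skipping blank lines and rejecting malformed lines are assumptions. Only the four-field layout comes from the help text above.

```python
from typing import Dict, NamedTuple, Tuple


class ParentOverride(NamedTuple):
    """The two parent versions named on one override line."""
    parent1: str
    parent2: str


def parse_parent_overrides(text: str) -> Dict[Tuple[str, str], ParentOverride]:
    """Map (pkgname, child version) to its overriding parent versions.

    Each line is expected to hold four whitespace-separated fields:
    <pkgname> <child version> <parent1 version> <parent2 version>
    """
    overrides: Dict[Tuple[str, str], ParentOverride] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 4:
            # assumption: a malformed line is an error rather than skipped
            raise ValueError(
                f"line {lineno}: expected 4 fields, got {len(fields)}"
            )
        pkgname, child, parent1, parent2 = fields
        overrides[(pkgname, child)] = ParentOverride(parent1, parent2)
    return overrides
```

The package name and versions used to try this out would be made up; the real importer decides how the two parent versions are interpreted.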
The primary problem we are trying to solve is how to make merges consistent across the server team and, ideally, easy to review for sponsors/uploaders.
Robie Basak and others came up with a git-based workflow (documented roughly on GitHub and hopefully clarified here). This workflow 'just' uses git: the only additions to 'stock' git are a few commands (git-dsc-commit, git-merge-changelogs, git-reconstruct-changelog) from git://github.com/basak/ubuntu-git-tools.git. It effectively rebases the ubuntu tip onto the latest debian tip. That presumes you have spent the time yourself to create a repository with commits representing the current states of debian unstable and the ubuntu development release.
The goal for the importer is to have a canonical (little c) location for such commits to live, per-package. Everyone can refer to commits and tags in that tree, and they will have well-defined semantics and share SHA1s.
A secondary problem was obtaining the 'complete' history for a given source package. This would allow a user to git-blame a given file and get useful data. So we extended the algorithm (that Robie designed) to be more flexible. After hitting many corner cases during implementation, we scrapped our complicated algorithm that produced clean trees for a clean algorithm that produces complicated trees.
Feel free to skip the next section if you're not interested in the implementation.
In essence, the algorithm looks at Launchpad's publishing history for versions it hasn't seen before. If you specify an empty or nonexistent local repository and aren't cloning an existing one, that will be all of them. For each such version, it uses pull-lp-source/pull-debian-source equivalents and git-dsc-commit to import them into the git repository. Technically it uses lower-level commands than a proper commit (`git write-tree` and `git commit-tree`), so that we can get the imported tree, examine it, find its parents and then commit it. The parents for an imported tree are at most 2:
- The last version imported into the same series/pocket, with some knowledge of how to establish a new series/pocket.
- The last debian/changelog entry (using debian/changelog from the just-imported tree) that was successfully imported.
We call these the 'publishing parent' and 'changelog parent' respectively. If we can't find either of these, we orphan the import, and if we only find one, we use what we have. But the resulting tree looks like many git-merges, even where there were no ubuntu-merges. This is correct by the algorithm but can confuse the original git-rebase workflow, because rebase will try to replay the git-merges and that doesn't work. More on this later.
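The parent selection described above can be sketched roughly as follows. This is a minimal illustration, not the importer's actual code: the function names `choose_parents` and `commit_tree_argv` are hypothetical, and dropping a parent that is listed twice is an assumption. The only real interface relied on is `git commit-tree`, which takes a tree and one `-p` option per parent.

```python
from typing import List, Optional


def choose_parents(publishing_parent: Optional[str],
                   changelog_parent: Optional[str]) -> List[str]:
    """Return zero, one, or two parent commits for a freshly imported tree.

    An empty list means the import is orphaned (it has no parents).
    """
    parents: List[str] = []
    for candidate in (publishing_parent, changelog_parent):
        # Skip parents that were not found; assumption: if both point at
        # the same commit, it is only listed once.
        if candidate is not None and candidate not in parents:
            parents.append(candidate)
    return parents


def commit_tree_argv(tree: str, parents: List[str], message: str) -> List[str]:
    """Build a `git commit-tree` command line for the imported tree."""
    argv = ["git", "commit-tree", tree]
    for parent in parents:
        argv += ["-p", parent]
    argv += ["-m", message]
    return argv
```

With two parents, the resulting commit is a merge commit from git's point of view, which is why the history looks like many git-merges.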
usage: git-dsc-commit [--tree-only] DSCFILE

Untars the version specified by the DSCFILE and commits it to a git repository.

optional arguments:
  --tree-only           Only call git-write-tree on the result. git-commit-tree
                        will need to be manually invoked to commit the changes
                        to the repository.

usage: git-reconstruct-changelog COMMITISH

Replays git commit messages starting after COMMITISH into debian/changelog.

usage: git-merge-changelogs BASE_COMMITISH NEWA_COMMITISH NEWB_COMMITISH

Runs dpkg-mergechangelogs on the debian/changelogs from the passed commitishes.

usage: xgit

Either 'enters' or 'exits' an xgit-configured git repository. Such a repository has a git/ and gitwd/ directory structure, where git/ is the GIT_DIR and gitwd/ is the GIT_WORK_TREE. This allows for easier git-dsc-commit invocation and clarity on what is in the working tree and what is not.
Available from GitHub repository.
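The git/ and gitwd/ split that xgit relies on maps onto git's standard `GIT_DIR` and `GIT_WORK_TREE` environment variables. The sketch below shows one way 'entering' and 'exiting' such a layout could be expressed. It is an assumption about the mechanism, not xgit's real implementation, and the function names are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def xgit_enter_env(repo_root: str,
                   base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return an environment that points git at repo_root/git and repo_root/gitwd."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_DIR"] = os.path.join(repo_root, "git")          # repository data
    env["GIT_WORK_TREE"] = os.path.join(repo_root, "gitwd")  # checked-out files
    return env


def xgit_exit_env(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return an environment with the two variables removed again."""
    env = dict(os.environ if base_env is None else base_env)
    env.pop("GIT_DIR", None)
    env.pop("GIT_WORK_TREE", None)
    return env
```

An environment built this way could be passed as `env=` to `subprocess.run` when invoking git, so that unpacked source files land only in gitwd/.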
usage: usd-merge reconstruct COMMITISH [ONTOISH]

Given a usd-import'd tree, attempts to reconstruct the current sequence of commits from `git-merge-base ONTOISH COMMITISH`, where ONTOISH defaults to debian/sid (the last imported Debian unstable version). It attempts to match debian/changelog from COMMITISH, and verifies that the resulting commit is content-identical to COMMITISH. It then tags the resulting commit as reconstruct/<latest version from COMMITISH debian/changelog>.
A simple helper script entitled 'usd-import-reconstruct-merge' [to be renamed to usd-merge] takes a usd-import'd tree and a commitish and attempts to give you a reconstructed sequence of linear commits that represent the same state as the commitish. It does this by using merge-base to figure out the common ancestor (it assumes onto is debian/sid, but it also accepts a parameter) and then playing back the debian/changelog looking for imported tags. It then cherry-picks those tags from oldest to newest against the merge-base and does a quick sanity check that the resulting commit does not differ from the original one.
It tags that as reconstruct/<version>, which sort of conflicts with the git-based workflow some have been using previously. In that workflow, reconstruct/<version> is the broken-down sequence of changes including changelog and metadata, where each commit is a single change from the changelog. I have now taken to calling that pointer deconstruct/<version> because it's clever.
So you would break down reconstruct/<version> into its constituent changes (per debian/changelog for each version) and tag the resulting end-commit as deconstruct/<version>.
That can then be broken up by our workflow into a logical/<version> tag and then rebased onto debian/sid (or the specified onto).
Available from Launchpad git repository.
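The 'play back debian/changelog, then cherry-pick oldest to newest' step of the reconstruction can be illustrated with a small sketch. It only computes the order in which versions would be replayed; it does not run git. The function names, the idea of passing the merge-base's version as `base_version`, and representing imported tags as a set of version strings are all assumptions made for illustration, not the helper script's real interface.

```python
import re
from typing import Iterable, List

# Header line of a debian/changelog entry:
#   package (version) distribution; urgency=...
_HEADER = re.compile(r"^(?P<package>\S+) \((?P<version>[^)]+)\) ")


def changelog_versions(changelog_text: str) -> List[str]:
    """Return the versions in debian/changelog order (newest first)."""
    versions = []
    for line in changelog_text.splitlines():
        match = _HEADER.match(line)
        if match:
            versions.append(match.group("version"))
    return versions


def versions_to_replay(changelog_text: str,
                       base_version: str,
                       imported: Iterable[str]) -> List[str]:
    """Versions newer than base_version that were imported, oldest first."""
    imported_set = set(imported)
    newer = []
    for version in changelog_versions(changelog_text):
        if version == base_version:
            break  # reached the common ancestor; older entries are shared
        newer.append(version)
    # The changelog lists newest first, so reverse to replay oldest first.
    return [v for v in reversed(newer) if v in imported_set]
```

Each version returned would then correspond to an imported tag to cherry-pick onto the merge-base, followed by the check that the final tree matches the original commitish.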