See /Discussion too.

Vision

The benefits of revision control have been well established in software development for decades. Proper management of source code accelerates development, improves flexibility, and reduces the frequency and impact of mistakes. Ironically, some of the largest repositories of source code (Linux distributions) use revision control only indirectly, in a primitive form, or not at all.

By deploying revision control on a grand scale in Ubuntu, we hope to enable more efficient development, expand community development, and improve coordination of work, by simplifying the common tasks of Ubuntu package maintenance.

Goals

Risks

We may find that users do not use the VCS branches, preferring a non-VCS-based approach. We raise this risk because tools already exist that allow individual packages to use VCS packaging, but these are not consistently used by Ubuntu developers. We believe that there are several reasons for this:

By addressing the lack of a pervasive presence of VCS data for Ubuntu we will enable developers to assume the existence of a VCS as they build tools to work with Ubuntu, helping to bootstrap the direct use of a VCS for packaging. Secondly, as the phases of the implementation are completed, they will each add a network effect to the benefits of the use of a VCS for each package, increasing the returns gained for individual developers by adopting a VCS for packaging in Ubuntu.

Design overview

The design will include the following high-level elements:

Branches

For each package we will create a number of branches:

Ubuntu branches - managed by both automation in the data centre and Ubuntu developers:

Debian branches - managed by automation in the data centre:

Upstream branches - managed by automation in the data centre:

Tools

Bzr and Launchpad already provide tools for managing branches. Additional logic will be required to load data directly into bzr branches and to replace or enhance existing tools such as uupdate.

The general design approach is to extend bzr via plugins to deliver the required functionality, and call through to those bzr commands when we want to make an existing tool work seamlessly with the bzr packaging branches.

One of the key benefits of the branch layout and namespace is that standard toolsets can be used for nearly all common operations:

Automatic import of package versions

The bzr import-dsc command already reliably imports packages; simply putting it in a small shell script with appropriate output-naming conventions and push-pull with launchpad will do all the needed work.

Automatic import of new package releases

The launchpad product release finder is capable of finding new upstream tarballs, but the links to source packages are poorly maintained. bzr import is already capable of importing tarballs for us - we need to write a script that asks launchpad for tarballs for a source package and performs the import.

Implementation Strategy

Many of the benefits of using a VCS for Ubuntu are delivered even if Ubuntu itself uses a VCS in isolation; as more links in the chain between Ubuntu and upstream join a common VCS graph, more benefits can be realised. Importantly, many benefits for Ubuntu can only be realised through pervasive, consistent implementation of VCS-based packaging.

Accordingly we will implement Ubuntu distributed development in a series of phases, starting with Ubuntu itself and working towards the upstream imports currently provided by Launchpad. Each phase will be fully deployed before the next phase is initiated (though development of the software for the next phase can overlap this clear completion line).

Deep history, while very interesting for upstream branches, is much less interesting for the day to day operations of Ubuntu developers, and as such is not planned for the import of the distribution packaging branches.

Phase 1: Ubuntu source code managed in bzr

Phase 2: Merges of Debian using bzr

Phase 3: Current Debian in Launchpad/Bazaar

changes.

Phase 4: Mergeable upstream releases

Phase 5: Current upstream in Bazaar

Design details

Package namespace

It is important to have a predictable namespace to upload branches to. Ideally launchpad will allow package branches to exist in the namespace - e.g. bzr+ssh://bazaar.launchpad.net/ubuntu/PACKAGE/packaging and bzr+ssh://bazaar.launchpad.net/ubuntu/PACKAGE/orig. Until launchpad has such a namespace, we define one that can be used consistently (and thus migrated by launchpad automatically when launchpad gains a packaging branch namespace):
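As an illustration of why a predictable namespace matters for tooling, the following minimal sketch derives a branch URL from nothing but a package name. It assumes the ideal Launchpad layout given in the two example URLs above; the function and constant names are hypothetical and not part of any existing tool.

```python
# Hypothetical helper: the URL layout mirrors the two example URLs in the text.
BASE = "bzr+ssh://bazaar.launchpad.net/ubuntu"
KINDS = ("packaging", "orig")


def package_branch_url(package: str, kind: str = "packaging") -> str:
    """Return the predictable branch URL for a source package."""
    if kind not in KINDS:
        raise ValueError(f"unknown branch kind: {kind!r}")
    if not package or "/" in package:
        raise ValueError(f"invalid package name: {package!r}")
    return f"{BASE}/{package}/{kind}"
```

Because the URL is a pure function of the package name, any tool can locate a branch without a lookup service.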

Client tools to support the workflow

Either the standard Debian tools could be modified to work with the new system, or analogous tools could be provided.

Getting the source

To get the source, either "apt-get source" needs to be taught to try an Ubuntu packaging branch first, falling back to the archive, or a different tool needs to be written to do the same thing.
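The branch-first, archive-fallback behaviour could look roughly like the sketch below. It is a wrapper in the "different tool" style rather than a change to apt-get itself. The function name, the injectable run parameter and the choice of target directory are assumptions made for illustration.

```python
import subprocess
from typing import Callable, Sequence


def _run(argv: Sequence[str]) -> int:
    """Run a command and return its exit status (127 if it is not installed)."""
    try:
        return subprocess.call(list(argv))
    except OSError:
        return 127


def get_source(package: str, branch_url: str,
               run: Callable[[Sequence[str]], int] = _run) -> str:
    """Try the packaging branch first; fall back to the archive."""
    # Assumption: the branch is checked out into a directory named after the package.
    if run(["bzr", "branch", branch_url, package]) == 0:
        return "branch"
    if run(["apt-get", "source", package]) == 0:
        return "archive"
    raise RuntimeError(f"could not obtain source for {package}")
```

The return value tells the caller which path was taken, so a tool can warn the developer when no history is available.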

Updating to a new upstream

To update to a new upstream, either "uupdate" should be taught to import the tarball as a new revision on the upstream branch and merge that, or a different tool should be provided to do the same thing.

Uploading a version

To upload a new version, the source package needs to be uploaded and the branch needs to be pushed. This can be done either with modifications to "dput" (it should be given a branch and revision id in the .changes file via bzr builddeb, so that it can do a bzr push to the correct branch as part of uploads to Ubuntu), or by wrapping dput in a command that does both steps.
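The wrapper option might be sketched as follows. The function name, the host argument and the order of the two steps (upload first, then push) are assumptions; the text above does not fix them.

```python
import subprocess
from typing import Callable, List, Sequence


def _run(argv: Sequence[str]) -> int:
    """Run a command and return its exit status (127 if it is not installed)."""
    try:
        return subprocess.call(list(argv))
    except OSError:
        return 127


def upload_and_push(changes_file: str, branch_url: str, host: str,
                    run: Callable[[Sequence[str]], int] = _run) -> List[List[str]]:
    """Upload the source package, then push the branch; stop at the first failure."""
    steps = [
        ["dput", host, changes_file],   # step 1: upload to the archive
        ["bzr", "push", branch_url],    # step 2: publish the matching branch state
    ]
    done: List[List[str]] = []
    for argv in steps:
        status = run(argv)
        if status != 0:
            raise RuntimeError(f"{' '.join(argv)} failed with status {status}")
        done.append(argv)
    return done
```

Stopping at the first failure avoids pushing a branch for an upload that never reached the archive.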

Ubuntu / Debian imports

The bulk of the import code already exists in the bzr builddeb plugin.

bzr import-package [--initial] <distro> <package> <version>

Import a package into the system. The first time a package is imported, the procedure is:

Once these branches have been set up, importing subsequent package versions is straightforward:

Running this command from a wrapper script that checks for new versions in a loop will automatically generate a low-latency mirror in bzr of all of Ubuntu and Debian. Ideally, once the kinks are worked out of the system, we can hook this into the end of the Soyuz package build process to make imports available instantly.
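The core of such a wrapper is working out which published versions have not yet been imported. The sketch below shows that step only and builds the command lines without running them. The helper names are hypothetical, and the use of --initial for the very first import is an assumption based on the command synopsis above.

```python
from typing import Iterable, List, Sequence


def pending_versions(published: Sequence[str], imported: Iterable[str]) -> List[str]:
    """Versions, in archive publication order, that have not been imported yet."""
    seen = set(imported)
    pending = []
    for version in published:
        if version not in seen:
            pending.append(version)
            seen.add(version)  # guard against duplicates in the published list
    return pending


def import_commands(distro: str, package: str, published: Sequence[str],
                    imported: Iterable[str]) -> List[List[str]]:
    """Build one 'bzr import-package' command line per missing version."""
    already = set(imported)
    commands = []
    for index, version in enumerate(pending_versions(published, already)):
        argv = ["bzr", "import-package"]
        if not already and index == 0:
            argv.append("--initial")  # assumption: only the first ever import
        argv += [distro, package, version]
        commands.append(argv)
    return commands
```

A polling loop would call import_commands for each package, run the resulting commands in order, and sleep before the next pass.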

In doing these imports race conditions exist where a user uploads a package and the packaging branch has been separately changed. In these cases the import will:

Upstream Imports

bzr import-upstream <product series> <upstream version number>

Import a new upstream release wholesale as a new revision on the project series tarball branch for the package. This is used in cases where Ubuntu is packaging upstream directly, rather than using Debian's source package, and could be done automatically based on a debian/watch file.

This allows new branches to be based directly on upstream code, and future upstream releases merged in.

The command is a simple wrapper around 'bzr import' to:

This should be run as part of the Launchpad services connected to the upstream release finder; however, during busy periods developers will likely need to run this themselves (consider when GNOME releases and seb snarfs-and-uploads the entire release in a very short timeframe).

Uploading

NB: This is already in hardy except for the ppa changes distribution header support.

bzr build-deb [--to gutsy|hardy|...]

Marshal a Bazaar working tree into a Debian source package suitable for upload. This will tag the release in the bzr branch.

Branching for Ubuntu

branch-package --base debian|debian-orig|upstream-releases|upstream-vcs

Create a new Ubuntu branch based on:

Fixing Direct Uploads

When someone uploads directly without tagging in bzr, the package importer will have filed a bug and sent mail about the discrepancy. To reconcile the branch, merge using the string given in the bug:

apt-get source package
cd working-directory
bzr merge -r <MERGE_STRING_FROM_BUG>
# work on package as normal

Rationale

VCS vs. Tarballs

It is generally acknowledged that a more complete data model is conceivable, wherein the graph also includes revisions from upstream VCS. In such a model, the upstream-releases branch would be derived from the upstream-vcs branch. In addition to being more elegant, this would provide the following tangible benefits:

  1. cherrypicking of patches from upstream-vcs branches to ubuntu branches using Bazaar

  2. reduced storage requirements through shared Bazaar repositories

However, at present, these benefits are not believed to outweigh the following problems:

  1. There is no reliable way to determine the correct upstream VCS revision from which the release tarball is derived.
    • A number of heuristics are possible, including measuring the size of the delta between the tarball and each revision, but none are considered to be accurate enough for unattended use. Automation is essential for scalability, as any approach which requires the confirmation of a human will not scale to hundreds or thousands of packages per developer (as is the case in Ubuntu).
  2. A substantial subset of the upstream VCS code for Ubuntu is not available in a standard VCS
    • While a facility exists for importing code from other VCS implementations into Bazaar, it is limited by:
      • an inherent unreliability derived from the need to reconstruct information missing from the upstream VCS (raising the same scalability concerns as above)
      • a lack of support for upstream branches (a common case which may be addressed in the future)
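To make the delta-size heuristic from problem 1 concrete, here is a minimal sketch. It treats a tree as a mapping from path to file contents and counts the files that differ; this representation and the function names are assumptions chosen for illustration, not an existing Bazaar facility.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, bytes]  # path -> file contents


def delta_size(a: Tree, b: Tree) -> int:
    """Number of paths that are added, removed or changed between two trees."""
    return sum(1 for path in set(a) | set(b) if a.get(path) != b.get(path))


def closest_revisions(tarball: Tree, revisions: Dict[str, Tree]) -> Tuple[List[str], int]:
    """Return every revision id with the smallest delta to the tarball, and that delta."""
    if not revisions:
        raise ValueError("no candidate revisions")
    sizes = {rev: delta_size(tarball, tree) for rev, tree in revisions.items()}
    best = min(sizes.values())
    return sorted(rev for rev, size in sizes.items() if size == best), best
```

Release tarballs often contain generated files that are absent from the VCS, so even the correct revision can have a non-zero delta, and several revisions can tie for the minimum. Both cases need a human decision, which is the scalability problem described above.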

In addition, a significant subset of Ubuntu packages do not have an upstream VCS, yet still require package maintenance. This means that our solution must be general enough to handle this case, and therefore the proposed solution is believed to provide a useful foundation for future VCS-based work if and when solutions to these problems are found.

Open Issues

Future Directions

This framework will also provide the basis for future experimentation and development in the following areas:

Debian/Ubuntu Patches

An ideal model would also represent patches (e.g. debian/patches) as first-class objects which can be manipulated. This would provide the following tangible benefits:

  1. More intelligent merging
    • With the approach described herein, there are merge conflict scenarios which Bazaar cannot detect because it is not aware of the patches (changes introduced by upstream may cause a patch in debian/patches to fail to apply).

  2. A standard (and automatable) process for iterating, adding, removing and modifying patches in distro-specific formats
    • This would create the possibility for building higher-level tools which work with these patches

However, the following problems make this impractical for us at present:

  1. There is a proliferation of different patch formats in use in Debian, many of which are difficult or impossible to operate on programmatically
  2. There does not yet exist an equivalent representational facility in Bazaar (though "looms" show some promise here)

Given further developments in these areas, there is no reason why such a capability could not be added to the design as an extension in the future.



DistributedDevelopment/Specification (last edited 2009-11-02 14:40:42 by lec67-4-82-230-53-244)