The Grumpy Groundhog
Created: 20/04/05 by MarkShuttleworth
The Grumpy Groundhog Project aims to produce an "Ubuntu-derived" distribution containing a crack-of-the-day set of packages. This distribution will never actually be released; instead it will be in a state of perpetual development, representing the very cutting edge of upstream and distro packaging.
Upstream development in the open source world moves at a tremendous pace. Many developers like to keep up to date with specific upstream products, but the work involved in building from CVS every day is substantial. With The Grumpy Groundhog Project, Ubuntu provides those developers with a ready source of packages containing the latest upstream code.
These same packages will allow cutting-edge developers to keep track of changes in the upstream codebase that might affect the distribution later down the line. For example, these packages can be auto-built with the latest compiler and toolchain packages to test compatibility with the versions that may be used for the next release of Ubuntu.
The system can also be used to provide early warnings of porting issues on the different architectures the builds will be run for.
Gentoo has packages such as emacs-cvs which update, build and install a checkout from HEAD. Perhaps something can be learned from the way they do this, or from the consequences.
Scope and Use Cases
- Frank is an Ubuntu Developer, and particularly interested in the apache2 package. He would like to be able to monitor the upstream Subversion repository to ensure the Ubuntu patches still apply.
- Harry is an Ubuntu user, and though not a hard-core developer is nevertheless interested in what's going on in the Apache project. He would like to install a snapshot package of the latest Apache code on his desktop to try out the new features before anyone else.
- William is an apache2 developer, and wishes to know whether it builds on all platforms with the latest toolchain. He would like to be able to subscribe to failure notifications and be able to view logs of the failed builds.
- Frank notices that apache2 did not build today. He reads through the log of the failed build and can see that one of the Ubuntu patches failed to apply because of a conflict with one of yesterday's upstream commits. Using HCT he downloads the source and resolves the conflict by hand. Once satisfied that the package builds for him locally, he publishes the updated patch into grumpy.
- Andrew is an Ubuntu Developer, responsible for many different packages. He would like to monitor the Debian (or Fedora, etc.) package, and when new changes are published he would like to be able to see the results of applying those changes to the Ubuntu package without performing the merge manually.
Initial import of upstream repositories and ongoing import of new commits is covered by the BazaarImports specification.
Auto-building of packages is covered by the Launchpad AutoBuildManagement specification.
HCT user interaction is covered by the HctTrainingAndFeedback specification.
HCT merge processes are covered by the HctMergeProcess specification.
Grumpy will be implemented as a separate distribution to the ordinary Ubuntu one, for the following reasons:
- To keep the package pools separate; many advanced users and developers take packages directly out of the pool, so taking from grumpy should be a deliberate choice, not an accident.
- To keep the version number space separate; releases within the same distribution have complex interactions between version numbers, and placing the packages in a separate distribution avoids these.
- To allow multiple releases or pockets within the grumpy distribution; for example one for latest packages built on the breezy toolchain and one for the same packages built on the latest toolchain packages from grumpy itself. This separation allows us to build latest versions of source packages while there are critical toolchain problems. Further releases or pockets could be anticipated, for example one to build just the latest GNOME or KDE patches on breezy.
- To keep it off the ordinary mirror network; if grumpy were part of the ordinary Ubuntu distribution it would be part of the same mirror network and, due to the package pool system, extraordinarily hard to separate. As the churn will be very high, we don't want to push it to mirrors that don't want it.
It's recommended that the distribution be hosted on an entirely separate machine from the Ubuntu distribution, for example daily.ubuntu.com. The archive will be set up so that APT pins it lower than installed packages by default, as with the Debian experimental distribution.
Grumpy will use the existing build daemon network, and simply use alternate chroots to build the packages in. This allows us to immediately scale to all architectures we have build infrastructure for, and allows us to use the existing Launchpad build system.
The "latest toolchain" chroot can be updated by a daily dist-upgrade within it, perhaps simply from cron.
Successful builds will be uploaded to the appropriate part of the archive.
The simplest way to assemble the source packages that we wish to build is to use the existing build daemon network and an alternate chroot containing HCT.
- Obtain desired base source package with hct source
- Pull changes from CVS or other distribution with hct pull
- Update changelog and control file.
- Assemble new source with hct assemble
- Upload to archive
- Build system automatically triggered
The changelog and control file will need to be updated in a similar manner to the existing merge-o-matic:
- Version numbers would need to be constructed to be greater than the current version in the latest development release, and increasing. This specification suggests that the base version be used with an identifier and the date appended (in YYYYMMDD format). Packages destined for merges could instead append "ubuntuX" as they do currently.
- Each build should be marked in the changelog, and potentially in the control information with unambiguous identification of the additional patches it included. Since this may not mean anything to the upstream maintainer we should also identify which upstream repository version was used as the basis. For CVS there is no global identification so the time and last commit message may be enough.
The version number or changelog marker could also be used to indicate to a bug reporting tool that it's a grumpy package, and not part of the main distribution for reporting purposes.
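The version scheme described above can be sketched as follows. This is only an illustration: the "+" separator, the default "grumpy" identifier and both function names are assumptions of this sketch, not something this specification fixes. The "+" is chosen because a suffix beginning with "+" sorts after the bare base version under Debian version comparison, and YYYYMMDD dates increase from one daily build to the next.

```python
from datetime import date


def grumpy_version(base_version: str, build_date: date,
                   identifier: str = "grumpy") -> str:
    """Append an identifier and a YYYYMMDD date to a base version.

    The "+" separator and the default identifier are hypothetical choices.
    """
    return f"{base_version}+{identifier}{build_date:%Y%m%d}"


def is_grumpy_version(version: str, identifier: str = "grumpy") -> bool:
    """Report whether a version string carries the grumpy marker.

    A bug reporting tool could use a check like this to separate grumpy
    packages from those in the main distribution.
    """
    marker = f"+{identifier}"
    _head, sep, tail = version.rpartition(marker)
    # The marker must be present and followed by exactly eight digits.
    return bool(sep) and len(tail) == 8 and tail.isdigit()
```

For example, grumpy_version("2.0.54-5", date(2005, 4, 20)) yields "2.0.54-5+grumpy20050420", which is_grumpy_version then recognises.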
If the assembly is unsuccessful, for example because the pull or patch application fails, no source will be uploaded so no build will be triggered.
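The "no upload on failure" rule can be sketched as a small driver that runs the assembly steps in order and only uploads when all of them succeed. The function name, the step names and the callables are hypothetical stand-ins; the real steps would invoke HCT, whose interface is covered by other specifications and is not assumed here.

```python
from typing import Callable, Iterable, Tuple

# A step is a label plus a callable returning True on success.
Step = Tuple[str, Callable[[], bool]]


def assemble_and_upload(steps: Iterable[Step],
                        upload: Callable[[], None]) -> Tuple[bool, str]:
    """Run each step in order; upload only if every step succeeds.

    Returns (True, "") on success, or (False, name_of_failed_step).
    """
    for name, run in steps:
        if not run():
            # Stop at the first failure: nothing is uploaded,
            # so no build is triggered.
            return False, name
    upload()
    return True, ""
```

Because the upload is the only thing that triggers the build system, skipping it is enough to guarantee that a failed pull or patch application never reaches the build daemons.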
The result of the assembly will not be committed anywhere, and the manifest will not be added to Launchpad. This avoids causing future conflicts, and the assembly is cheap to reproduce yourself.
In order to support the SoyuzDynamicBuilds specification, the assemblers will take custom orders. This will be a queue containing the following information:
- Base Source; which existing source package release should be used as a base.
- Desired Patch; which patch (from another distro, or upstream) should be added.
- Requester; who requested the custom assemble.
On completion of assembly and build, the requester is notified and, rather than being uploaded into the grumpy distribution, the result is uploaded into their personal apt archive.
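As a rough illustration of the custom order queue, the three fields above map naturally onto a small record type held in a first-in, first-out queue. All class, field and function names here are hypothetical, as is the "personal:" destination label; the sketch only shows the shape of the data and the choice of upload destination.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass(frozen=True)
class CustomOrder:
    base_source: str    # existing source package release used as the base
    desired_patch: str  # patch from another distro or upstream to add
    requester: str      # who requested the custom assembly


class OrderQueue:
    """First-in, first-out queue of custom assembly orders."""

    def __init__(self) -> None:
        self._orders: Deque[CustomOrder] = deque()

    def submit(self, order: CustomOrder) -> None:
        self._orders.append(order)

    def next_order(self) -> Optional[CustomOrder]:
        # Return None when there is nothing left to assemble.
        return self._orders.popleft() if self._orders else None


def upload_target(order: Optional[CustomOrder]) -> str:
    """Custom orders go to the requester's archive, not to grumpy."""
    return "grumpy" if order is None else f"personal:{order.requester}"
```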
User Interface Requirements
- Launchpad can show the user the binary dependencies that must be installed to get a particular desired package.
- apt will not auto-install upgrades to a package installed from an archive pinned to priority 1. Users who want automatic upgrades would have to increase the pin priority to somewhere between 101 and 499.
- Once a grumpy package has been installed, dpkg tends never to want to downgrade it without a lot of work. That work will be even harder if the currently-installed package has broken scripts.
We should try to fix packages to make downgrades work.
Perhaps people should be advised to install into a chroot or similar setup?
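For users who do want upgrades from grumpy, the raised pin priority mentioned above would be expressed as a stanza in the APT preferences file. The helper below is a sketch that renders such a stanza; the function name and the default priority of 200 are assumptions, and the origin defaults to daily.ubuntu.com only because that is the example host suggested earlier.

```python
def grumpy_pin_stanza(origin: str = "daily.ubuntu.com",
                      priority: int = 200) -> str:
    """Render an APT preferences stanza pinning all packages from origin.

    The priority must fall in the 101-499 range discussed above.
    """
    if not 101 <= priority <= 499:
        raise ValueError("pin priority must be between 101 and 499")
    return (
        "Package: *\n"
        f'Pin: origin "{origin}"\n'
        f"Pin-Priority: {priority}\n"
    )
```

The rendered text is intended to be placed in the user's APT preferences, leaving the archive's default low priority untouched for everyone else.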
By using the existing build daemon system, logs of both the assembly and builds will be available. These can be watched for failures and e-mail notifications triggered if desired. Notifications will only be permitted to people and teams registered within Launchpad.
Web pages could be produced to indicate the overall status of individual packages or the entire distribution. Potential examples for this reporting could include a web-based waterfall of test builds and status, such as http://build.samba.org/ or the existing buildbot system.
Another option for notifications would be the filing of bugs within Malone.
Failures are inherently transient and the bug could be fixed between test builds. Any notification mechanism should take this into account: a web-based waterfall should return to green once a build has completed successfully, an opened Malone bug should be automatically resolved, and perhaps a reply to the e-mail should be sent to say it's all fine now. We should make sure e-mails are not repeatedly sent for the same failure.
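One way to avoid repeated e-mails is to notify only when a package changes state, from building to failing or back again. The sketch below makes a simplifying assumption: consecutive failed builds of a package are treated as the same failure, which may not always hold. The class name and message formats are hypothetical, and the outbox list stands in for real e-mail or Malone integration.

```python
from typing import Dict, List


class BuildNotifier:
    """Emit one message per state change rather than one per build."""

    def __init__(self) -> None:
        self._failing: Dict[str, bool] = {}
        self.outbox: List[str] = []  # stand-in for e-mail or bug filing

    def record(self, package: str, succeeded: bool) -> None:
        was_failing = self._failing.get(package, False)
        if not succeeded and not was_failing:
            # First failure in a run: notify once.
            self.outbox.append(f"FAILED: {package}")
        elif succeeded and was_failing:
            # Recovery: send the "all fine now" follow-up.
            self.outbox.append(f"FIXED: {package}")
        self._failing[package] = not succeeded
```

Recording two failures followed by two successes for the same package produces exactly two messages, one for the failure and one for the recovery.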
We may need the ability to exclude certain packages from the distribution; for example, GNU has indicated it prefers people not to make CVS builds of emacs available.
On the other hand, some projects, such as mplayer, encourage people to build from CVS HEAD. These might be good demonstrations of grumpy power.
Since the code at the top of HEAD is even less trusted than regular releases we have to be careful to build it in a protected chroot, possibly eventually in a virtual machine. Even without malice, it's not unheard of for Makefile or packaging bugs to try rm -rf / or to kill other processes.
- What social issues might arise when we bring Grumpy on stream? Might it be hard to get developers focused on the current frozen release when they can get their fix of daily crack from Grumpy? Might some people want to try and run end-to-end grumpy, or is nobody actually that extreme? Might some foolish users run grumpy builds and suffer data loss bugs, causing either bogus bug reports or loss of reputation for the package?
- Should packages in the toolchain distribution be automatically rebuilt whenever the toolchain is updated? Should that be a separate release or pocket?
- Should grumpy have packages for unstable branches that are not actually part of the current development distrorelease? For example, should it have a gimp2.5 package if the current stable distrorelease just has gimp2.4? Or should it have both gimp2.4 (updated for any new patches to the upstream stable release) as well as gimp2.5 (representing the latest upstream unstable branch)?