ArchiveReorganisation


THIS IS AN UNAPPROVED DRAFT.

Problem statement

Current archive layout

Since its inception, the Ubuntu archive has been split into four components, organised as follows:

Component  | Licensing                   | Supportedness | Uploader team
main       | Free software               | Supported     | ubuntu-core-dev
restricted | Non-free software (drivers) | Supported     | ubuntu-core-dev
universe   | Free software               | Unsupported   | ubuntu-dev
multiverse | Non-free software           | Unsupported   | ubuntu-dev

This scheme was created before Ubuntu was announced publicly, and before the MOTU team was established to take care of the universe and multiverse components. It also predates the creation of Ubuntu derivatives hosted in the Ubuntu archive, such as Xubuntu and Ubuntu Studio. Of late it has been creaking at the seams. Problems have included:

  • It has never been settled whether packages required for CD images of Ubuntu derivatives should go in main, and our documentation is at best ambiguous on this point.

  • The MOTU team is responsible for an extremely wide range of tasks, and is fragmented by needing to cover Ubuntu derivatives as well as general packages that nobody else tends.
  • Developers are under pressure to join ubuntu-core-dev in order to upload to relatively small subsets of the archive. For example, developers interested in the use of Ubuntu in education need to join ubuntu-core-dev in order to upload packages on the Ubuntu Education Edition CD images.

  • It is often not clear exactly what Canonical supports. Originally this was simply "everything in main", but a distinction is now required between "maintained for security updates" and "commercially supported by Canonical".

  • Users have very little guidance on what "supported" means, and the available user interfaces only indicate whether a package is maintained for security updates, not whether it is commercially supported (by Canonical or otherwise) or whether it is well-tested with the flavour of Ubuntu they are using (for example, unless they know otherwise, Kubuntu users will likely want to default to using KDE applications rather than GNOME applications).
  • When a package is moved from universe to main, many of the people who previously tended to it lose the ability to upload it.

Much of the metadata needed to solve these problems already exists, but is not exposed to users in any useful form; instead, everything of concern to "supported" derivatives is conflated into "main", losing the ability to express very much more than a binary state.

Tracking of packages

The contents of Ubuntu installation images, together with the basic data used to divide the archive into supported and unsupported components (henceforth simply main and universe for brevity; please read restricted and multiverse as well where appropriate), are tracked using seeds. These are formed as lists of package names that are "interesting" in particular categories; for example, an Ubuntu desktop needs to have the gnome-panel package installed. Dependencies and build-dependencies need not be explicitly listed in seeds, because germinate will expand them as necessary.

Conceptually, therefore, the contents of main are defined by the union of the output of germinate for all supported flavours (i.e. Ubuntu, Kubuntu, etc.). However, in practice it is desirable to have an additional review step before allowing packages into main, as described in the MainInclusionProcess: briefly, the submitter writes a main inclusion report (MIR), which is checked by a small team of reviewers; if passed, archive administrators move the package to main. Since this must, by policy, be followed for all new source packages entering main (with occasional exceptions for "trivial" changes, such as source packages being renamed for new versions of a library), it is thus not generally possible for an uploader to introduce unsuitable dependencies into main either unintentionally or maliciously. This system may briefly be characterised as human moderation of access control changes.
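
The conceptual definition above, together with the review step, can be sketched in a few lines of Python. This is an illustration only: the dependency map, the seed contents and the helper names are invented for the example, and the expansion shown is a plain transitive closure rather than germinate's actual algorithm or interface.

```python
from typing import Dict, Iterable, Set

# Hypothetical dependency data; real data would come from the archive indices.
DEPENDS: Dict[str, Set[str]] = {
    "gnome-panel": {"libgtk2.0-0"},
    "kdebase": {"libqt4-core"},
    "libgtk2.0-0": {"libc6"},
    "libqt4-core": {"libc6"},
    "libc6": set(),
}


def expand(seed: Iterable[str]) -> Set[str]:
    """Return the seeded packages plus everything reachable through DEPENDS."""
    seen: Set[str] = set()
    stack = list(seed)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(DEPENDS.get(pkg, ()))
    return seen


def conceptual_main(flavour_seeds: Dict[str, Set[str]]) -> Set[str]:
    """Union of the expanded seeds of every supported flavour."""
    result: Set[str] = set()
    for seed in flavour_seeds.values():
        result |= expand(seed)
    return result


def needs_review(flavour_seeds: Dict[str, Set[str]], current_main: Set[str]) -> Set[str]:
    """Packages the seeds want in main that have not yet been moved there.

    Nothing in this set enters main automatically; each package waits for
    a main inclusion report and an archive administrator.
    """
    return conceptual_main(flavour_seeds) - current_main
```

The difference computed by `needs_review` is the point at which human moderation applies: a new dependency enlarges the set of candidates, but does not by itself change what is in main.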

Problems with access control

The ubuntu-core-dev and ubuntu-dev teams were originally intended as rungs on a ladder. Contributors would start out by submitting patches until they were deemed capable of uploading a limited set of packages without prior review; once they had accumulated enough experience there, they would proceed to ubuntu-core-dev where they would be able to upload to the entire archive.

Over time, this has proven to be inadequate in various specific ways, though basically sound in general. For example, in order to be able to upload the kernel, kernel developers must spend time developing extensive packaging experience in order to gain ubuntu-core-dev membership (and possibly MOTU membership beforehand), despite the fact that their interests or even jobs are not particularly aligned with this. The Technical Board has on occasion worked around this by ad-hoc exceptions on the condition that people ask for review before uploading packages outside their areas of interest, but this is rather opaque and requires a certain degree of persuasion in each individual case. Such restrictions cannot be enforced; they are effective only to the extent that the developer exercises restraint. Similarly, Kubuntu developers must generally demonstrate an understanding of the Ubuntu system beyond their own flavour in order to gain ubuntu-core-dev membership, even though without that they could easily be trusted to do sensible things with packages that are only on the Kubuntu CD images.

Nothing in the following proposal is intended to diminish the rôle of ubuntu-core-dev. That team should still consist of highly experienced and competent developers with a wide-ranging understanding of the Ubuntu system. They will often be called upon to arbitrate among conflicting requirements from different Ubuntu flavours, or to make changes with an understanding of their effect on a variety of different subsystems. However, this is a team where quality rather than quantity is important, and it makes sense both to reduce the pressure on people to join before they are ready in order to get their work done, and to reduce the demands on ubuntu-core-dev members for sponsorship.

Proposal

This proposal is divided into two parts. The first (and simpler) deals with exceptions useful in specific cases; the second deals with wider issues. The two parts complement each other; they are not intended to be contradictory.

Additional uploaders

As of Launchpad 1.2.5 (though there is no UI yet), it is possible to define additional uploaders for an individual package. For small numbers of packages, this process may well be suitable. For example, it is likely to make sense for experienced kernel team members to be able to upload the kernel regardless of whether they have substantial enough packaging experience to gain ubuntu-core-dev privileges. Similarly, when a single package that has been well-cared-for in universe moves to main (or has its uploader team changed, as in the second part of this proposal) due to the good support it has been receiving from its de facto maintainer, it is usually desirable for that maintainer to be able to continue their work. Additionally, if a single package in universe is primarily cared for by a developer not yet in ubuntu-dev, that individual may be granted direct upload access.

Prospective uploaders would apply to the Technical Board for permission, with such permission generally being granted based on experience with the package in question more than on experience with the rest of the Ubuntu system. Naturally, some level of familiarity with and respect for Ubuntu processes (such as freeze procedures) should be required. As usual, the Technical Board may delegate this as they think appropriate.

As of late 2008, this process is implemented. The Technical Board uses edit_acl.py to manipulate lists of additional uploaders, and anyone with access to the Launchpad API can use this to query additional uploaders.
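
As a rough model of how per-package additional uploaders sit alongside the component-wide teams, consider the following Python sketch. It is not the interface of edit_acl.py or of the Launchpad API; the class, its fields and the example names in the comments are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class UploadPolicy:
    """Hypothetical in-memory model of upload permissions."""

    # team name -> members of that team
    team_members: Dict[str, Set[str]] = field(default_factory=dict)
    # component -> teams allowed to upload anything in that component
    component_teams: Dict[str, Set[str]] = field(default_factory=dict)
    # source package -> people granted upload rights to that package only
    additional_uploaders: Dict[str, Set[str]] = field(default_factory=dict)

    def can_upload(self, person: str, package: str, component: str) -> bool:
        # Team membership grants rights over the whole component.
        for team in self.component_teams.get(component, set()):
            if person in self.team_members.get(team, set()):
                return True
        # Otherwise fall back to the per-package exception list.
        return person in self.additional_uploaders.get(package, set())
```

In this model a kernel developer listed under `additional_uploaders["linux"]` can upload the kernel without belonging to ubuntu-core-dev, and gains no rights over any other package in main.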

Improved flavour handling

Rationale

For full-scale Ubuntu derivatives with their own maintenance teams, it is likely that the above process will be inadequate, or at best only workable with a substantial amount of busy-work and automation. Furthermore, it does not solve the problems of expressing support from entities other than Canonical (such as development teams) or of indicating to users which of a set of packages is most likely to work best with the other packages already on their computer.

Good examples of the scaling problems here are Ubuntu Studio and Mythbuntu. Ubuntu Studio has 141 unique source packages in Ubuntu 8.04 LTS (i.e. packages that are part of Ubuntu Studio, but part of no other "set"). Mythbuntu has 124. Kubuntu and the KDE4 remix have 115 between them. Xubuntu has 65. In each case, it makes complete sense for each development group to be able to work on its own set of packages, and neither the MOTU community nor the core developer community is likely to be as well qualified to do so, simply because they largely do not use those packages directly. Maintaining independent access control lists for each of these would quickly spiral out of control.

However, it is clearly necessary to put measures in place to stop people gaming the system. That is, simply adding a dependency to a package you control should not allow you to upload that other package, or to take control of it from somebody else.

Fortunately, identical measures are already in place for the distinction between main and universe, as described above. Adding a dependency to a package in main does not automatically cause the dependency to propagate to main as well; an archive administrator has to acknowledge it and change the overrides.

Nomenclature

Seeds are aggregated into seed collections. For example, "Kubuntu Jaunty desktop" is a seed; "Kubuntu Jaunty" is a seed collection.

Seed collections are exposed only to developers, and are subject to reorganisation.

Packages are grouped into layers: "core", "Ubuntu desktop", "Ubuntu server", and "Kubuntu" are expected to be examples of layers. A user's installation method typically implies initial interest in one or more layers, and package management frontends should preferentially offer packages from those layers where possible.

Layers are exposed to users via package management, and thus care should be taken when rearranging them. While layer assignments are likely to be generated based on seeds in many or even most cases, the distinction allows us to shield users from rearrangements made purely for the convenience of developers.

A package may be in more than one layer.

Each layer has associated access control (upload permissions and queue administration).

Layers correspond to the proposed "package sets" feature in Launchpad. The name was chosen to reflect the larger structure in Ubuntu: layers are typically built on other layers.
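
One way a package management frontend might "preferentially offer" packages from the user's layers is a stable sort that moves matching packages to the front. The sketch below assumes a hypothetical mapping from package names to layer names; no such data source exists yet, and the function name is invented.

```python
from typing import Dict, List, Set


def rank_candidates(
    candidates: List[str],
    package_layers: Dict[str, Set[str]],
    user_layers: Set[str],
) -> List[str]:
    """Put packages that share a layer with the user first.

    sorted() is stable, so the original relative order is preserved
    within the preferred group and within the remaining group.
    """

    def key(pkg: str) -> int:
        shared = package_layers.get(pkg, set()) & user_layers
        return 0 if shared else 1

    return sorted(candidates, key=key)
```

Because a package may be in more than one layer, the test is set intersection rather than equality: a package in both "core" and "Kubuntu" is preferred for a user interested in either.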

Design

The existing mechanisms for archive administrators to apply overrides to packages placing them in the main or universe component would be replaced with mechanisms for them to assign them to one or more layers (each of which would have an associated uploader team). In parallel, the tools that monitor override discrepancies should be replaced with similar tools to monitor discrepancies in assigned layers. The rules should be as follows:

  • Packages in the "platform" seed collection are assigned to the "core" layer, and may be uploaded only by ubuntu-core-dev.

  • Packages in a single other seed collection are assigned to the usual layer associated with that seed collection, and may be uploaded only by ubuntu-core-dev or the team owning that layer (normally, the same team that can commit to the seeds).

  • Packages in more than one other seed collection are assigned to all the associated layers, and may be uploaded only by ubuntu-core-dev or the teams owning any of those layers. However, archive administrators are expected to investigate packages that want to move into this state, and may consider moving them into the "core" layer if they are in fact important central packages on which many other things depend.

  • Unseeded packages are assigned to no layer (TODO?) and may be uploaded only by ubuntu-core-dev or MOTU. The MOTU community as a whole will be responsible for taking care of packages that are not in any layer, although we expect that many individual MOTU members will also be involved with layer maintenance teams.

Additional uploaders may be added in any of these cases according to the first part of this proposal.
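
The four rules above can be expressed as a single function. The following Python sketch is one reading of them, under two assumptions that the text does not state outright: membership of the "platform" seed collection takes precedence over any other seed collection, and the team identifiers other than ubuntu-core-dev are placeholders. Additional uploaders and the archive administrators' review of multi-layer packages are left out.

```python
from typing import Dict, Set, Tuple

CORE_TEAM = "ubuntu-core-dev"
MOTU_TEAM = "motu"  # placeholder identifier for the MOTU team


def assign(
    package_collections: Set[str],
    collection_layer: Dict[str, str],
    layer_owner: Dict[str, str],
) -> Tuple[Set[str], Set[str]]:
    """Return (layers, uploader teams) for one package.

    package_collections: seed collections the package appears in
    collection_layer:    seed collection -> its usual layer
    layer_owner:         layer -> team owning that layer
    """
    # Rule 1: platform packages belong to the core layer only
    # (assumed to override any other seed collection).
    if "platform" in package_collections:
        return {"core"}, {CORE_TEAM}

    layers = {collection_layer[c] for c in package_collections}

    # Rule 4: unseeded packages are in no layer and fall to MOTU.
    if not layers:
        return set(), {CORE_TEAM, MOTU_TEAM}

    # Rules 2 and 3: one or several layers, each contributing its owner.
    teams = {layer_owner[layer] for layer in layers} | {CORE_TEAM}
    return layers, teams
```

A monitoring tool of the kind described above could compare the output of such a function with the layers actually assigned, and report the differences to archive administrators rather than applying them.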

The initial set of recognised layers will be: Ubuntu (desktop), Ubuntu (server), Kubuntu, Ubuntu Education Edition (formerly Edubuntu), Ubuntu Mobile, Xubuntu, Mythbuntu, and Ubuntu Studio. The addition of further layers will require the authorisation of the Technical Board, as that amounts to a mass delegation of the ability to confer upload authority.

While the initial set of layers corresponds to installable CD builds ("flavours" of Ubuntu), this is not required, and more fine-grained layers may be added as time goes on.

Unresolved Issues

How does Archive Reorganisation affect langpack handling? Currently the distinction between "main" and "universe" is also used to determine how packages are translated.

BOF Discussion

Design

Self-contained sets of packages are of interest to sets of developers

Things to look at

  • How can people know whether a given package belongs to a given interest set?
  • Why does a given group of developers care about dependencies or build-dependencies?

Implementation

  • Package field for indicating which seed it is in
    • space separated list of task names
    • Layers will be based on seed collections, not seeds
    • something core like gcc binaries will have 'core' in Packages
    • in Packages file so overrides can be changed and Packages regenerated to add a new seed.
    • buildd will know which seeds can be used to build packages from a particular seed, to enforce the ogre model.
    • apt preferences file is the logical place to manage this for apt.
      • list the layers allowed for a build.
      • Launchpad will create the file before the build (little different from current ogre-model with component hierarchy)
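
To make the idea of a space-separated field in the Packages file concrete, here is a small Python sketch. The field name `Layers`, the hyphenated layer tokens and both helper functions are assumptions made for illustration; the notes above do not fix a field name, and no apt preferences syntax is implied.

```python
from typing import Dict, List, Set


def parse_stanza(stanza: str) -> Dict[str, str]:
    """Parse one Packages stanza into a field -> value mapping.

    Continuation lines (those starting with whitespace) are skipped,
    which is enough for short single-line fields.
    """
    fields: Dict[str, str] = {}
    for line in stanza.strip().splitlines():
        if not line or line[0].isspace():
            continue
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def layers_of(stanza: str) -> Set[str]:
    """Layers listed in the hypothetical 'Layers' field (empty if absent)."""
    return set(parse_stanza(stanza).get("Layers", "").split())


def disallowed_build_deps(
    build_dep_layers: Dict[str, Set[str]], allowed: Set[str]
) -> List[str]:
    """Build-dependencies that share no layer with the allowed set.

    Note that a package in no layer is always reported here; how
    unseeded packages should be treated is left open in this sketch.
    """
    return sorted(
        name for name, layers in build_dep_layers.items() if not (layers & allowed)
    )
```

A builder enforcing the ogre model could refuse a build for which `disallowed_build_deps` returns a non-empty list, in the same way that a build in main cannot currently draw on universe.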

MOTU

  • Confers upload rights to packages in no layers?
  • we would no longer have a separate "universe" archive, but rather MOTU and universe would be defined by the set of unseeded packages
  • what do we do with generalists who are shallow (not deep understanding, but interest in broad number of packages)?

Discussion

  • Core are trusted to know what they're not qualified to touch
  • MOTU are trusted as well
  • don't want to define a group by negative rights
  • a package moved into a layer (eg Kubuntu) is somewhat like this
  • Suppose: we have a group of "Ubuntu Developers" that can upload everywhere
    • layers need to accept responsibility to collaborate

Mirror tools

  • need partial mirrors to work well, but then there's difficulty with signing the Packages.gz file
    • may need to split this up anyway, since the Packages.gz file is several megabytes compressed, and redownloading it many times is bad
  • web-mirror-manager spec will deal with this

ArchiveReorganisation (last edited 2012-07-12 01:18:25 by 82-69-40-219)